Is scanf guaranteed to not change the value on failure?


If a scanf family function fails to match the current specifier, is it permitted to write to the storage where it would have stored the value on success?
On my system the following outputs 213 twice but is that guaranteed?
The language in the standard (C99 or C11) does not seem to clearly specify that the original value should remain unchanged (whether it was indeterminate or not).
#include <stdio.h>
int main()
{
    int d = 213;
    // matching failure
    sscanf("foo", "%d", &d);
    printf("%d\n", d);
    // input failure
    sscanf("", "%d", &d);
    printf("%d\n", d);
}

The relevant part of the C11 standard is (7.21.6.2, for fscanf):
7 A directive that is a conversion specification defines a set of matching input sequences, as described below for each specifier. A conversion specification is executed in the following steps:
8 […]
9 An input item is read from the stream, unless the specification includes an n specifier. An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.285) The first character, if any, after the input item remains unread. If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.
10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. […]
To me, the words “step” and “If the length of the input item is zero, the execution of the directive fails” indicate that if the input does not match a specifier in the format, interpretation stops before any assignment for that specifier has occurred.
On the other hand, subclause 4, above the ones quoted, makes it clear that specifiers up to the failing one are assigned, again using language appropriate for ordered sequences of events:
4 The fscanf function executes each directive of the format in turn. When all directives have been executed, or if a directive fails (as detailed below), the function returns.

Judging from ISO/IEC 9899:2011 §7.21.6.2 The fscanf function:
¶10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
In the larger context, this seems to mean that the assignment to the target variable only occurs after the conversion is successful. For numeric types, that makes sense and is readily achievable. For string types, it is not so clear cut, but it should work the same way (the text quoted does state that the assignment only occurs if there is no matching failure or input failure). However, if there is an encoding error part way through a string (%s or %30c or %[a-z]), it would not be surprising to find that the first part of the string is changed even though the conversion as a whole failed. This could probably be regarded as a bug. Stimulating the bug accurately might be hard; for example, it might require UTF-8 input and an invalid byte such as 0xC0 or 0xF5 in the input stream.
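If you would rather not depend on how a particular implementation behaves on failure, one option is to scan into a temporary and copy it out only when sscanf reports one assignment. A minimal sketch (parse_int_or_keep is a name made up for this example, not a library function):

```c
#include <stdio.h>

/* Hypothetical helper: parse an int from `text` without ever letting
 * sscanf write to the caller's variable directly. */
int parse_int_or_keep(const char *text, int *value)
{
    int tmp;

    /* sscanf returns the number of items assigned; anything other
     * than 1 here means a matching failure (0) or input failure (EOF). */
    if (sscanf(text, "%d", &tmp) != 1)
        return 0;   /* *value is deliberately left alone */

    *value = tmp;
    return 1;
}
```

With this wrapper, the caller's variable keeps its old value after a failed parse regardless of what sscanf did internally, because only the temporary was ever passed to it.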

Related

Why does scanf return a positive value even when the input is larger than the numerical type given?

Given the following code :
int x;
int r = scanf("%d", &x);
why is scanf returning 1 when the user inputs a number larger than INT_MAX or even larger than LONG_MAX?
From the documentation:
Number of receiving arguments successfully assigned.
Why is x considered successfully assigned? What does it mean exactly in this context? When the user gives numbers between INT_MAX and LONG_MAX, x appears to be the lower half of the result. I know scanf uses strtol internally but scanf could determine that the type int is too small to contain the result. Further, when passing a giant number, larger than LONG_MAX, the value of x is -1 and the return value is still 1 and I have to rely on errno to check that something went wrong (errno == ERANGE).
What does "successfully assigned" mean and why does scanf return 1 given that it could so easily tell that the result is, in fact, garbage?

Unfortunately you cannot rely on errno == ERANGE or such in portable C programs.
fscanf is not documented authoritatively on cppreference.com, but in the ISO C standard. Firstly, the standard states that
The fscanf function returns the value of the macro EOF if an input failure occurs before the first conversion (if any) has completed. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.
I.e. nowhere does it contain the word "successful".
On the contrary, it says:
[...] if the result of the conversion cannot be represented in the object, the behavior is undefined.
I.e. unfortunately there are no guarantees of behaviour in this case. In particular the standard never states that the result would be the largest number, or that errno would contain ERANGE or any other such thing.

why is scanf returning 1 when the user inputs a number larger than INT_MAX or even larger than LONG_MAX?
It is undefined behavior (UB) when the input text converts to outside the int range for "%d". It is a specification weakness of scanf().
Anything may happen.
Robust code separates input from conversion. Look to fgets() and strtol().
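A rough sketch of the conversion half of that approach, assuming the text has already been read into a buffer (for example with fgets()). The function name parse_int_checked and its return convention are invented for this example:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert a whole line to int, rejecting
 * overflow, empty input, and trailing junk. Returns 1 on success. */
int parse_int_checked(const char *line, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(line, &end, 10);

    if (end == line)                    /* no digits at all */
        return 0;
    if (errno == ERANGE)                /* outside the range of long */
        return 0;
    if (v < INT_MIN || v > INT_MAX)     /* fits long but not int */
        return 0;
    if (*end != '\0' && *end != '\n')   /* trailing junk such as "12abc" */
        return 0;

    *out = (int)v;
    return 1;
}
```

Unlike "%d", strtol() has defined behavior on overflow (it returns LONG_MAX or LONG_MIN and sets errno to ERANGE), so every failure path here can be detected.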

"scanf" is a function that reads data with specified format from a given string stream source. It allows the programmer to accept input from the standard input device (keyboard) and stores them in variables.
"scanf" stands for 'scan format', because it scans the input for valid tokens and parses them according to a specified format.
Reads a real number that (by default) the user has typed in into the variable miles
Each variable must be preceded by the ampersand (&) address-of operator.
We can read values into multiple variables with a single scanf as long as we have corresponding format strings for each variable.
"scanf" successfully accomplishes the 'scan format' function. It's not the job of "scanf" to check the range of the corresponding data type.

Remember that scanf (and its brothers and sisters) is a variadic function. There's value in telling the caller how many arguments were successfully assigned, where "successfully" might mean less than you think it does. It's the responsibility of the caller to make sure the arguments agree.

Why is the output of this C code 1?

Why is the output always 1 no matter what string i enter.
Please explain
int main() {
    char ch[]={};
    printf("%d", scanf("%s", ch));
    return 0;
}

Man page of scanf:
These functions return the number of input items successfully matched
and assigned, which can be fewer than provided for, or even zero in
the event of an early matching failure.
The value EOF is returned if the end of input is reached before either
the first successful conversion or a matching failure occurs. EOF is
also returned if a read error occurs, in which case the error
indicator for the stream (see ferror(3)) is set, and errno is set
to indicate the error.
So, it means scanf() returns the number of items successfully read.
Also, C does not allow zero-size arrays.
C11 6.7.6.2 Array declarators :
Paragraph 1:
In addition to optional type qualifiers and the keyword static, the [
and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an
incomplete or function type. The optional type qualifiers and the
keyword static shall appear only in a declaration of a function
parameter with an array type, and then only in the outermost array
type derivation.

Because scanf returns the number of items it reads. In your case, it is reading one string and thus it returns 1 which is then printed to the standard output through printf.
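For comparison, here is a sketch of reading one word into a buffer that actually has room for it. It uses sscanf on a string so it is easy to try out. WORD_MAX and read_word are names chosen for this example only:

```c
#include <stdio.h>

#define WORD_MAX 31   /* assumed limit, chosen for this sketch */

/* Hypothetical helper: read one whitespace-delimited word from
 * `input` into `word`, which must hold WORD_MAX + 1 bytes. */
int read_word(const char *input, char word[WORD_MAX + 1])
{
    /* The field width (31, kept in sync with WORD_MAX by hand) is one
     * less than the buffer size, leaving room for the terminating
     * null character. */
    return sscanf(input, "%31s", word);
}
```

The return value is still 1 when a word was stored, but now it can also be EOF (when the input holds nothing but white space), and the array can never be overrun.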

Comma between the two integers during the input

What exactly happens if I do the following
scanf("%d,%d", &i, &j);
and provide an input which causes the matching failure? Will it store garbage into j?

The input has to exactly match the supplied format for scanf() to succeed.
Quoting C11, chapter §7.21.6.2, fscanf(), (emphasis mine)
Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
and,
When all directives
have been executed, or if a directive fails (as detailed below), the function returns.
So, consolidating the above cases,
For an input like 100, 200, the scanning will succeed. Both i and j will hold the given values, 100 and 200, respectively.
For an input like 100 - 200, the scanning will fail (matching failure) and the content of j will remain unchanged, i.e., j is not assigned any value by the scanf() operation.
Word of advice: always check the return value of scanf() function family to ensure the success of the function call.
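To make that concrete, here is a small sketch using sscanf with the same format string. parse_pair is a name made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: parse "i,j" and report how many of the two
 * values sscanf assigned (2, 1, 0, or EOF). */
int parse_pair(const char *input, int *i, int *j)
{
    return sscanf(input, "%d,%d", i, j);
}
```

For "100, 200" this returns 2. For "100 - 200" it returns 1: i was assigned, the literal comma then failed to match, and the second %d was never executed. A caller should only use j when the result is 2.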

Why is this program not printing the input I provided? (C)

Code I have:
int main(){
    char readChars[3];
    puts("Enter the value of the card please:");
    scanf(readChars);
    printf(readChars);
    printf("done");
}
All I see is:
"done"
after I enter some value to terminal and pressing Enter, why?
Edit:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?

The actual problem is that you are passing an uninitialized array as the format to scanf().
Also, you are invoking scanf() the wrong way. Try this:
if (scanf("%2s", readChars) == 1)
    printf("%s\n", readChars);
scanf() as well as printf() use a format string and that's actually the cause for the f in their name.
And yes, you are able to use it with just one argument. scanf() scans input according to the format string, which uses conversion specifications that are matched against the input; if you don't specify at least one, scanf() will only be useful for input validation.
The following was extracted from C11 draft
7.21.6.2 The fscanf function
The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: one or more white-space characters, an ordinary multibyte character (neither % nor a white-space character), or a conversion specification. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
An optional assignment-suppressing character *.
An optional decimal integer greater than zero that specifies the maximum field width
(in characters).
An optional length modifier that specifies the size of the receiving object.
A conversion specifier character that specifies the type of conversion to be applied.
as you can read above, you need to pass at least one conversion specifier, and in that case the corresponding argument to store the converted value, if you pass the conversion specifier but you don't give an argument for it, the behavior is undefined.
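As an illustration of the "input validation" use, here is a sketch of a format made of ordinary characters plus %n, which stores the number of characters consumed so far and does not count as an assignment. The "card:" prefix and the function name has_card_prefix are assumptions made up for this example:

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if `input` starts with the literal
 * text "card:", 0 otherwise. */
int has_card_prefix(const char *input)
{
    int consumed = -1;

    /* The format holds only ordinary characters plus %n. If the input
     * runs out before the literal is matched, sscanf returns EOF. */
    if (sscanf(input, "card:%n", &consumed) == EOF)
        return 0;

    /* Otherwise sscanf returns 0 whether or not the literal matched,
     * so the return value cannot tell the two cases apart. `consumed`
     * is written only if all five characters of "card:" were matched. */
    return consumed == 5;
}
```

So has_card_prefix("card:7") yields 1, while has_card_prefix("dog") yields 0 because the directive for the literal text fails before %n is reached.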

Yes, it is possible to call scanf with just one parameter, and it may even be useful on occasion. But it wouldn't do what you apparently thought it would. (It would just expect the characters in the argument in the input stream and skip them.) You didn't notice because you failed to do due diligence as a programmer. I'll list what you should do:
RTFM. scanf's first parameter is a format string. Plain characters which are not part of conversion sequences and are not whitespace are expected literally in the input. They are read and discarded. If they do not appear, conversion stops there, and the position in the input stream where the unexpected character occurred is the start of subsequent reads. In your case probably no character was ever successfully read from the input, but you don't know for sure, because you didn't initialize the format string (see below).
Another interesting detail is scanf's return value which indicates the number items successfully read. I'll discuss that below together with the importance to check return values.
Initialize locals. C doesn't automatically initialize local data for performance reasons (in today's light one would probably enforce user initialization like other languages do, or make auto initialization a default with an opt-out possibility for the few inner loops where it would hurt). Because you didn't initialize readChars, you don't know what's in it, so you don't know what scanf expected in the input stream. On top of that, it is probably nominally undefined behaviour. (But on your PC it shouldn't do anything unexpected.)
Check return values. scanf probably returned 0 in your example. The manual states that scanf returns the number of items successfully read, here 0, i.e. no input conversion took place. This type of undetected failure can be fatal in long sequences of read operations because the following scanfs may read in one-off indexes from a sequence of tokens, or may stall as well (and not update their pointees at all), etc.
Please bear with me -- I do not always read the manual, check return values or (by error) initialize variables for little test programs. But if it doesn't work, it's part of my investigation. And before I ask anybody, let alone the world, I make damn sure that I have done my best to find out what I did wrong, beforehand.
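To illustrate the "stall" mentioned above: after a matching failure the offending character stays in the stream, so calling scanf again with the same format fails again at the same spot. One way to recover is to discard the rest of the line before retrying. A sketch, with next_int being a hypothetical helper name and the "skip the whole line" policy an assumption of this example:

```c
#include <stdio.h>

/* Hypothetical helper: read the next int from `fp`, discarding the rest
 * of the line whenever the next token is not a number.
 * Returns 1 on success, 0 at end of input. */
int next_int(FILE *fp, int *out)
{
    for (;;) {
        int r = fscanf(fp, "%d", out);
        if (r == 1)
            return 1;
        if (r == EOF)
            return 0;

        /* Matching failure: the offending character is still unread,
         * so skip the rest of the line before trying again. */
        int c;
        while ((c = fgetc(fp)) != '\n' && c != EOF)
            ;
    }
}
```

Called on stdin, this keeps a loop of reads from getting stuck on the first bad token, at the cost of throwing away everything else on that line.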

You're not using scanf correctly:
scanf(formatstring, address_of_destination,...)
is the right way to do it.
EDIT:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
No, you should not. Please read documentation on scanf; format is a string specifying what scanf should read, and the ... are the things that scanf should read into.

The first argument to scanf is the format string. What you need is:
scanf("%2s", readChars);

You should provide format specifiers in the scanf call, with a field width so the input cannot overflow the array:
char readChars[3];
puts("Enter the value of the card please:");
if (scanf("%2s", readChars) == 1)
    printf("%s\n", readChars);
printf("done");
http://www.cplusplus.com/reference/cstdio/scanf/ more info...

Logical inconsistency with [ ] conversion specifier in scanf() in C

Please have a look at this code snippet:
char line1[10], line2[10];
int rtn;
rtn = scanf("%9[a]%9[^\n]", line1, line2);
printf("line1 = %s|\nline2 = %s|\n", line1, line2);
printf("rtn = %d\n", rtn);
Output:
$ gcc line.c -o line
$ ./line
abook
line1 = a|
line2 = book|
rtn = 2
$./line
book
line1 = |
line2 = �Js�|
rtn = 0
$
For input abook, %9[a] fails at b from the book and stores previously parsed a+\0 at line1.
Then %9[^\n] parses the remaining line and stores just now parsed book+\0 at line2.
Please note 2 points here:
At the time of storing the parsed input, \0 is appended at the end of it since %[] is a conversion specifier for a string.
When %9[a] failed at b, scanf didn't exit. It simply went on scanning further input.
Now for input book, %9[a] should fail at b from the book and should store just \0 at line1 since here nothing was parsed.
Then %9[^\n] should parse the remaining line and should store just now parsed book+\0 at line2.
Now, let's see what exactly happened:
Here the return value is 0, which means scanf didn't assign a value to any variable. scanf simply exited without assigning any values. So there is garbage data in line2. And in the case of line1, that garbage data happened to be a null character.
But this is quite strange! Isn't it?
I mean scanf exits if %[...] fails at the very first character of input. (Even if additional conversion specifier is there in scanf statement.)
But if the same %[...] fails at any other character other than first one then scanf simply continues scanning the further input. (If additional conversion specifier is there of course.) It doesn't exit.
So why this inconsistency?
Why not let scanf statement continue scan the input (if additional conversion specifier is there of course) even if %[...] fails at the very first char of input? Exactly like what happens in other case.
Is there any special reason behind this inconsistency?
$ gcc --version
gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3

2) When %9[a] failed at b, scanf didn't exit. It simply went on scanning further input.
Yes, the %9[a] directive means "store up to 9 'a's, but at least one"(1), so the conversion %9[a] did not fail, it succeeded. It found fewer 'a's than it could have consumed, but that's not a failure. The input matching failed at the 'b', but the conversion succeeded.
(1) Specified in 7.21.6.2 (12) where the conversions are described:
[ Matches a nonempty sequence of characters from a set of expected characters (the scanset).
Now for input book, %9[a] should fail at b from the book and should store just '\0' at line1 since here nothing was parsed. Then %9[^\n] should parse the remaining line and should store just now parsed book+\0 at line2.
No. It is supposed to exit when a conversion fails. The first conversion %9[a] failed, so scanf is supposed to stop and return 0, since no conversion succeeded.
Always check the return value of scanf.
That is specified (for fscanf, but scanf is equivalent to fscanf with stdin as input stream) in 7.21.6.2 (16):
The fscanf function returns the value of the macro EOF if an input failure occurs
before the first conversion (if any) has completed. Otherwise, the function returns the
number of input items assigned, which can be fewer than provided for, or even zero, in
the event of an early matching failure.
Here output for line1 is nothing which is exactly what we expected. An empty string!
You can't expect anything. The arrays line1 and line2 aren't initialised, so when the conversion fails, their contents is still indeterminate. In this case, line1 contained no printable character before the first 0 byte.
But for line2 it's garbage chars! We didn't expect this. So how did this happen ?
That's what happened to be the contents of line2. There were never any values assigned to the elements, so they are whatever they happened to be before the call to scanf.
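A sketch of the same scan written so that the result can be inspected safely. It uses sscanf on a string rather than scanf on stdin, and split_leading_a is a name invented for this example:

```c
#include <stdio.h>

/* Hypothetical helper: split `input` into a leading run of 'a's and the
 * rest of the line. Both buffers must hold at least 10 bytes.
 * Returns whatever sscanf returns (2, 1, 0, or EOF). */
int split_leading_a(const char *input, char line1[10], char line2[10])
{
    /* Start from empty strings so each buffer holds a defined value
     * if sscanf leaves it untouched. */
    line1[0] = '\0';
    line2[0] = '\0';

    return sscanf(input, "%9[a]%9[^\n]", line1, line2);
}
```

For "abook" the result is 2 with line1 holding "a" and line2 holding "book". For "book" the result is 0, which tells the caller not to rely on either buffer having been filled by the scan.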

Transferred from comments to the question since the response to the reply question requires more space than the comments allow.
This comment refers to an earlier version of the code:
Since you didn't check the return value from scanf(), you've no idea whether it said "I failed" or not. You can't blame it when you ignore its error returns; in the second example, it will have said '0 items scanned successfully', which means that none of the variables were set to anything useful at all. You must always check the return value from scanf() so you know whether it did what you expected.
The reply question is:
I updated the code and output to show the return value of scanf. And yes for case 2 the return value is 0. But this doesn't answer the question. Clearly scanf exited in case 2. But for case 1, return value is 2 which means scanf successfully assigned values to both the variables. So why this inconsistency?
I don't see any inconsistency. The fscanf() specification (copied from ISO/IEC 9899:2011, but the URL links to POSIX rather than the C standard) says:
¶3 [...] Each conversion specification is introduced by the character %.
After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional decimal integer greater than zero that specifies the maximum field width
(in characters).
— An optional length modifier that specifies the size of the receiving object.
— A conversion specifier character that specifies the type of conversion to be applied.
Later, it says:
¶8 [...] Input white-space characters (as specified by the isspace function) are skipped, unless
the specification includes a [, c, or n specifier.284)
¶9 An input item is read from the stream, unless the specification includes an n specifier. An
input item is defined as the longest sequence of input characters which does not exceed
any specified field width and which is, or is a prefix of, a matching input sequence.285)
The first character, if any, after the input item remains unread. If the length of the input
item is zero, the execution of the directive fails; this condition is a matching failure unless
end-of-file, an encoding error, or a read error prevented input from the stream, in which
case it is an input failure.
¶12 [...]
[ Matches a nonempty sequence of characters from a set of expected characters
(the scanset).286)
[Bold italic emphasis added. I've left the footnote references in place, but the contents of the footnotes are not material to the discussion so I've omitted them.]
So, the behaviour you are seeing is exactly what the standard demands. When %9[a] is applied to the string abook, there is a sequence of one a which matches the %9[a] conversion specification, so the directive is successful, and the scan continues with book. When %9[a] is applied to the string book, there are zero characters matching the item, so the execution of the directive fails and it is a matching error and since it is the first conversion specification, the return value of 0 is correct.
Note that the length specifies a maximum field width, so the 9 in %9[a] means 1-9 letters a.
