Why is the output of this C code 1?

Why is the output always 1 no matter what string I enter?
Please explain.
#include <stdio.h>

int main() {
    char ch[]={};
    printf("%d", scanf("%s", ch));
    return 0;
}

Man page of scanf:
These functions return the number of input items successfully matched
and assigned, which can be fewer than provided for, or even zero in
the event of an early matching failure.
The value EOF is returned if the end of input is reached before either
the first successful conversion or a matching failure occurs. EOF is
also returned if a read error occurs, in which case the error
indicator for the stream (see ferror(3)) is set, and errno is set
to indicate the error.
So scanf() returns the number of items successfully read.
Also, C does not allow zero-size arrays, so ch has no room for the string and writing to it is undefined behavior.
C11 6.7.6.2 Array declarators :
Paragraph 1:
In addition to optional type qualifiers and the keyword static, the [
and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an
incomplete or function type. The optional type qualifiers and the
keyword static shall appear only in a declaration of a function
parameter with an array type, and then only in the outermost array
type derivation.

Because scanf returns the number of items it reads. In your case, it is reading one string and thus it returns 1 which is then printed to the standard output through printf.
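A minimal sketch of a safer variant, assuming a fixed 16-byte buffer and a helper name (read_word) that is not part of the original question. It uses sscanf on a string so the return value can be observed without typing input; scanf on stdin returns the same kind of value.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word from `input`
 * into `buf`, which must have room for at least 16 bytes.
 * Returns whatever sscanf returns: 1 if a word was stored, EOF if the
 * input ends before any word is found. */
int read_word(const char *input, char buf[16])
{
    /* The field width 15 leaves one byte for the terminating '\0'. */
    return sscanf(input, "%15s", buf);
}
```

With a non-empty word the function returns 1, which is the value the original program prints. The field width keeps long input from overflowing the buffer.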

Related

Why does scanf return a positive value even when the input is larger than the numerical type given?

Given the following code :
int x;
int r = scanf("%d", &x);
why is scanf returning 1 when the user inputs a number larger than INT_MAX or even larger than LONG_MAX?
From the documentation:
Number of receiving arguments successfully assigned.
Why is x considered successfully assigned? What does it mean exactly in this context? When the user gives numbers between INT_MAX and LONG_MAX, x appears to be the lower half of the result. I know scanf uses strtol internally but scanf could determine that the type int is too small to contain the result. Further, when passing a giant number, larger than LONG_MAX, the value of x is -1 and the return value is still 1 and I have to rely on errno to check that something went wrong (errno == ERANGE).
What does "successfully assigned" mean and why does scanf return 1 given that it could so easily tell that the result is, in fact, garbage?
Unfortunately you cannot rely on errno == ERANGE or such in portable C programs.
fscanf is not documented authoritatively on cppreference.com, but in the ISO C standard. Firstly, the standard states that
The fscanf function returns the value of the macro EOF if an input failure occurs before the first conversion (if any) has completed. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.
I.e. nowhere does it contain the word "successful".
On the contrary, it says:
[...] if the result of the conversion cannot be represented in the object, the behavior is undefined.
I.e. unfortunately there are no guarantees of behaviour in this case. In particular the standard never states that the result would be the largest number, or that errno would contain ERANGE or any other such thing.
why is scanf returning 1 when the user inputs a number larger than INT_MAX or even larger than LONG_MAX?
It is undefined behavior (UB) when the input text converts to outside the int range for "%d". It is a specification weakness of scanf().
Anything may happen.
Robust code separates input from conversion. Look to fgets() and strtol().
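One way to sketch that separation, assuming the line has already been read (for example with fgets()) and is passed in as a string. The helper name parse_int and its return convention are illustrative, not taken from any answer above.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text in `line` to an int.
 * Returns 1 and stores the value in *out on success; returns 0 and leaves
 * *out untouched if there are no digits, the value is out of range for int,
 * or unexpected characters follow the number. */
int parse_int(const char *line, int *out)
{
    char *end;
    errno = 0;
    long value = strtol(line, &end, 10);

    if (end == line)
        return 0;                       /* no digits at all */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                       /* does not fit in an int */
    if (*end != '\0' && *end != '\n')
        return 0;                       /* trailing junk after the number */

    *out = (int)value;
    return 1;
}
```

Unlike scanf("%d", ...), the out-of-range case here has defined behavior: strtol reports it through errno, and the explicit INT_MIN/INT_MAX comparison covers platforms where long is wider than int.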
"scanf" is a function that reads data in a specified format from the standard input stream. It allows the programmer to accept input from the standard input device (usually the keyboard) and store it in variables.
"scanf" stands for 'scan format', because it scans the input for valid tokens and parses them according to a specified format.
Reads a real number that (by default) the user has typed in into the variable miles
Each scalar variable must be preceded by the address-of operator (&); an array such as a char buffer already converts to a pointer.
We can read values into multiple variables with a single scanf as long as the format string contains a corresponding conversion specifier for each variable.
"scanf" does accomplish its 'scan format' job here. Checking that the value fits the range of the corresponding data type is not part of that job.
Remember that scanf (and its brothers and sisters) is a variadic function. There's value in telling the caller how many arguments were successfully assigned, where "successfully" might mean less than you think it does. It's the responsibility of the caller to make sure the arguments agree.

Comma between the two integers during the input

What exactly happens if I do the following
scanf("%d,%d", &i, &j);
and provide an input which causes the matching failure? Will it store garbage into j?
The input has to match the supplied format exactly for scanf() to succeed.
Quoting C11, chapter §7.21.6.2, fscanf(), (emphasis mine)
Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
and,
When all directives
have been executed, or if a directive fails (as detailed below), the function returns.
So, consolidating the above cases,
For an input like 100, 200, the scan will succeed. Both i and j will hold the given values, 100 and 200, respectively.
For an input like 100 - 200, the scan will fail (matching failure) and the content of j will remain unchanged, i.e., j is not assigned any value by the scanf() operation.
Word of advice: always check the return value of scanf() function family to ensure the success of the function call.
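The two cases above can be sketched with sscanf, which follows the same rules as scanf but reads from a string. The helper name parse_pair is an assumption made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parses "i,j" from `input`.
 * Returns the number of items sscanf assigned (2 means both were read). */
int parse_pair(const char *input, int *i, int *j)
{
    return sscanf(input, "%d,%d", i, j);
}
```

For "100, 200" the call returns 2, because %d skips the space before 200. For "100 - 200" it returns 1: i receives 100, the literal comma fails to match, and the call returns without storing anything in j. A caller should treat any return value other than 2 as a failed parse.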

Is scanf guaranteed to not change the value on failure?

If a scanf family function fails to match the current specifier, is it permitted to write to the storage where it would have stored the value on success?
On my system the following outputs 213 twice but is that guaranteed?
The language in the standard (C99 or C11) does not seem to clearly specify that the original value should remain unchanged (whether it was indeterminate or not).
#include <stdio.h>
int main()
{
    int d = 213;
    // matching failure
    sscanf("foo", "%d", &d);
    printf("%d\n", d);
    // input failure
    sscanf("", "%d", &d);
    printf("%d\n", d);
}
The relevant part of the C11 standard is (7.21.6.2, for fscanf):
7 A directive that is a conversion specification defines a set of matching input sequences, as described below for each specifier. A conversion specification is executed in the following steps:
8 […]
9 An input item is read from the stream, unless the specification includes an n specifier. An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.285) The first character, if any, after the input item remains unread. If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.
10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. […]
To me, the words “step” and “If the length of the input item is zero, the execution of the directive fails” indicate that if the input does not match a specifier in the format, interpretation stops before any assignment for that specifier has occurred.
On the other hand, subclause 4, which precedes the ones quoted, makes it clear that specifiers up to the failing one are assigned, again using language appropriate for ordered sequences of events:
4 The fscanf function executes each directive of the format in turn. When all directives have been executed, or if a directive fails (as detailed below), the function returns.
Judging from ISO/IEC 9899:2011 §7.21.6.2 The fscanf function:
¶10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
In the larger context, this seems to mean that the assignment to the target variable only occurs after the conversion is successful. For numeric types, that makes sense and is readily achievable. For string types, it is not so clear cut, but it should work the same way (the text quoted does state that the assignment only occurs if there is no matching failure or input failure). However, if there is an encoding error part way through a string (%s or %30c or %[a-z]), it would not be surprising to find that the first part of the string is changed even though the conversion as a whole failed. This could probably be regarded as a bug. Stimulating the bug accurately might be hard; for example, it might require UTF-8 input and an invalid byte such as 0xC0 or 0xF5 in the input stream.
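For code that does not want to depend on this reading of the standard, one option is to scan into a temporary and copy it only after the conversion is known to have succeeded. This is a sketch; the helper name scan_int_or_keep is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: updates *dst only when a full %d conversion succeeds.
 * Returns 1 if *dst was updated, 0 on matching failure or input failure. */
int scan_int_or_keep(const char *input, int *dst)
{
    int tmp;
    if (sscanf(input, "%d", &tmp) != 1)
        return 0;   /* tmp is never read on this path */
    *dst = tmp;
    return 1;
}
```

With this pattern the caller's variable keeps its old value on failure regardless of what the library does with the temporary.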

Logical inconsistency with [ ] conversion specifier in scanf() in C

Please have a look at this code snippet:
char line1[10], line2[10];
int rtn;
rtn = scanf("%9[a]%9[^\n]", line1, line2);
printf("line1 = %s|\nline2 = %s|\n", line1, line2);
printf("rtn = %d\n", rtn);
Output:
$ gcc line.c -o line
$ ./line
abook
line1 = a|
line2 = book|
rtn = 2
$./line
book
line1 = |
line2 = �Js�|
rtn = 0
$
For input abook, %9[a] fails at b from the book and stores previously parsed a+\0 at line1.
Then %9[^\n] parses the remaining line and stores just now parsed book+\0 at line2.
Please note 2 points here:
At the time of storing the parsed input, \0 is appended at the end of it since %[] is a conversion specifier for a string.
When %9[a] failed at b, scanf didn't exit. It simply went on scanning further input.
Now for input book, %9[a] should fail at b from the book and should store just \0 at line1 since here nothing was parsed.
Then %9[^\n] should parse the remaining line and should store just now parsed book+\0 at line2.
Now, let's see what exactly happened:
Here the return value is 0, which means scanf didn't assign a value to any variable; it simply exited without assigning anything. Hence the garbage data in line2. In the case of line1, the garbage data happened to be a null character.
But this is quite strange! Isn't it?
I mean scanf exits if %[...] fails at the very first character of input (even if an additional conversion specifier is present in the scanf call).
But if the same %[...] fails at any character other than the first one, scanf simply continues scanning the remaining input (if an additional conversion specifier is present, of course). It doesn't exit.
So why this inconsistency?
Why not let scanf continue scanning the input (if an additional conversion specifier is present, of course) even if %[...] fails at the very first character of input, exactly as in the other case?
Is there any special reason behind this inconsistency?
$ gcc --version
gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3
2) When %9[a] failed at b, scanf didn't exit. It simply went on scanning further input.
Yes, the %9[a] directive means "store up to 9 'a's, but at least one"(1), so the conversion %9[a] did not fail, it succeeded. It found fewer 'a's than it could have consumed, but that's not a failure. The input matching failed at the 'b', but the conversion succeeded.
(1) Specified in 7.21.6.2 (12) where the conversions are described:
[ Matches a nonempty sequence of characters from a set of expected characters (the scanset).
Now for input book, %9[a] should fail at b from the book and should store just '\0' at line1 since here nothing was parsed. Then %9[^\n] should parse the remaining line and should store just now parsed book+\0 at line2.
No. It is supposed to exit when a conversion fails. The first conversion %9[a] failed, so scanf is supposed to stop and return 0, since no conversion succeeded.
Always check the return value of scanf.
That is specified (for fscanf, but scanf is equivalent to fscanf with stdin as input stream) in 7.21.6.2 (16):
The fscanf function returns the value of the macro EOF if an input failure occurs
before the first conversion (if any) has completed. Otherwise, the function returns the
number of input items assigned, which can be fewer than provided for, or even zero, in
the event of an early matching failure.
Here output for line1 is nothing which is exactly what we expected. An empty string!
You can't expect anything. The arrays line1 and line2 aren't initialised, so when the conversion fails, their contents are still indeterminate. In this case, line1 contained no printable character before the first 0 byte.
But for line2 it's garbage chars! We didn't expect this. So how did this happen ?
That's what happened to be the contents of line2. There were never any values assigned to the elements, so they are whatever they happened to be before the call to scanf.
Transferred from comments to the question since the response to the reply question requires more space than the comments allow.
This comment refers to an earlier version of the code:
Since you didn't check the return value from scanf(), you've no idea whether it said "I failed" or not. You can't blame it when you ignore its error returns; in the second example, it will have said '0 items scanned successfully', which means that none of the variables were set to anything useful at all. You must always check the return value from scanf() so you know whether it did what you expected.
The reply question is:
I updated the code and output to show the return value of scanf. And yes for case 2 the return value is 0. But this doesn't answer the question. Clearly scanf exited in case 2. But for case 1, return value is 2 which means scanf successfully assigned values to both the variables. So why this inconsistency?
I don't see any inconsistency. The fscanf() specification (copied from ISO/IEC 9899:2011, but the URL links to POSIX rather than the C standard) says:
¶3 [...] Each conversion specification is introduced by the character %.
After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional decimal integer greater than zero that specifies the maximum field width
(in characters).
— An optional length modifier that specifies the size of the receiving object.
— A conversion specifier character that specifies the type of conversion to be applied.
Later, it says:
¶8 [...] Input white-space characters (as specified by the isspace function) are skipped, unless
the specification includes a [, c, or n specifier.284)
¶9 An input item is read from the stream, unless the specification includes an n specifier. An
input item is defined as the longest sequence of input characters which does not exceed
any specified field width and which is, or is a prefix of, a matching input sequence.285)
The first character, if any, after the input item remains unread. If the length of the input
item is zero, the execution of the directive fails; this condition is a matching failure unless
end-of-file, an encoding error, or a read error prevented input from the stream, in which
case it is an input failure.
¶12 [...]
[ Matches a nonempty sequence of characters from a set of expected characters
(the scanset).286)
[Bold italic emphasis added. I've left the footnote references in place, but the contents of the footnotes are not material to the discussion so I've omitted them.]
So, the behaviour you are seeing is exactly what the standard demands. When %9[a] is applied to the string abook, there is a sequence of one a which matches the %9[a] conversion specification, so the directive is successful, and the scan continues with book. When %9[a] is applied to the string book, zero characters match, so the execution of the directive fails with a matching failure. Since it is the first conversion specification, the return value of 0 is correct.
Note that the length specifies a maximum field width, so the 9 in %9[a] means 1-9 letters a.
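A small sketch that reproduces both cases with sscanf, assuming the same format string as the question. The helper name split_leading_a is hypothetical, and the buffers are set to empty strings up front so they can be printed safely whatever the return value is.

```c
#include <stdio.h>

/* Hypothetical helper: splits `input` into a run of leading 'a's (line1)
 * and the rest of the line (line2). Both buffers must hold 10 bytes.
 * Returns the number of conversions that succeeded (0, 1 or 2), or EOF. */
int split_leading_a(const char *input, char line1[10], char line2[10])
{
    /* Start from empty strings so the buffers are printable even if
     * sscanf stops before assigning them. */
    line1[0] = '\0';
    line2[0] = '\0';
    return sscanf(input, "%9[a]%9[^\n]", line1, line2);
}
```

For "abook" the call returns 2 with line1 holding "a" and line2 holding "book". For "book" it returns 0 and both buffers keep the empty strings they were given, instead of the indeterminate contents seen in the question.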

What should scanf return on EOF after partial match?

Consider the following scanf conversion specifiers and inputs:
"%4c" and "abc"
"%x" and "0x"
"%f" and "1.0e+"
That is, cases where the input is an initial subsequence of a match, but not a match. Assuming EOF is reached after the incomplete match, is scanf supposed to return EOF or 0? The text in C99 reads:
The fscanf function returns the value of the macro EOF if an input failure occurs before any conversion. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.
And in POSIX 2008, it reads:
Upon successful completion, these functions shall return the number of successfully matched and assigned input items; this number can be zero in the event of an early matching failure. If the input ends before the first matching failure or conversion, EOF shall be returned. If any error occurs, EOF shall be returned, [CX] and errno shall be set to indicate the error. If a read error occurs, the error indicator for the stream shall be set.
What's unclear to me is whether the partial but incomplete match constitutes an "early matching failure". I would find the return value of 0 in this case a lot more useful (it distinguishes the cases of plain EOF versus invalid truncated data) but what I'm looking for is just help interpreting the standard.
Note that glibc's scanf is completely incorrect on all of these inputs and returns 1, treating the invalid input as a match. I'm pretty sure this issue has been reported and marked WONTFIX. :-(
I'm expecting 0.
7.19.6.2/4:
Failures are described as input failures (due to the occurrence of an encoding error or the unavailability of input characters), or matching failures (due to inappropriate input).
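The distinction can be sketched by classifying the return value of a single-directive call. The enum and the function name classify_int_scan are assumptions made for this illustration, and only the unambiguous cases are shown; for truncated inputs such as "0x" with %x, the result may differ between C libraries, as the question notes.

```c
#include <stdio.h>

enum scan_outcome { SCAN_OK, SCAN_MATCH_FAILURE, SCAN_INPUT_FAILURE };

/* Hypothetical helper: applies a single %d directive to `input` and
 * classifies the result from the return value alone. */
enum scan_outcome classify_int_scan(const char *input)
{
    int value;
    int rc = sscanf(input, "%d", &value);

    if (rc == EOF)
        return SCAN_INPUT_FAILURE;   /* input ended before any conversion */
    if (rc == 0)
        return SCAN_MATCH_FAILURE;   /* input present but not a number */
    return SCAN_OK;
}
```

An empty string yields SCAN_INPUT_FAILURE, "abc" yields SCAN_MATCH_FAILURE, and "12" yields SCAN_OK.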
