Does sscanf touch pointers if no match has been found? [duplicate] - c

This question already has answers here:
Is `sscanf` guaranteed not to change arguments that it doesn't find?
(5 answers)
Closed 8 years ago.
If a line does not match a [fsv]scanf format, does scanf guarantee not to touch the provided pointers that are not matched?
For example, if
int int1 = 3;
int int2 = 5;
sscanf(line, "%d %d", &int1, &int2);
returns 0, are the integers guaranteed to be still 3 and 5, or can int1 have been changed?

The short answer is yes, in your case you can guarantee that int1 and int2 have not changed.
However, I would advise against relying on this behaviour, as it's likely to produce code that is difficult to read - and because:
The long answer is that it depends on your format string. Looking at the C11 standard for fscanf (7.21.6.2, paragraph 16), we have:
The fscanf function returns the value of the macro EOF if an input failure occurs
before the first conversion (if any) has completed. Otherwise, the function returns the
number of input items assigned, which can be fewer than provided for, or even zero, in
the event of an early matching failure.
Critically important is this definition of input items from later in 7.21.6.2:
An input item is defined as the longest sequence of input characters which does not exceed
any specified field width and which is, or is a prefix of, a matching input sequence
So the number returned by scanf is the number of items read from the stream, not the number of pointers written to.
Additionally relevant is 7.21.6.2, paragraph 2:
If the format is exhausted while arguments remain, the excess
arguments are evaluated (as always) but are otherwise ignored.
The behaviour of ignoring arguments that aren't written to is also made explicit in an example at the end of that section:
In:
#include <stdio.h>
/* ... */
int d1, d2, n1, n2, i;
i = sscanf("123", "%d%n%n%d", &d1, &n1, &n2, &d2);
the value 123 is assigned to d1 and the value 3 to n1. Because %n can never get an input failure the value of 3 is also assigned to n2. The value of d2 is not affected. The value 1 is assigned to i.
In case you're not familiar with %n, it's "the number of characters read from the stream so far".
This is a great example to illustrate your question - here we have three pointers written to, and one pointer untouched. But, fscanf only returns 1 here - because it only assigned one "input item" from the stream.
So, in your example with %d %d, if you pass it something which causes 0 reads, then yes, the pointers will be untouched.
But, if you've got a %n in there, then your function could still return 0 or EOF while still consuming some input and writing to pointers. For example:
sscanf("aaa","aaa%n%d",&n1,&n2);
This writes 3 to n1, leaves n2 untouched, and returns EOF. And:
sscanf("aaa bbb","aaa%n%d",&n1,&n2);
This writes 3 to n1, leaves n2 untouched, and returns 0.
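The two %n cases above can be checked with a small sketch. The struct n_scan and the helper scan_with_n are hypothetical names introduced only for this illustration, and -1 is used as a sentinel meaning "never written". The EOF result for the input "aaa" is what common C libraries report; treat it as an observation to verify on your platform.

```c
#include <stdio.h>

/* Outcome of one scan with the format "aaa%n%d". */
struct n_scan {
    int ret; /* value returned by sscanf */
    int n1;  /* written by %n, or -1 if %n was never reached */
    int n2;  /* written by %d, or -1 if %d never converted */
};

/* Hypothetical helper: scans input and reports what was written. */
struct n_scan scan_with_n(const char *input)
{
    struct n_scan r = { 0, -1, -1 };

    r.ret = sscanf(input, "aaa%n%d", &r.n1, &r.n2);
    return r;
}
```

Calling scan_with_n("aaa bbb") should give ret == 0 with n1 == 3, so a return value of 0 does not by itself prove that no pointer was written.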

There should be at least as many of these arguments as the number of values stored by the format specifier of sscanf. Additional arguments are ignored by the function sscanf.
If line contains say "10 20", then int1 and int2 will be changed to 10 and 20 respectively.
However, if line contains say "aa bb", then int1 and int2 will be retained as 3 and 5 respectively.
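A minimal sketch of the cases described above, using the question's format string and starting values. The struct scan_result and the helper scan_two_ints are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Return value of sscanf plus the final contents of both integers. */
struct scan_result {
    int ret;
    int int1;
    int int2;
};

/* Hypothetical helper: starts from 3 and 5, as in the question. */
struct scan_result scan_two_ints(const char *line)
{
    struct scan_result r = { 0, 3, 5 };

    r.ret = sscanf(line, "%d %d", &r.int1, &r.int2);
    return r;
}
```

With "10 20" both values are replaced, with "aa bb" neither is, and with "10 bb" only int1 changes while the return value is 1.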

Related

Why does scanf return a positive value even when the input is larger than the numerical type given?

Given the following code :
int x;
int r = scanf("%d", &x);
why is scanf returning 1 when the user inputs a number larger than INT_MAX or even larger than LONG_MAX?
From the documentation:
Number of receiving arguments successfully assigned.
Why is x considered successfully assigned? What does it mean exactly in this context? When the user gives numbers between INT_MAX and LONG_MAX, x appears to be the lower half of the result. I know scanf uses strtol internally but scanf could determine that the type int is too small to contain the result. Further, when passing a giant number, larger than LONG_MAX, the value of x is -1 and the return value is still 1 and I have to rely on errno to check that something went wrong (errno == ERANGE).
What does "successfully assigned" mean and why does scanf return 1 given that it could so easily tell that the result is, in fact, garbage?
Unfortunately you cannot rely on errno == ERANGE or such in portable C programs.
fscanf is not documented authoritatively on cppreference.com, but in the ISO C standard. Firstly, the standard states that
The fscanf function returns the value of the macro EOF if an input failure occurs before the first conversion (if any) has completed. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.
I.e. nowhere does it contain the word "successful".
On the contrary, it says:
[...] if the result of the conversion cannot be represented in the object, the behavior is undefined.
I.e. unfortunately there are no guarantees of behaviour in this case. In particular the standard never states that the result would be the largest number, or that errno would contain ERANGE or any other such thing.
why is scanf returning 1 when the user inputs a number larger than INT_MAX or even larger than LONG_MAX?
It is undefined behavior (UB) when the input text converts to outside the int range for "%d". It is a specification weakness of scanf().
Anything may happen.
Robust code separates input from conversion. Look to fgets() and strtol().
"scanf" is a function that reads formatted data from the standard input stream. It allows the programmer to accept input from the standard input device (keyboard) and store it in variables.
"scanf" stands for 'scan format', because it scans the input for valid tokens and parses them according to a specified format.
Reads a real number that (by default) the user has typed in into the variable miles
Each variable must be preceded by the ampersand (&) address-of operator.
We can read values into multiple variables with a single scanf as long as we have a corresponding conversion specifier in the format string for each variable.
"scanf" has accomplished its 'scan format' job once it has parsed the input. Checking the value against the range of the corresponding data type is not part of that job.
Remember that scanf (and its brothers and sisters) is a variadic function. There's value in telling the caller how many arguments were successfully assigned, where "successfully" might mean less than you think it does. It's the responsibility of the caller to make sure the arguments agree.
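The advice in the answers above to separate input from conversion with fgets() and strtol() could be sketched roughly as follows. The helper name parse_int and its return convention are assumptions made for this example, not anything defined by the standard library.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out when text
 * holds exactly one integer that fits in an int; otherwise returns 0 and
 * leaves *out untouched. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text)
        return 0; /* no digits were found */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0; /* out of range for long or for int */
    /* Accept trailing whitespace such as the newline kept by fgets(). */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0; /* trailing junk after the number */
    *out = (int)value;
    return 1;
}
```

A caller would typically read a line with fgets() into a buffer and pass that buffer to parse_int, treating a return of 0 as invalid input.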

Using sscanf for an unknown number of variables

For a school assignment, I have to read in a string that has at least one but up to three variables(named command, one, and two). There is always a character at the beginning of the string, but it may or may not be followed by integers. The format could be like any of the following:
i 5 17
i 3
p
d 4
I am using fgets to read the string from the file, but I'm having trouble processing it. I've been trying to use sscanf, but I'm getting segfaults reading in a string that only has one or two variables instead of three.
Is there a different function I should be using?
Or is there a way to format sscanf to do what I need?
I've tried sscanf(buffer, "%c %d %d", command, one, two) and several variations with no luck.
sscanf is probably up to this task, depending on the exact requirements and ranges of inputs.
The key here is that the scanf family of functions returns a useful value which indicates how many conversions were made. This can be less than zero: the value EOF (a negative value) is returned if the end of the input is reached, or an I/O error occurs, before the first conversion is completed.
Note that the %c conversion specifier doesn't produce a null-terminated string. By default, it reads only one character and stores it through the argument pointer. E.g.
char ch;
sscanf("abc", "%c", &ch);
this will write the character 'a' into ch.
Unless you have an iron-clad assurance that the first field is always one character wide, it's probably better to read it as a string with %s. Always use a maximum width with %s so that it cannot overflow the destination buffer. For instance:
char field1[64]; /* one larger than field width, for terminating null */
sscanf(..., "%63s", field1, ...);
sscanf doesn't perform any overflow checks on integers. If %d is used to scan a large negative or positive value that doesn't fit into int, the behavior is simply undefined according to ISO C. So, just like with %s, %d is best used with a field width limitation. For instance, %4d for reading a four digit year. Four decimal digits will not overflow int.
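Putting the points above together for the command lines in the question, here is a minimal sketch. The helper name parse_command is hypothetical, the arguments are the addresses of the caller's variables, and the %9d width assumes int is at least 32 bits wide so that nine characters cannot overflow it.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many of the three fields were
 * converted (1 to 3), 0 on a matching failure, or EOF if the line
 * holds nothing but whitespace. Fields that were not converted are
 * left untouched. */
int parse_command(const char *buffer, char *command, int *one, int *two)
{
    /* The leading space skips whitespace before the command character.
     * The %9d width limit assumes a 32-bit or wider int. */
    return sscanf(buffer, " %c %9d %9d", command, one, two);
}
```

The caller can then switch on the return value: 1 means only the command was present, 2 means one integer followed it, and 3 means both integers were read.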

What does conversion specifier %n exactly do? [duplicate]

This question already has answers here:
What is the use of the %n format specifier in C?
(12 answers)
Closed 7 years ago.
I just encountered this by looking in the standard:
7.19.6.1 The fprintf function
in
8 The conversion specifiers and their meanings are:
regarding to:
n The argument shall be a pointer to signed integer into which is written the
number of characters written to the output stream so far by this call to
fprintf. No argument is converted, but one is consumed. If the conversion
specification includes any flags, a field width, or a precision, the behavior is
undefined.
What does this mean? What does %n do?
Did I get it correct, that according to:
Returns
14 The fprintf function returns the number of characters transmitted
In this snippet:
int a, b;
b = printf ("Thi%n\s is just a test",&a);
a would equal to b?
the number of characters written to the output stream so far
"So far" means that the result depends on where you place the %n. In your example, it will be 3.
If you move the %n one character later, the value stored through the pointer increases by one. Placing the %n at the very end of the string makes it equal to the value returned by printf.
a = 3 and b = 19 for your case
a will be equal to the number of characters printed before %n.
Suppose you print printf ("This%sis%n just a test","coder", &a);
Then the value of a will be This + coder + is = 4 + 5 + 2 = 11.
And the value of b is always the total number of characters printed.
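Both calculations above can be reproduced with a short sketch. The struct counts and the two helper functions are hypothetical names for this illustration, and the first format string is written without the stray backslash from the question. Note that some hardened C libraries restrict or disable %n in printf, so this is an assumption about a typical hosted implementation.

```c
#include <stdio.h>

struct counts {
    int a; /* stored through the %n argument */
    int b; /* value returned by printf */
};

/* Hypothetical helper: %n placed after the first three characters. */
struct counts count_simple(void)
{
    struct counts r = { 0, 0 };

    r.b = printf("Thi%ns is just a test", &r.a);
    return r;
}

/* Hypothetical helper: %n placed after "This", the %s argument and "is". */
struct counts count_with_string(void)
{
    struct counts r = { 0, 0 };

    r.b = printf("This%sis%n just a test", "coder", &r.a);
    return r;
}
```

In the first case a is 3 and b is 19. In the second, a is 11 and b counts the whole output, including the text after the %n.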

fscanf return value

What does fscanf return when it reads data in the file. For example,
int number1, number2, number3, number4, c;
c = fscanf (spFile, "%d", &number1);
//c will be 1 in this case.
c = fscanf (spFile, "%d %d %d %d", &number1, &number2, &number3, &number4);
//in this case, c will be 4.
I just want to know why it returns such values depending on the number of arguments.
From the manpage for the Xscanf family of functions:
Upon successful completion, these functions shall return the number of
successfully matched and assigned input items; this number can be zero
in the event of an early matching failure. If the input ends before
the first matching failure or conversion, EOF shall be returned. If a
read error occurs, the error indicator for the stream is set, EOF
shall be returned, and errno shall be set to indicate the error.
So your first call to fscanf returns 1 because one input item (&number1) was successfully matched with the format specifier %d. Your second call to fscanf returns 4 because all 4 arguments were matched.
I quote from cplusplus.com .
On success, the function returns the number of items of the argument
list successfully filled. This count can match the expected number of
items or be less (even zero) due to a matching failure, a reading
error, or the reach of the end-of-file.
If a reading error happens or the end-of-file is reached while
reading, the proper indicator is set (feof or ferror). And, if either
happens before any data could be successfully read, EOF is returned.
--EDIT--
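If your intention is to determine the number of bytes read into a string:
int bytes = 0;
char str[80];
/* width 79 leaves room for the terminating null */
/* %n counts every character consumed so far, including skipped leading whitespace */
fscanf (stdin, "%79s%n", str, &bytes);
printf("Number of bytes read = %d\n", bytes);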
From the manual page:
These functions return the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure.
Hence 1st one returns 1 if able to read one integer from the file, 2nd one returns 4 if able to read 4 integers from the file.
This happens to be a very straightforward question, and it has been aptly answered by charles and ed before me. But they didn't mention where you should be looking for such things the next time you get stuck.
First, the question:
fscanf belongs to the family of formatted input (scan) functions, which read input and report some information about the data read, such as the bytes read or the count of items (variable addresses) that received appropriate input and were successfully assigned.
Here fscanf checks the input file against the format string provided in the call, assigns the matched values to the variable addresses in order of their position, and, once done, returns the total count of successful assignments it made. Hence the result of 1 and then 4 (assuming the input was provided properly).
Second, where to look:
Well-described details for such functions are easily found in your manual pages, or in the POSIX documentation if you refer to it.
If you noticed, the previous two answers also contain small extracts from the man pages.
Hope this helps.
The return value does not depend on the number of arguments passed to fscanf; it depends on the number of values successfully scanned by fscanf.
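To see that the return value tracks the input rather than the argument count, here is a minimal sketch. It uses sscanf on a string instead of fscanf on a file so that it needs no input file, and the helper name count_scanned is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: always passes four pointers, yet the result
 * depends on how many integers the text actually contains. */
int count_scanned(const char *text)
{
    int n1, n2, n3, n4;

    return sscanf(text, "%d %d %d %d", &n1, &n2, &n3, &n4);
}
```

For "1 2 3 4" the result is 4, for "1 2 x 4" it is 2 because scanning stops at the first mismatch, for "x" it is 0, and for an empty string it is EOF.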

What is the difference between %d and %*d in c language?

What is %*d ? I know that %d is used for integers, so I think %*d also must related to integer only? What is the purpose of it? What does it do?
int a=10,b=20;
printf("\n%d%d",a,b);
printf("\n%*d%*d",a,b);
Result is
10 20
1775 1775
The %*d in a printf allows you to use a variable to control the field width, along the lines of:
int wid = 4;
printf ("%*d\n", wid, 42);
which will give you:
..42
(with each of those . characters being a space). The * consumes one argument wid and the d consumes the 42.
The form you have, like:
printf ("%*d %*d\n", a, b);
is undefined behaviour as per the standard, since you should be providing four arguments after the format string, not two (and good compilers like gcc will tell you about this if you bump up the warning level). From C11 7.21.6 Formatted input/output functions:
If there are insufficient arguments for the format, the behavior is undefined.
It should be something like:
printf ("%*d %*d\n", 4, a, 4, b);
And the reason you're getting the weird output is due to that undefined behaviour. This excellent answer shows you the sort of things that can go wrong (and why) when you don't follow the rules, especially pertaining to this situation.
Now I wouldn't expect this to be a misalignment issue since you're using int for all data types but, as with all undefined behaviour, anything can happen.
When used with scanf() functions, it means that an integer is parsed, but the result is not stored anywhere.
When used with printf() functions, it means the width argument is specified by the next format argument.
The * indicates that the width is passed as an argument to printf.
In "%*d", the first argument is taken as the total width of the output, and the second argument is taken as the integer to print.
For the program below:
int x=6,p=10;
printf("%*d",x,p);
output: "    10"
The first argument, x, is passed for the *, which defines the total width of the output. In this case the width is 6, so the length of the entire output will be 6.
The number 10 needs two places (1 and 0), so the remaining 6-2=4 positions are filled with space characters before the number.
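The two meanings of * described in the answers above can be compared side by side in a small sketch. Both helper names are hypothetical, and snprintf is used in place of printf only so that the padded result can be inspected in a buffer.

```c
#include <stdio.h>

/* Hypothetical helper: formats value right-aligned in a field of width
 * characters, with the width supplied as an argument through the *.
 * Returns the number of characters the full output requires. */
int format_padded(char *out, size_t size, int width, int value)
{
    return snprintf(out, size, "%*d", width, value);
}

/* Hypothetical helper: in the scanf family, %*d parses an integer but
 * stores nothing, so only the second integer reaches *value. The
 * suppressed item is not counted in the return value. */
int second_int(const char *text, int *value)
{
    return sscanf(text, "%*d %d", value);
}
```

format_padded with a width of 6 and a value of 10 produces four spaces followed by 10, while second_int on "7 42" stores 42 and returns 1.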
