I've encountered expressions like these in the arguments to scanf and sscanf:
sscanf(buffer, "%d,%100[^,]%*c%f", destination_pointer)
or
scanf("\n%99s", destination);
What is the correct way of interpreting these? I know what things like "%s %c %d" are, and also that the %100 or generally "%number" is the size of the input to be read. But what about the rest? All I can find are basic examples, nothing near this complex. Is there any reference guide?
What is the correct way to interpret these?
sscanf(buffer, "%d,%100[^,]%*c%f", destination_pointer)
is an invalid call. Three conversion specifiers need an argument: %d, %[] and %f. That means three pointer arguments are needed after the format string, but only one, destination_pointer, is provided.
%d - ignore any whitespace characters, read an int in base 10
, - read a comma
%100[^,] - read at most 100 characters that are not a comma. Up to 101 bytes (100 characters + null byte) are stored in the destination buffer.
%[set] - reads characters in the set
%[^set] - reads characters that are not in the set
%*c - ignore one character (a comma, because %100[^,] reads up until a comma, or the string has ended, which would make scanf return here). Note - ignoring the result of conversion with * makes scanf not increment the return value in the case reading was successful.
%f - ignore any whitespace characters, read a float (in any format - decimal, scientific or hexadecimal)
scanf("\n%99s", destination);
\n - read (and ignore) any number of whitespace characters (whitespace means anything for which isspace() returns nonzero: space, form feed, line feed, carriage return, tab or vertical tab)
%99s - ignore any leading whitespace characters (\n in front of it is useless...), then read up to 99 characters that are not whitespaces (the resulting buffer has to be at least 100 bytes long).
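As a rough illustration of the first format string above, the following sketch passes one destination per conversion that stores a value. The helper name parse_record and the sample input are assumptions made for this example; they are not part of the original question.

```c
#include <stdio.h>

/* Hypothetical helper: parses "int,text,float" from buffer.
 * name must have room for at least 101 bytes (100 characters + null byte).
 * Returns sscanf's result: the number of values stored (3 on success). */
int parse_record(const char *buffer, int *id, char *name, float *value)
{
    /* %d        -> stored in *id
     * ,         -> matches a literal comma
     * %100[^,]  -> up to 100 non-comma characters stored in name
     * %*c       -> one character read and discarded, not counted
     * %f        -> stored in *value */
    return sscanf(buffer, "%d,%100[^,]%*c%f", id, name, value);
}
```

With char name[101], calling parse_record("42,hello,3.5", &id, name, &value) should return 3 and leave 42, "hello" and 3.5 in the three destinations. The result is 3 rather than 4 because %*c stores nothing.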
Related
For each of the following pairs of scanf format strings, indicate whether or not the two strings are equivalent. If they're not, show how they can be distinguished:
(b) "%d-%d-%d" versus "%d -%d -%d"
So in this case, my answer was that they are not equivalent. A non-white-space character in the format string (other than a conversion specifier, which starts with %) does not skip preceding spaces in the input, so a space in front of it in the input will not match it. So in the first case, no spaces are allowed after the first and second integers, while in the second case, any number of spaces are allowed after the first two integers.
But I saw that the book had a different answer. It said that they were both equivalent to each other.
Is this the mistake of the book? Or am I just wrong with the concept of format string in the scanf function?
The book is wrong. As per the specification of the scanf():
Whitespace character: the function will read and ignore any whitespace characters encountered before the next non-whitespace character (whitespace characters include spaces, newline and tab characters -- see isspace). A single whitespace in the format string validates any quantity of whitespace characters extracted from the stream (including none).
Non-whitespace character, except format specifier (%): Any character that is not either a whitespace character (blank, newline or tab) or part of a format specifier (which begin with a % character) causes the function to read the next character from the stream, compare it to this non-whitespace character and if it matches, it is discarded and the function continues with the next character of format. If the character does not match, the function fails, returning and leaving subsequent characters of the stream unread.
So in the first case, once scanf has read the input for %d, the next directive is -, which means scanf expects the next character in the stream to be the non-whitespace character - and not a whitespace character. So 1- 2 is legal input, but 1 -2 is not.
In the second case, after the first %d, scanf will skip any whitespace and then arrive at the non-whitespace character, so it will accept the input 1 - 2 by the above definitions.
"%d-%d-%d" differs from "%d -%d -%d" and the difference has nothing to do with "%d".
Format "-" scans over input "-" and stops on the first space of input " -".
Format " -" scans over inputs "-" and " -" as the " " in the format matches 0 or more white-space characters in the input.
A directive composed of white-space character(s) is executed by reading input up to the first nonwhite-space character (which remains unread), or until no more characters can be read. The directive never fails. C17dr § 7.21.6.2 5
Had the question been: "%d-%d-%d" versus "%d- %d- %d",
These 2 are functionally identical.
We would need to dive into arcane stdin input errors to divine a potential difference.
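The difference between the format strings can be checked with a small sketch like the one below. The helper count_matched is hypothetical and exists only for this demonstration; passing a non-literal format to sscanf is done here for brevity and is not a recommendation.

```c
#include <stdio.h>

/* Hypothetical helper: applies a three-integer format to input and
 * returns how many integers sscanf stored (0 to 3). */
int count_matched(const char *format, const char *input)
{
    int a, b, c;
    return sscanf(input, format, &a, &b, &c);
}
```

count_matched("%d-%d-%d", "1 -2 -3") should return 1, because the literal - does not match the space after the first number, while count_matched("%d -%d -%d", "1 -2 -3") should return 3. Both formats should return 3 for "1-2-3" and for "1- 2- 3", since %d skips leading white space on its own.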
Project is in C. I need to parse strings that are always formatted the following way: integer, whitespace, plus sign, multi-word string, plus sign, white space, integer, whitespace, integer, end-of-line
Example:
10 +This is 1 string+ 2 -1
I'm having a hard time figuring out what to enter in the formatting of sscanf so that the string surrounded by the '+' signs get parsed correctly, without including the + signs. Assuming sscanf can be used for this case.
I tried "%d +%s+ %d %d" and that didn't work.
You use %s but that reads up to the first white space character. You want to read a string of not-plus-signs, so say that's what sscanf() should do:
"%d +%[^+]+ %d %d"
That's a scan set — see POSIX sscanf(). You should also protect yourself from buffer overflow. If you have:
char buffer[256];
use:
"%d +%255[^+]+ %d %d"
Note the off-by-one in the lengths — this is a design feature of the scanf() family of functions. You could skip leading spaces by putting a space after the first + in the format string. It is not possible to skip trailing spaces before the second + in the data; you'll have to remove those separately.
You ask for 'end of line' after the 3rd number. That's fairly hard. You might use:
"%d +%255[^+]+ %d %d %n"
passing an extra pointer to int argument to hold the offset of the last character parsed. The blank before the %n skips white space, including newlines, so if you read into int nbytes; (passing &nbytes), then you'd check if (buffer[nbytes] != '\0') { …handle trailing garbage… } (but only after checking that you had four successful conversion specifications — %n conversion specifications are not counted in the return value from sscanf() et al). There are other solutions to that; they're all grubby to some extent.
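Put together, the approach described above might look like the following sketch. The helper name parse_line and its parameter names are assumptions for illustration; the format string and the %n check are the ones from the answer.

```c
#include <stdio.h>

/* Hypothetical helper: parses "<int> +<text>+ <int> <int>" from line.
 * text must have room for at least 256 bytes.
 * Returns 1 only if all four values were converted and nothing but
 * white space follows the last number; returns 0 otherwise. */
int parse_line(const char *line, int *first, char *text, int *second, int *third)
{
    int nbytes = 0;
    int converted = sscanf(line, "%d +%255[^+]+ %d %d %n",
                           first, text, second, third, &nbytes);

    if (converted != 4)
        return 0;                 /* nbytes is only meaningful after 4 conversions */
    return line[nbytes] == '\0';  /* anything left over is trailing garbage */
}
```

For the example line "10 +This is 1 string+ 2 -1" this should store 10, "This is 1 string", 2 and -1 and return 1. A line such as "10 +x+ 2 -1 junk" still converts four values but should return 0 because of the trailing text.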
The below is what I understood so far. Please confirm, add, correct as the case may be -
scanf (" %c %d %s", &a, &b, c);
In the above - the first space before %c makes sure the buffer is cleared before scanf starts accepting new string into it from stdin for this particular function scanf call. This clears any of the delimiters from any previous input function calls..
The remaining two spaces before %d and %s allow any number of spaces or tabs but not enter key press between the user's entry of a, b and b,c , respectively.
Even with the above, none of the inputs can contain a space, i.e. space is the delimiter for each of the three inputs. To allow a space, it has to be listed inside the scan set brackets [], e.g. "%[a-z A-Z_0-9]" can contain any upper or lower case letters, digits 0-9, a space and an underscore, but will treat all other characters as invalid. The invalid character will go to the next input in the format string, if any, so if %[___] above was followed by %c and an asterisk is pressed, the asterisk is put into the character corresponding to %c.
Please confirm, correct, add. Thanks.
From cppreference.com:
Any single whitespace character in the format string consumes all available consecutive whitespace characters from the input
So all the spaces in the format string just mean to skip over any whitespace in the input.
There's no difference between the spaces before %c and the other spaces. The initial spaces don't clear the buffer, it just skips over any initial whitespace in the input. This ensures that %c will read the first non-whitespace character in the input.
Whitespace includes space, TAB, and newline characters. So you can put spaces or press enter between each input.
You don't actually need the spaces before %d or %s. These formats don't read anything that contains whitespace, and they automatically skip over any whitespace before the object they read. The spaces in the format string are redundant and do no harm, and may make it easier to read.
All conversion specifiers other than [, c, and n consume and discard all leading whitespace characters (determined as if by calling isspace) before attempting to parse the input.
...
The conversion specifiers that do not consume leading whitespace, such as %c, can be made to do so by using a whitespace character in the format string
the first space before %c makes sure the buffer is cleared before scanf starts accepting new string into it from stdin for this particular function scanf call. This clears any of the delimiters from any previous input function calls.
No. The buffer is not cleared and there are no delimiters.
The remaining two spaces before %d and %s allow any number of spaces or tabs but not enter key press between the user's entry of a, b and b,c , respectively.
No. The enter key produces a newline character, which is whitespace.
Even with the above, none of the inputs can contain a space in it i.e. space is the delimiter for each of the three inputs.
No.
The invalid character will go to the next input in the format string
Yes. This isn't limited to %[ ]: If e.g. %d sees 12foo in the input stream, it will consume 12 and leave foo to be read by the rest of the format string (however, if there are no leading digits at all, %d will fail and abort processing).
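The 12foo case can be tried with a sketch like this one. The helper name split_number_word and the 15-character limit are assumptions chosen for the example.

```c
#include <stdio.h>

/* Hypothetical helper: reads an int followed by a word of at most
 * 15 characters. word must have room for at least 16 bytes.
 * Returns sscanf's result: the number of values stored. */
int split_number_word(const char *input, int *number, char *word)
{
    return sscanf(input, "%d%15s", number, word);
}
```

For the input "12foo" this should return 2, with 12 in the int and "foo" in the word. For the input "foo" it should return 0, because %d fails before anything is stored.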
Any whitespace character in the format string reads and consumes all available whitespace characters at this point in the input stream, including spaces, tabs, and newlines. It doesn't matter whether the space appears before %c or %d or %s: All whitespace (including newlines) in the input is skipped.
%c accepts spaces just fine. Space is not a delimiter because %c has no delimiters; it always reads a single character. The only reason it can't read a space in your code is that it is preceded by a space in the format string, which will have skipped over all available whitespace.
As for %d and %s, they implicitly skip leading whitespace. That is, " %d" is equivalent to "%d" and " %s" is equivalent to "%s".
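A small sketch can show the effect of the space in front of %c. The helper read_char is hypothetical and uses sscanf on a string so that the behaviour is easy to reproduce; the format is passed as a parameter only for the sake of the comparison.

```c
#include <stdio.h>

/* Hypothetical helper: returns the character that format stores from input,
 * or '\0' if nothing was converted. format must contain exactly one %c
 * and no other conversion that stores a value. */
char read_char(const char *format, const char *input)
{
    char ch = '\0';

    if (sscanf(input, format, &ch) != 1)
        return '\0';
    return ch;
}
```

read_char("%c", " \n x") should return the leading space, because %c does not skip whitespace, while read_char(" %c", " \n x") should return 'x', because the space in the format consumes the space, the newline and the second space first.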
Here is my c code:
int main()
{
int a;
for (int i = 0; i < 3; i++)
scanf("%d ", &a);
return 0;
}
When I input something like 1 2 3, it asks me for more input, and I have to type something that is not ' '.
However, when I change it to the following (or to anything else that is not ' ')
scanf("%d !", &a);
and input 1 ! 2! 3!, it does not ask for more input.
The final space in scanf("%d ", &a); instructs scanf to consume all white space following the number. It will keep reading from stdin until you type something that is not white space. Simplify the format this way:
scanf("%d", &a);
scanf will still ignore white space before the numbers.
Conversely, the format "%d !" consumes any white space following the number and a single !. It stops scanning when it gets this character, or another non space character which it leaves in the input stream. You cannot tell from the return value whether it matched the ! or not.
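One possible workaround for not being able to tell whether the ! matched is to add a %n after it and check whether it was reached. This is not part of the answer above; it is a sketch under that assumption, and the helper name read_int_bang is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if input holds an int followed by '!',
 * 0 if the int was read but '!' was missing, and -1 if no int was read. */
int read_int_bang(const char *input, int *value)
{
    int end = -1;   /* stays -1 unless the scan gets as far as %n */

    if (sscanf(input, "%d !%n", value, &end) != 1)
        return -1;
    return end >= 0;
}
```

sscanf returns 1 for both "1 !" and "1 ?", since %n is not counted, but end is only set in the first case. So read_int_bang should return 1 for "1 !" and for "2!", 0 for "3 ?", and -1 for "x".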
scanf is very clunky, it is very difficult to use it correctly. It is often better to read a line of input with fgets() and parse that with sscanf() or even simpler functions such as strtol(), strspn() or strcspn().
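As a minimal sketch of the fgets() plus strtol() approach, assuming one integer per line, the parsing step could look like this. The helper name parse_int_line and its exact rules (base 10, only white space allowed after the number) are assumptions for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses one base-10 int from a line that was
 * already read, for example with fgets().
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int_line(const char *line, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line)
        return 0;                       /* no digits found */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                       /* does not fit in an int */
    end += strspn(end, " \t\r\n");      /* allow trailing white space */
    if (*end != '\0')
        return 0;                       /* trailing garbage such as "12foo" */
    *out = (int)value;
    return 1;
}
```

A caller would typically read a line with fgets(line, sizeof line, stdin) and pass it to parse_int_line(line, &a). Because fgets() returns as soon as the line ends, the program does not keep waiting for more input the way scanf("%d ", &a) does.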
scanf("%d", &a);
This should do the job.
Basically, scanf() consumes stdin input for as long as it matches the pattern. If you pass "%d" as the pattern, it will stop reading input after an integer is found. However, if you feed it "%dx", for example, it matches an integer followed by the character 'x'.
More Details:
Your pattern string could have the following characters:
Whitespace character: the function will read and ignore any whitespace
characters encountered before the next non-whitespace character
(whitespace characters include spaces, newline and tab characters --
see isspace). A single whitespace in the format string validates any
quantity of whitespace characters extracted from the stream (including
none).
Non-whitespace character, except format specifier (%): Any character that is not either a whitespace character (blank, newline or
tab) or part of a format specifier (which begin with a % character)
causes the function to read the next character from the stream,
compare it to this non-whitespace character and if it matches, it is
discarded and the function continues with the next character of
format. If the character does not match, the function fails, returning
and leaving subsequent characters of the stream unread.
Format specifiers: A sequence formed by an initial percentage sign (%) indicates a format specifier, which is used to specify the type
and format of the data to be retrieved from the stream and stored into
the locations pointed by the additional arguments.
Source: http://www.cplusplus.com/reference/cstdio/scanf/
Can anyone suggest what is the meaning of a in the following call to scanf?
scanf("%d a %f",&i,&f)
In a scanf format string, a '%' introduces a conversion specifier, which describes the kind of value to read into the corresponding argument.
For instance, %d reads an integer into an int variable, whereas %f reads a floating-point number into a float variable.
Characters which are not preceded by a % (or a \, which indicates an escape sequence) are taken literally, so, in your case, the scanf string "%d a %f" would match "233 a 4.5" but would not match "233 b 4.5".
(To be more accurate, a whitespace character matches any contiguous sequence of whitespace characters.)
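The matching of the literal a can be checked with a sketch like the following, which uses sscanf on a string instead of scanf on stdin. The helper name scan_int_a_float is an assumption made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: applies "%d a %f" to input and returns how many
 * values were stored (0 to 2). */
int scan_int_a_float(const char *input, int *i, float *f)
{
    return sscanf(input, "%d a %f", i, f);
}
```

For "233 a 4.5" this should return 2, storing 233 and 4.5. For "233 b 4.5" it should return 1, because the scan stops at the b after storing the integer. "233a4.5" should also return 2, since each space in the format matches any amount of whitespace, including none.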
scanf("%d a %f",&i,&f)
means you have to type the data in this format: 25 a 33.3
Then when you print it using
printf("i=%d f=%f",i,f);
you get output such as
i=25 f=33.299999 (33.3 cannot be represented exactly as a float)
The & in the line scanf("%d a %f",&i,&f) does not stop you from getting the value of the variable f.
The & passes the address of the variable in memory, which is what scanf needs in order to store the converted value there. Do not remove the '&'s: passing the variables themselves instead of their addresses is undefined behavior.
And for the a:
Non-whitespace character, except format specifier (%): Any character that is not either a whitespace character (blank, newline or tab) or part of a format specifier (which begin with a % character) causes the function to read the next character from the stream, compare it to this non-whitespace character and if it matches, it is discarded and the function continues with the next character of format. If the character does not match, the function fails, returning and leaving subsequent characters of the stream unread.
Which means the input is expected in this form:
a decimal integer (%d)
then any amount of whitespace (including none)
then the character 'a'
then any amount of whitespace (including none)
then the floating point number (%f).
Reference: http://www.cplusplus.com/reference/cstdio/scanf/