What is the difference between %[^\n] and %[^\n]s? - c

Why do some writers use the %[^\n]s specifier instead of %[^\n]? Which one is correct?

If you want to read in a sequence of non-newline characters, %[^\n] is correct.
The %[ format specifier to scanf accepts a sequence of characters from the set listed between [ and ]. If the first character of the set is ^, it instead accepts characters that are not in the set. The characters that are read are then stored, followed by a terminating null character, in the given char * parameter.
%[^\n]s is the above format specifier followed by a literal 's'. The s is not part of the %[ format specifier. So this will read characters until it encounters a newline and put those characters in the given char *, then it will attempt to read an s character which it doesn't find because a newline is next.
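A minimal sketch of the point above, using sscanf so the input is fixed. The function names read_plain, read_with_s and read_then_int, and the 64-byte buffer size, are made up for illustration.

```c
#include <stdio.h>

/* Plain scanset: reads up to 63 non-newline characters into out. */
int read_plain(const char *input, char out[64])
{
    return sscanf(input, "%63[^\n]", out);
}

/* Same scanset followed by a literal 's' directive. The 's' is matched
   only after the assignment, so a failure there does not change the
   string that was stored or the return value. */
int read_with_s(const char *input, char out[64])
{
    return sscanf(input, "%63[^\n]s", out);
}

/* Any directive placed after the stray 's' is never reached: the scanset
   stops at the newline, the 's' fails to match it, and scanning ends. */
int read_then_int(const char *input, char out[64], int *n)
{
    return sscanf(input, "%63[^\n]s %d", out, n);
}
```

For the input "hello world\nnext", both read_plain and read_with_s return 1 and store "hello world". For "abc\n5", read_then_int also returns 1 and leaves *n untouched, which is where the extra s starts to cause trouble.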

Related

Question about format string in scanf function

For each of the following pairs of scanf format strings, indicate whether or not the two strings are equivalent. If they're not, show how they can be distinguished:
(b) "%d-%d-%d" versus "%d -%d -%d"
So in this case, my answer was that they are not equivalent. A non-white-space character in the format string (other than a conversion specifier, which starts with %) must match the next input character exactly, so it will not match if the input has spaces in front of it. So in the first case, no spaces are allowed after the first and second integers, while in the second case, any number of spaces is allowed after the first two integers.
But I saw that the book had a different answer. It said that they were both equivalent to each other.
Is this a mistake in the book? Or have I misunderstood how format strings work in the scanf function?
The book is wrong. As per the specification of the scanf():
Whitespace character: the function will read and ignore any whitespace characters encountered before the next non-whitespace character (whitespace characters include spaces, newline and tab characters -- see isspace). A single whitespace in the format string validates any quantity of whitespace characters extracted from the stream (including none).
Non-whitespace character, except format specifier (%): Any character that is not either a whitespace character (blank, newline or tab) or part of a format specifier (which begin with a % character) causes the function to read the next character from the stream, compare it to this non-whitespace character and if it matches, it is discarded and the function continues with the next character of format. If the character does not match, the function fails, returning and leaving subsequent characters of the stream unread.
So in the first case, once scanf has processed the first %d and read an integer, the next directive is -, which means scanf expects the next character in the stream to be exactly the non-whitespace character - and not a whitespace character. So 1- 2 is legal input, but 1 -2 is not.
In the second case, after the first %d, scanf skips any whitespace and then reaches the non-whitespace character -, so by the above definitions it accepts the input 1 - 2.
"%d-%d-%d" differs from "%d -%d -%d" and the difference has nothing to do with "%d".
Format "-" scans over input "-" and stops on the first space of input " -".
Format " -" scans over inputs "-" and " -" as the " " in the format matches 0 or more white-space characters in the input.
A directive composed of white-space character(s) is executed by reading input up to the first nonwhite-space character (which remains unread), or until no more characters can be read. The directive never fails. C17dr § 7.21.6.2 5
Had the question been: "%d-%d-%d" versus "%d- %d- %d",
These 2 are functionally identical.
We would need to dive into arcane stdin input errors to divine a potential difference.
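The three format strings discussed in this thread can be compared with sscanf. This is only a sketch; the helper names scan_tight, scan_loose and scan_after are invented here, and each one returns the number of integers converted.

```c
#include <stdio.h>

/* "%d-%d-%d": the '-' must follow the integer immediately. */
int scan_tight(const char *input)
{
    int a, b, c;
    return sscanf(input, "%d-%d-%d", &a, &b, &c);
}

/* "%d -%d -%d": the space skips any whitespace before each '-'. */
int scan_loose(const char *input)
{
    int a, b, c;
    return sscanf(input, "%d -%d -%d", &a, &b, &c);
}

/* "%d- %d- %d": the space is redundant, because %d already skips
   leading whitespace on its own. */
int scan_after(const char *input)
{
    int a, b, c;
    return sscanf(input, "%d- %d- %d", &a, &b, &c);
}
```

All three return 3 for "1-2-3". For "1 -2 -3", scan_loose returns 3 while scan_tight and scan_after return 1, because the literal - meets a space. For "1- 2- 3", scan_tight and scan_after both return 3.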

What is the specifier %[^s] used for?

What is the specifier %[^s] used for?
s is a variable.
In which cases can I use this specifier?
The %[ format specifier to scanf will match a sequence of characters matching those that are listed between [ and ]. If the first character is ^, then it matches characters excluding those characters.
In your case %[^s] means "match a sequence of characters other than the character s". s is not a variable in this case.
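As a small illustration of that answer, the sketch below reads everything up to the first s in a string. The name read_until_s and the 32-byte buffer are assumptions made for the example.

```c
#include <stdio.h>

/* Copies the leading run of characters that are not 's' into out
   (at most 31 characters plus the terminating null).
   Returns 1 on success, or 0 if the input starts with 's', since the
   scanset must match at least one character. */
int read_until_s(const char *input, char out[32])
{
    return sscanf(input, "%31[^s]", out);
}
```

With the input "hello shell" this stores "hello " (including the space) and returns 1. With "stop" nothing is stored and the result is 0.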

What does scanf("%*[\n] %[^\n]", input_string); do?

I am not able to understand the difference. I use %[^\n]s to read phrases from the user, but this did not work when I needed to read two phrases. The format above did. Please help me understand the difference.
The %[\n] directive tells scanf() to match newline characters, and the * flag signals that no assignment should be made, so %*[\n] skips over any leading newline characters (assuming there is at least one leading \n character: more on this in a moment). There is a space following this first directive, so zero or more whitespace characters are skipped before the final %[^\n] directive, which matches characters until a newline is encountered. These are stored in input_string[], and the newline character is left behind in the input stream. Subsequent calls using this format string will skip over this remaining newline character.
But, there is probably no need for the %*[\n] directive here, since \n is a whitespace character; almost the same thing could be accomplished with a leading space in the format string: " %[^\n]".
One difference between the two: "%*[\n] %[^\n]" expects there to be a newline at the beginning of the input, and without this the match fails and scanf() returns without making any assignments, while " %[^\n]" does not expect a leading newline, or even a leading whitespace character (but skips them if present).
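That difference can be checked with sscanf. The following is a sketch with invented helper names (read_expecting_newline, read_skipping_space); the width 99 assumes a 100-byte buffer.

```c
#include <stdio.h>

/* "%*[\n] %[^\n]": fails with no assignment unless the input begins
   with at least one newline. */
int read_expecting_newline(const char *input, char out[100])
{
    return sscanf(input, "%*[\n] %99[^\n]", out);
}

/* " %[^\n]": skips leading whitespace if present, but does not
   require any. */
int read_skipping_space(const char *input, char out[100])
{
    return sscanf(input, " %99[^\n]", out);
}
```

For "hello\n", read_expecting_newline returns 0 and read_skipping_space returns 1. For "\n\n  hello\n", both return 1 and store "hello".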
If you used "%[^\n]" instead, as suggested in the body of the question (note that the trailing s is not a part of the scanset directive), the first call to scanf() would match characters until a newline is encountered. The matching characters would be stored in input_string[], and the newline would remain in the input stream. Then, if scanf() is called again with this format string, no characters would be matched before encountering the newline, so the match would fail without assignment.
Please note that you should always specify a maximum width when using %s or %[] in a scanf() format string to avoid buffer overflow. With either of %s or %[], scanf() automatically adds the \0 terminator, so you must be sure to allow room for this. For an array of size 100, the maximum width should be 99, so that at most 99 characters are matched and stored in the array before the null terminator is added. For example: " %99[^\n]".
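To see why the second read fails with "%[^\n]" but works with " %[^\n]", here is a sketch that simulates two consecutive calls on one input. It uses sscanf plus %n (which records how many characters were consumed) in place of a real stream; the function names and the 100-byte buffers are assumptions for the example.

```c
#include <stdio.h>

/* Two reads with "%99[^\n]". Returns how many phrases were read. */
int read_two_plain(const char *input, char first[100], char second[100])
{
    int used = 0;
    if (sscanf(input, "%99[^\n]%n", first, &used) != 1)
        return 0;
    /* The next unread character is the '\n', so the scanset matches
       nothing and this call fails without assigning. */
    if (sscanf(input + used, "%99[^\n]", second) != 1)
        return 1;
    return 2;
}

/* Two reads with " %99[^\n]": the leading space consumes the
   newline left behind by the previous read. */
int read_two_skipping(const char *input, char first[100], char second[100])
{
    int used = 0;
    if (sscanf(input, " %99[^\n]%n", first, &used) != 1)
        return 0;
    if (sscanf(input + used, " %99[^\n]", second) != 1)
        return 1;
    return 2;
}
```

Given "first phrase\nsecond phrase\n", read_two_plain returns 1 and read_two_skipping returns 2, with "second phrase" in the second buffer. The width 99 keeps each read inside its 100-byte array.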
In the scanf function, the '*' flag tells the function to match the input for a directive but discard it instead of storing it.
%*[\n]
This tells the function to match and discard one or more leading '\n' characters; the rest of the format then accepts any string.
Run the code and first give "ENTER" as input and then give "I am feeling great!!!"
Now print the buffer. You will get I am feeling great!!! as output
Try this code snippet
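#include <stdio.h>

int main(void)
{
    char buffer[100] = "";
    printf("Enter a string:");
    /* Matches only if the input begins with '\n' (press ENTER first). */
    scanf("%*[\n] %99[^\n]", buffer);
    printf("buffer:%s\n", buffer);
    return 0;
}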
%[^\n] is a scanset conversion for scanf() that can be used as an alternative to gets(str).
Unlike gets(str), scanf() with %s cannot read more than one word.
Using %[^\n], scanf() can read a string that contains whitespace.
It stops reading input from the user when it encounters a newline character.

what do modifiers like whitespace do in scanf?

#include <stdio.h>
int main(void)
{
    char a, b;
    printf("enter a,b\n");
    scanf("%c %c", &a, &b);
    printf("a is %c,b is %c\n", a, b);
    return 0;
}
1.what does the whitespace in between the two format specifiers tell the computer to do?
2.do format specifiers like %d other than %c clean input buffer before they read from there?
1.what does the whitespace in between the two format specifiers tell the computer to do?
Whitespace in the format string tells scanf to read (and discard) whitespace characters up to the first non-whitespace character, which remains unread (see the N1570 excerpt below). So
scanf("%c %c",&a,&b);
reads a single character into a (whitespace or not), then skips over any whitespace and reads the next non-whitespace character into b.
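A quick way to see the effect of that space is to compare "%c %c" with "%c%c" on the same input. This is a sketch using sscanf; the helper names are made up.

```c
#include <stdio.h>

/* "%c %c": the space skips any whitespace before the second character. */
int scan_with_space(const char *input, char *a, char *b)
{
    return sscanf(input, "%c %c", a, b);
}

/* "%c%c": the second %c takes whatever character comes next,
   whitespace included. */
int scan_without_space(const char *input, char *a, char *b)
{
    return sscanf(input, "%c%c", a, b);
}
```

For the input "x\ny", scan_with_space stores 'x' and 'y', while scan_without_space stores 'x' and '\n'. For " x y", scan_with_space stores ' ' and 'x', because the first %c does not skip the leading space.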
2.do format specifiers like %d other than %c clean input buffer before they read from there?
Not sure quite what you mean here - %d will skip over any leading whitespace and start reading from the first non-whitespace character, while %c will read the next character whether it's whitespace or not. Neither will flush the input stream, nor will they write to the target variable if the directive fails (for example, if the next non-whitespace character in the input stream isn't a digit or a sign, the %d directive fails, and the argument corresponding to that directive will not be updated).
N1570, §7.21.6.2, para 5:
"A directive composed of white-space character(s) is executed by reading input up to the
first non-white-space character (which remains unread), or until no more characters can
be read. The directive never fails."
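The behaviour described in this answer for %d and %c can be sketched as follows. The wrappers scan_int and scan_char are hypothetical names that just forward to sscanf.

```c
#include <stdio.h>

/* %d skips leading whitespace. On a matching failure it returns 0
   and leaves *value as it was. */
int scan_int(const char *input, int *value)
{
    return sscanf(input, "%d", value);
}

/* %c reads the very next character, whitespace or not. */
int scan_char(const char *input, char *c)
{
    return sscanf(input, "%c", c);
}
```

With " \n\t 42", scan_int returns 1 and stores 42, whereas scan_char returns 1 and stores the leading space. With "abc", scan_int returns 0 and the int keeps its previous value.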
Wikipedia says
whitespace: Any whitespace characters trigger a scan for zero or more
whitespace characters. The number and type of whitespace characters do
not need to match in either direction.
"%d" will skip whitespace until it finds an integer.
"%c" reads a single character (and space is a character, so it doesn't skip).

Reading tabs in C

I am trying to sscanf from a file. The pattern I am trying to match is the following
"%s\t%s\t%s\t%f"
Thing is that I am surprised because for an input like following:
Hello Hola Hallo 5.344434
it is reading all of the data properly...
Do you know why?
I was expecting it to require tabs between the fields, like |---|---|---|---|, and not to match when only a single space separates them.
Thanks
The standard reads:
A directive composed of white-space character(s) is executed by
reading input up to the first non-white-space character (which remains
unread), or until no more characters can be read.
In other words, a sequence of white-space characters (space, tab, newline, etc.; as defined by isspace()) in the format string matches any amount of white space in the input.
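The following sketch applies the format from the question to differently separated inputs. The wrapper name scan_tabs and the 32-byte buffers (hence the %31s widths) are assumptions added for safety; they are not part of the original format.

```c
#include <stdio.h>

/* Uses the question's format with widths added. Each \t in the format
   matches any amount of whitespace, and %s skips leading whitespace by
   itself anyway, so tabs are not required in the input. */
int scan_tabs(const char *input, char w1[32], char w2[32], char w3[32],
              float *value)
{
    return sscanf(input, "%31s\t%31s\t%31s\t%f", w1, w2, w3, value);
}
```

This returns 4 for "Hello Hola Hallo 5.344434", for the tab-separated equivalent, and even when the fields are separated by newlines or runs of spaces.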
No way - scanf treats all white-space characters identically. They are used as delimiters and are otherwise just ignored. So if you really want to do something specific with tabs, you should parse the input yourself.
To parse it yourself, you first need to read the whole line unparsed, so you need to use fgets rather than scanf.
FILE *fp = /* init.. */;
char buf[1024];
fgets(buf, 1024, fp);
// parse yourself!
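One possible way to fill in the "parse yourself" step is to split the line on tab characters only. This is a sketch, not the only approach; split_tabs and read_tab_fields are invented names, and the splitting modifies the buffer in place.

```c
#include <stdio.h>
#include <string.h>

/* Splits line in place on single tab characters and stores up to max
   field pointers in fields. Returns the number of fields stored.
   Fields beyond max are dropped. Spaces are NOT treated as separators. */
int split_tabs(char *line, char *fields[], int max)
{
    int count = 0;
    line[strcspn(line, "\n")] = '\0';   /* drop the newline fgets keeps */
    while (count < max) {
        fields[count++] = line;
        char *tab = strchr(line, '\t');
        if (tab == NULL)
            break;
        *tab = '\0';        /* terminate the current field */
        line = tab + 1;     /* the next field starts after the tab */
    }
    return count;
}

/* Reads one line with fgets and splits it. Returns -1 at end of file
   or on a read error. */
int read_tab_fields(FILE *fp, char *buf, int size, char *fields[], int max)
{
    if (fgets(buf, size, fp) == NULL)
        return -1;
    return split_tabs(buf, fields, max);
}
```

A line such as "Hello\tHola\tHallo\t5.344434\n" yields 4 fields, while the space-separated "Hello Hola Hallo 5.344434\n" yields only 1, which is the strict behaviour the question was expecting from sscanf.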
If you take a look at the documentation for scanf:
C string that contains a sequence of characters that control how characters extracted from the stream are treated:
Whitespace character: the function will read and ignore any whitespace characters
encountered before the next non-whitespace character (whitespace characters include
spaces, newline and tab characters -- see isspace). A single whitespace in the format
string validates any quantity of whitespace characters extracted from the stream
(including none).
Non-whitespace character, except format specifier (%): Any character that is not
either a whitespace character (blank, newline or tab) or part of a format specifier
(which begin with a % character) causes the function to read the next character
from the stream, compare it to this non-whitespace character and if it matches,
it is discarded and the function continues with the next character of format. If the
character does not match, the function fails, returning and leaving subsequent
characters of the stream unread.
Format specifiers: A sequence formed by an initial percentage sign (%) indicates a
format specifier, which is used to specify the type and format of the data to be
retrieved from the stream and stored into the locations pointed by the additional
arguments.
You will notice that the whitespace characters get ignored.
Did you carefully read scanf(3) documentation? You need to read the entire line using getline(3) then parse that line "manually"!

Resources