Meaning of asterisk in scanf format specifier? - c

What does
scanf("%[^\n]%*c", s); and
scanf("%[^\n]", s); mean?
What is the difference?

Please see the specifications for scanf, in particular for the format specifiers.
The first statement, scanf("%[^\n]%*c", s);, reads any characters into the variable called s of type char* until there is a newline character (\n) in the input. This newline character is not read and will remain in the input. It will append a null-terminator (\0) to the string read into s. This concludes the "string reading" part.
The function scanf will then read any character, which in this case must be the newline character, but will not assign its value to any variable as the asterisk (*) is the "assignment-suppressing character *".
The second statement, scanf("%[^\n]", s);, is identical to the first, except that it will leave the newline character in the input.
Note that the standard warnings for using scanf apply here. This function is tricky to use safely. When reading a string, always include a maximum number of characters to read to avoid buffer overflow attacks. E.g. scanf(%16[^\n]", s. This will read a maximum of 15 characters, as it does not include the null-terminator.

scanf("%[^\n]%*c", s);
means to scanf that the field length is provided by an extra parameter in the parameter list, just before the string, so to be correct, it should read as
void f(char *buffer, size_t buffer_sz)
{
scanf("%[^\n]%*c", buffer_sz, buffer);
}
(but I'm afraid that you are going to skip over anything that is not a \n, which probably is not what you wantk, I think that format is incorrect, as you didn't specify the main letter specifier)

Related

Why is this creating two inputs instead of one

https://i.imgur.com/FLxF9sP.png
As shown in the link above I have to input '<' twice instead of once, why is that? Also it seems that the first input is ignored but the second '<' is the one the program recognizes.
The same thing occurs even without a loop too.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
int randomGen, upper, lower, end, newRandomGen;
char answer;
upper = 100;
lower = 1;
end = 1;
do {
srand(time(0));
randomGen = rand()%(upper + lower);
printf("%d\n", randomGen);
scanf("%s\n", &answer);
}while(answer != '=');
}
Whitespace in scanf format strings, like the \n in "%c\n", tries to match any amount of whitespace, and scanf doesn’t know that there’s no whitespace left to skip until it encounters something that isn’t whitespace (like the second character you type) or the end of input. You provide it with =\n, which fills in the %c and waits until the whitespace is over. Then you provide it with another = and scanf returns. The second time around, the character could be anything and it’d still work.
Skip leading whitespace instead (and use the correct specifier for one character, %c, as has been mentioned):
scanf(" %c", &answer);
Also, it’s good practice to make sure you actually succeeded in reading something, especially when failing to read something means leaving it uninitialized and trying to read it later (another example of undefined behaviour). So check scanf’s return value, which should match the number of conversion specifiers you provided:
if (scanf(" %c", &answer) != 1) {
return EXIT_FAILURE;
}
As has been commented, you should not use the scanf format %s if you want to read a single character. Indeed, you should never use the scanf format %s for any purpose, because it will read an arbitrary number of characters into the buffer you supply, so you have no way to ensure that your buffer is large enough. So you should always supply a maximum character count. For example, %1s will read only one character. But note: that will still not work with a char variable, since it reads a string and in C, strings are arrays of char terminated with a NUL. (NUL is the character whose value is 0, also sometimes spelled \0. You could just write it as 0, but don't confuse that with the character '0' (whose value is 48, in most modern systems).
So a string containing a single character actually occupies two bytes: the character itself, and a NUL.
If you just want to read a single character, you could use the format %c. %c has a few differences from %s, and you need to be aware of all of them:
The default maximum length read by %s is "unlimited". The default for %c is 1, so %c is identical to %1c.
%s will put a NUL at the end of the characters read (which you need to leave space for), so the result is a C string. %c does not add the NUL, so you only need to leave enough space for the characters themselves.
%s skips whitespace before storing any characters. %c does not ignore whitespace. Note: a newline character (at the end of each line) is considered whitespace.
So, based on the first two rules, you could use either of the following:
char theShortString[2];
scanf("%1s", theShortString);
char theChar = theShortString[0];
or
char theChar;
scanf("%c", &theChar);
Now, when you used
scanf("%s", &theChar);
you will cause scanf to write a NUL (that is, a zero) in the byte following theChar, which quite possibly is part of a different variable. That's really bad. Don't do that. Ever. Even if you get away with it today, it will get you into serious trouble some time soon.
But that's not the problem here. The problem here is with what comes after the %s format code.
Let's take a minute (ok, maybe half an hour) to read the documentation of scanf, by typing man scanf. What we'll see, quite near the beginning, is: (emphasis added)
A directive is one of the following:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
So when you use "%s\n", scanf will do the following:
skip over any white-space characters in the input buffer.
read the following word up to but not including the next white-space character, and store it in the corresponding argument, followed by a NUL.
skip over any white-space following the word which it just read.
It does the last step because \n — a newline — is itself white-space, as noted in the quote from the manpage.
Now, what you actually typed was < followed by a newline, so the word read at step 2 will be just he character <. The newline you typed afterwards is white-space, so it will be ignored by step 3. But that doesn't satisfy step 3, because scanf (as documented) will ignore "any amount of white space". It doesn't know that there isn't more white space coming. You might, for example, be intending to type a blank line (that is, just a newline), in which case scanf must skip over that newline as well. So scanf keeps on reading.
Since the input buffer is now empty, the I/O library must now read the next line, which it does. And now you type another < followed by a newline. Clearly, the < is not white-space, so scanf leaves it in the input buffer and returns, knowing that it has done its duty.
Your program then checks the word read by scanf and realises that it is not an =. So it loops again, and the scanf executes again. Now there is already data in the input buffer (the second < which you typed), so scanf can immediately store that word. But it will again try to skip "any amount of white space" afterwards, which by the same logic as above will cause it to read a third line of input, which it leaves in the input buffer.
The end result is that you always need to type the next line before the previous line is passed back to your program. Obviously that's not what you want.
So what's the solution? Simple. Don't put a \n at the end of your format string.
Of course, you do want to skip that newline character. But you don't need to skip it until the next call to scanf. If you used a %1s format code, scanf would automatically skip white-space before returning input, but as we've seen above, %c is far simpler if you only want to read a single character. Since %c does not skip white-space before returning input, you need to insert an explicit directive to do so: a white-space character. It's usual to use an actual space rather than a newline for this purpose, so we would normally write this loop as:
char answer;
srand(time(0)); /* Only call srand once, at the beginning of the program */
do {
randomGen = rand()%(upper + lower); /* This is not right */
printf("%d\n", randomGen);
scanf(" %c", &answer);
} while (answer != '=');
scanf("%s\n", &answer);
Here you used the %s flag in the format string, which tells scanf to read as many characters as possible into a pre-allocated array of chars, then a null terminator to make it a C-string.
However, answer is a single char. Just writing the terminator is enough to go out of bounds, causing undefined behaviour and strange mishaps.
Instead, you should have used %c. This reads a single character into a char.

What does scanf("%[^\n]%*c", &s); actually do?

I know for a fact that [^\n] is a delimiter that makes scanf to scan everything until "enter" key is hit. But I don't know what the remainder of "%[^\n]%*c" is used for. Also why do we have to mention "&s" instead of "s" in the scanf function.
I tried running this:
char s[100];
scanf("%[^\n]s",s);
scanf("%[^\n]s",&s);
Both the above scanf statements worked exactly the same for me. If there is any difference between them, What is it?
Also Why should I prefer scanf("%[^\n]%*c", &s); to the above declarations?
scanf("%[^\n]s",&s); has many troubles:
No check of return value.
No overrun protection
No consumption of '\n'
No assignment of s on '\n' only
No need for s in "%[^\n]s"
Wrong type &s with format.
scanf("%[^\n]s",s); only has one less problem (last one).
Use fgets(s, sizeof s, stdin)
if (fgets(s, sizeof s, stdin)) {
s[strcspn(s, "\n")] = 0; // Lop off potential ending \n
// Success
} else {
// failure
}
No check of return value.
Without checking the return value, success of reading is unknown and the state of s indeterminate.
No overrun protection
What happens when the 100th character is inputted? - "Very bad. Not good. Not good." Necron 99, Wizards 1977
No consumption of '\n'
'\n' remains in stdin to foul up the next read.
No assignment of s on '\n' only
If the input begin with '\n', scanf() returns and s unchanged. It is not assigned "".
No need for "s" in "%[^\n]s"
The "s" is not part of the specifier "%[^\n]". Drop it.
Wrong type &s with format.
%[...] matches a char *, not a pointer to a char[100] like &s (UB).
You can take a string as input in C using scanf(“%s”, s). But, it accepts string only until it finds the first space.
In order to take a line as input, you can use scanf("%[^\n]%*c", s); where s is defined as char s[MAX_LEN] where MAX_LEN is the maximum size of s . Here, [] is the scanset character. ^\n stands for taking input until a newline isn't encountered. Then, with this %*c, it reads the newline character and here, the used * indicates that this newline character is discarded.
Note: After inputting the character and the string, inputting the sentence by the above mentioned statement won't work. This is because, at the end of each line, a new line character (\n) is present. So, the statement: scanf("%[^\n]%*c", s); will not work because the last statement will read a newline character from the previous line. This can be handled in a variety of ways and one of them being: scanf("\n"); before the last statement.

What does scanf("%*[\n] %[^\n]", input_string); do?

I am not able to understand the difference. I use %[^\n]s, for taking phrases input from the user. But this was not working when I needed to add two phrases. But the above one did. Please help me understanding me the difference.
The %[\n] directive tells scanf() to match newline characters, and the * flag signals that no assignment should be made, so %*[\n] skips over any leading newline characters (assuming there is at least one leading \n character: more on this in a moment). There is a space following this first directive, so zero or more whitespace characters are skipped before the final %[^\n] directive, which matches characters until a newline is encountered. These are stored in input_string[], and the newline character is left behind in the input stream. Subsequent calls using this format string will skip over this remaining newline character.
But, there is probably no need for the %*[\n] directive here, since \n is a whitespace character; almost the same thing could be accomplished with a leading space in the format string: " %[^\n]".
One difference between the two: "%*[\n] %[^\n]" expects there to be a newline at the beginning of the input, and without this the match fails and scanf() returns without making any assignments, while " %[^\n]" does not expect a leading newline, or even a leading whitespace character (but skips them if present).
If you used "%[^\n]" instead, as suggested in the body of the question (note that the trailing s is not a part of the scanset directive), the first call to scanf() would match characters until a newline is encountered. The matching characters would be stored in input_string[], and the newline would remain in the input stream. Then, if scanf() is called again with this format string, no characters would be matched before encountering the newline, so the match would fail without assignment.
Please note that you should always specify a maximum width when using %s or %[] in a scanf() format string to avoid buffer overflow. With either of %s or %[], scanf() automatically adds the \0 terminator, so you must be sure to allow room for this. For an array of size 100, the maximum width should be 99, so that at most 99 characters are matched and stored in the array before the null terminator is added. For example: " %99[^\n]".
In scanf function, '*' tells the function to ignore a character from input.
%*[\n]
This tells the function to ignore the first '\n' character and then accept any string
Run the code and first give "ENTER" as input and then give "I am feeling great!!!"
Now print the buffer. You will get I am feeling great!!! as output
Try this code snippet
int main()
{
char buffer[100];
printf("Enter a string:"),scanf("%*[\n] %[^\n]', buffer),printf("buffer:%s\n", buffer);
return 0;
}
%[^\n] is an edit conversion code for scanf() as an alternative of gets(str).
Unlike gets(str), scanf() with %s cannot read more than one word.
Using %[^\n], scanf() can read even the string with whitespace.
It will terminate receiving string input from the user when it encounters a newline character.

Difference between return values of scanf("%s", str) and scanf("%[^\n]s", str)

while(scanf("%s",a))
{
if(a[0]=='#')
break;
nextpermutation(a);
}
This while loop perfectly works for scanf("%s", a)
but if I put scanf("%[^\n]s", a) in the while it just runs for one time.
I checked the return value of both scanf and they were the same still.
I didn't get why this is happening...
The battle of two losers: "%s" vs. "%[^\n]s":
Both specifiers are bad and do not belong in robust code. Neither limits user input and can easily overrun the destination buffer.
The s in "%[^\n]s" serves no purpose. #Jonathan Leffler. So "%[^\n]" is reviewed following - still that is bad.
"%s" consumes leading white-space. "%[^\n]" does not #Igor Tandetnik
"%[^\n]" reads nothing if the leading character is a '\n', leaving a unchanged or rarely in an unknown state.
"%[^\n]" reads all characters in except '\n'.
"%s" reads all leading white-space, discards them, then reads and saves all non-white-spaces.
Better to use fgets() to read a line of user input and then parse it.
Example: to read a line of input, including spaces:
char a[100];
if (fgets(a, sizeof a, stdin) == NULL) Handle_EOF_or_Error();
// if the potential trailing \n is not wanted
a[strcspn(a, "\n")] = '\0';
What is that s doing there in %[^\n]s?
Your scanf("%[^\n]s") contains two format specifiers in its format string: %[^\n] followed by a lone s. This will require the input stream to contain s after the sequence matched by %[^\n]. This is, of course, is not very close in behavior to plain %s. Matching sequences for these formats will be very different for that reason alone.
Note that [] is not a modifier for %s format, as you seem to incorrectly believe. [] in scanf is a self-sufficient format specifier. Did you mean just %[^\n] by any chance instead of %[^\n]s?

How does scanf handle white space?

I am new to Cstring, sorry for asking dumb question, please help.
char string[10];
printf("Give me your last name:\n");
scanf ("%s", string); //if i type 123 123
printf("Original string:%s\n", string); //it shows 123
Compare to:
char string[] = "123 123";
printf("Original string:%s\n", string); //it shows 123 123
The problem is scanf. When you use scanf with %s it reads until it hits whitespace.
To read the entire line you may want to consider using fgets().
scanf treats space as the end of the string.
So it stops reading once it encounters a space character.
While in case of c style array \0 is considered the end of the array.
When you initialize,
char string[] = "123 123";
A \0 is implicitly added at the end of 7th character 3, and it marks the end of this array.
scanf scans for a string until any whitespace. If you need to read until a newline, then use fgets():
char string[64];
fgets(string, sizeof(string), stdin);
printf("%s", string); // should now contain any whitespace in the string.
This is the right behaviour. By default scanf reads the string until it reaches a whitespace.
From scanf manpage:
s
Matches a sequence of non-white-space characters;
the next pointer must be a pointer to character array that is long enough
to hold the input sequence and the terminating
null character ('\0'), which is added automatically.
The input string stops at white space or at the maximum field width,
whichever occurs first.
Sometimes gets() also creates some warnings. If you are so determined to use scanf(), you can use it like scanf("%[^\n]", character_array_variable);
#include<stdio.h>
#include<string.h>
void main()
{
char FullName[20];
scanf("%[^\n]s",FullName);
printf("%s",FullName);
}
The output image:
What you see here happens due to the way scanf() works. You can print strings with any character inside (no matter whether it's a whitespace or not). scanf() just stops once it encounters a whitespace as it considers this variable "done".

Resources