Why won't this scanf format-string work? "%[^\n]\n" - c

I've seen a few examples where people give scanf a "%[^\n]\n" format string to read a whole line of user input. If my understanding is correct, this will read every character until a newline character is reached, and then the newline is consumed by scanf (and not included in the resulting input).
But I can't get this to work on my machine. A simple example I've tried:
#include <stdio.h>
int main(void)
{
char input[64];
printf("Enter some input: ");
scanf("%[^\n]\n", input);
printf("You entered %s\n", input);
}
When I run this, I'm prompted for input, I type some characters, I hit Enter, and the cursor goes to the beginning of the next line but the scanf call doesn't finish.
I can hit Enter as many times as I like, and it will never finish.
The only ways I've found to conclude the scanf call are:
enter \n as the first (and only) character at the prompt
enter Ctrl-d as the first (and only) character at the prompt
enter some input, one or more \n, zero or more other characters, and enter Ctrl-d
I don't know if this is machine dependent, but I'm very curious to know what's going on. I'm on OS X, if that's relevant.

According to the documentation for scanf (emphasis mine):
The format string consists of whitespace characters (any single whitespace character in the format string consumes all available consecutive whitespace characters from the input), non-whitespace multibyte characters except % (each such character in the format string consumes exactly one identical character from the input) and conversion specifications.
Thus, your format string %[^\n]\n will first read (and store) an arbitrary number of non-whitespace characters from the input (because of the %[^\n] part) and then, because of the following newline, read (and discard) an arbitrary number of whitespace characters, such as spaces, tabs or newlines.
Thus, to make your scanf stop reading input, you either need to type at least one non-whitespace character after the newline, or else arrange for the input stream to end (e.g. by pressing Ctrl+D on Unix-ish systems).
Instead, to make your code work as you expect, just remove the last \n from the end of your format string (as already suggested by Umamahesh P).
Of course, this will leave the newline still in the input stream. To get rid of it (in case you want to read another line later), you can getc it off the stream, or just append %*c (which means "read one character and discard it") or even %*1[\n] (read one newline and discard it) to the end of your scanf format string.
Ps. Note that your code has a couple of other problems. For example, to avoid buffer overflow bugs, you really should use %63[^\n] instead of %[^\n] to limit the number of characters scanf will read into your buffer. (The limit needs to be one less than the size of your buffer, since scanf will always append a trailing null character.)
Also, the %[ format specifier always expects at least one matching character, and will fail if none is available. Thus, if you press enter immediately without typing anything, your scanf will fail (silently, since you don't check the return value) and will leave your input buffer filled with random garbage. To avoid this, you should a) check the return value of scanf, b) set input[0] = '\0' before calling scanf, or c) preferably both.
Finally, note that, if you just want to read input line by line, it's much easier to just use fgets. Yes, you'll need to strip the trailing newline character (if any) yourself if you don't want it, but that's still a lot easier and safer that trying to use scanf for a job it's not really meant for:
#include <stdio.h>
#include <string.h>
void chomp(char *string) {
int len = strlen(string);
if (len > 0 && string[len-1] == '\n') string[len-1] = '\0';
}
int main(void)
{
char input[64];
printf("Enter some input: ");
fgets(input, sizeof(input), stdin);
chomp(input);
printf("You entered \"%s\".\n", input);
}

Whitespace characters in format of scanf() has an special meaning:
Whitespace character: the function will read and ignore any whitespace
characters encountered before the next non-whitespace character
(whitespace characters include spaces, newline and tab characters --
see isspace). A single whitespace in the format string validates any
quantity of whitespace characters extracted from the stream (including
none).
Thus, "%[^\n]\n" is just equivalent to "%[^\n] ", telling scanf() to ignore all whitespace characters after %[^\n]. This is why all '\n's are ignored until a non-whitespace character is entered, which is happened in your case.
Reference: http://www.cplusplus.com/reference/cstdio/scanf/

Remove the the 2nd new line character and the following is sufficient.
scanf("%[^\n]", input);
To answer the original one,
scanf("%[^\n]\n", input);
This should also work, provided you enter a non white space character after the input. Example:
Enter some input: lkfjdlfkjdlfjdlfjldj
t
You entered lkfjdlfkjdlfjdlfjldj

Related

Why is this creating two inputs instead of one

https://i.imgur.com/FLxF9sP.png
As shown in the link above I have to input '<' twice instead of once, why is that? Also it seems that the first input is ignored but the second '<' is the one the program recognizes.
The same thing occurs even without a loop too.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
int randomGen, upper, lower, end, newRandomGen;
char answer;
upper = 100;
lower = 1;
end = 1;
do {
srand(time(0));
randomGen = rand()%(upper + lower);
printf("%d\n", randomGen);
scanf("%s\n", &answer);
}while(answer != '=');
}
Whitespace in scanf format strings, like the \n in "%c\n", tries to match any amount of whitespace, and scanf doesn’t know that there’s no whitespace left to skip until it encounters something that isn’t whitespace (like the second character you type) or the end of input. You provide it with =\n, which fills in the %c and waits until the whitespace is over. Then you provide it with another = and scanf returns. The second time around, the character could be anything and it’d still work.
Skip leading whitespace instead (and use the correct specifier for one character, %c, as has been mentioned):
scanf(" %c", &answer);
Also, it’s good practice to make sure you actually succeeded in reading something, especially when failing to read something means leaving it uninitialized and trying to read it later (another example of undefined behaviour). So check scanf’s return value, which should match the number of conversion specifiers you provided:
if (scanf(" %c", &answer) != 1) {
return EXIT_FAILURE;
}
As has been commented, you should not use the scanf format %s if you want to read a single character. Indeed, you should never use the scanf format %s for any purpose, because it will read an arbitrary number of characters into the buffer you supply, so you have no way to ensure that your buffer is large enough. So you should always supply a maximum character count. For example, %1s will read only one character. But note: that will still not work with a char variable, since it reads a string and in C, strings are arrays of char terminated with a NUL. (NUL is the character whose value is 0, also sometimes spelled \0. You could just write it as 0, but don't confuse that with the character '0' (whose value is 48, in most modern systems).
So a string containing a single character actually occupies two bytes: the character itself, and a NUL.
If you just want to read a single character, you could use the format %c. %c has a few differences from %s, and you need to be aware of all of them:
The default maximum length read by %s is "unlimited". The default for %c is 1, so %c is identical to %1c.
%s will put a NUL at the end of the characters read (which you need to leave space for), so the result is a C string. %c does not add the NUL, so you only need to leave enough space for the characters themselves.
%s skips whitespace before storing any characters. %c does not ignore whitespace. Note: a newline character (at the end of each line) is considered whitespace.
So, based on the first two rules, you could use either of the following:
char theShortString[2];
scanf("%1s", theShortString);
char theChar = theShortString[0];
or
char theChar;
scanf("%c", &theChar);
Now, when you used
scanf("%s", &theChar);
you will cause scanf to write a NUL (that is, a zero) in the byte following theChar, which quite possibly is part of a different variable. That's really bad. Don't do that. Ever. Even if you get away with it today, it will get you into serious trouble some time soon.
But that's not the problem here. The problem here is with what comes after the %s format code.
Let's take a minute (ok, maybe half an hour) to read the documentation of scanf, by typing man scanf. What we'll see, quite near the beginning, is: (emphasis added)
A directive is one of the following:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
So when you use "%s\n", scanf will do the following:
skip over any white-space characters in the input buffer.
read the following word up to but not including the next white-space character, and store it in the corresponding argument, followed by a NUL.
skip over any white-space following the word which it just read.
It does the last step because \n — a newline — is itself white-space, as noted in the quote from the manpage.
Now, what you actually typed was < followed by a newline, so the word read at step 2 will be just he character <. The newline you typed afterwards is white-space, so it will be ignored by step 3. But that doesn't satisfy step 3, because scanf (as documented) will ignore "any amount of white space". It doesn't know that there isn't more white space coming. You might, for example, be intending to type a blank line (that is, just a newline), in which case scanf must skip over that newline as well. So scanf keeps on reading.
Since the input buffer is now empty, the I/O library must now read the next line, which it does. And now you type another < followed by a newline. Clearly, the < is not white-space, so scanf leaves it in the input buffer and returns, knowing that it has done its duty.
Your program then checks the word read by scanf and realises that it is not an =. So it loops again, and the scanf executes again. Now there is already data in the input buffer (the second < which you typed), so scanf can immediately store that word. But it will again try to skip "any amount of white space" afterwards, which by the same logic as above will cause it to read a third line of input, which it leaves in the input buffer.
The end result is that you always need to type the next line before the previous line is passed back to your program. Obviously that's not what you want.
So what's the solution? Simple. Don't put a \n at the end of your format string.
Of course, you do want to skip that newline character. But you don't need to skip it until the next call to scanf. If you used a %1s format code, scanf would automatically skip white-space before returning input, but as we've seen above, %c is far simpler if you only want to read a single character. Since %c does not skip white-space before returning input, you need to insert an explicit directive to do so: a white-space character. It's usual to use an actual space rather than a newline for this purpose, so we would normally write this loop as:
char answer;
srand(time(0)); /* Only call srand once, at the beginning of the program */
do {
randomGen = rand()%(upper + lower); /* This is not right */
printf("%d\n", randomGen);
scanf(" %c", &answer);
} while (answer != '=');
scanf("%s\n", &answer);
Here you used the %s flag in the format string, which tells scanf to read as many characters as possible into a pre-allocated array of chars, then a null terminator to make it a C-string.
However, answer is a single char. Just writing the terminator is enough to go out of bounds, causing undefined behaviour and strange mishaps.
Instead, you should have used %c. This reads a single character into a char.

What does scanf("%*[\n] %[^\n]", input_string); do?

I am not able to understand the difference. I use %[^\n]s, for taking phrases input from the user. But this was not working when I needed to add two phrases. But the above one did. Please help me understanding me the difference.
The %[\n] directive tells scanf() to match newline characters, and the * flag signals that no assignment should be made, so %*[\n] skips over any leading newline characters (assuming there is at least one leading \n character: more on this in a moment). There is a space following this first directive, so zero or more whitespace characters are skipped before the final %[^\n] directive, which matches characters until a newline is encountered. These are stored in input_string[], and the newline character is left behind in the input stream. Subsequent calls using this format string will skip over this remaining newline character.
But, there is probably no need for the %*[\n] directive here, since \n is a whitespace character; almost the same thing could be accomplished with a leading space in the format string: " %[^\n]".
One difference between the two: "%*[\n] %[^\n]" expects there to be a newline at the beginning of the input, and without this the match fails and scanf() returns without making any assignments, while " %[^\n]" does not expect a leading newline, or even a leading whitespace character (but skips them if present).
If you used "%[^\n]" instead, as suggested in the body of the question (note that the trailing s is not a part of the scanset directive), the first call to scanf() would match characters until a newline is encountered. The matching characters would be stored in input_string[], and the newline would remain in the input stream. Then, if scanf() is called again with this format string, no characters would be matched before encountering the newline, so the match would fail without assignment.
Please note that you should always specify a maximum width when using %s or %[] in a scanf() format string to avoid buffer overflow. With either of %s or %[], scanf() automatically adds the \0 terminator, so you must be sure to allow room for this. For an array of size 100, the maximum width should be 99, so that at most 99 characters are matched and stored in the array before the null terminator is added. For example: " %99[^\n]".
In scanf function, '*' tells the function to ignore a character from input.
%*[\n]
This tells the function to ignore the first '\n' character and then accept any string
Run the code and first give "ENTER" as input and then give "I am feeling great!!!"
Now print the buffer. You will get I am feeling great!!! as output
Try this code snippet
int main()
{
char buffer[100];
printf("Enter a string:"),scanf("%*[\n] %[^\n]', buffer),printf("buffer:%s\n", buffer);
return 0;
}
%[^\n] is an edit conversion code for scanf() as an alternative of gets(str).
Unlike gets(str), scanf() with %s cannot read more than one word.
Using %[^\n], scanf() can read even the string with whitespace.
It will terminate receiving string input from the user when it encounters a newline character.

What is the difference between these two scanf statements?

I am having some doubt. The doubt is
What is the difference between the following two scanf statements.
scanf("%s",buf);
scanf("%[^\n]", buf);
If I am giving the second scanf in the while loop, it is going infinitely. Because the \n is in the stdin.
But in the first statement, reads up to before the \n. It also will not read the \n.
But The first statement does not go in infinitely. Why?
Regarding the properties of the %s format specifier, quoting C11 standrad, chapter §7.21.6.2, fscanf()
s Matches a sequence of non-white-space characters.
The newline is a whitespace character, so only a newlinew won't be a match for %s.
So, in case the newline is left in the buffer, it does not scan the newline alone, and wait for the next non-whitespace input to appear on stdin.
The %s format specifier specifies that scanf() should read all characters in the standard input buffer stdin until it encounters the first whitespace character, and then stop there. The whitespace ('\n') remains in the stdin buffer until consumed by another function, like getchar().
In the second case there is no mention of stopping.
You can think of scanf as extracting words separated by whitespace from a stream of characters. Imagine reading a file which contains a table of numbers, for example, without worrying about the exact number count per line or the exact space count and nature between numbers.
Whitespace, for the record, is horizontal and vertical (these exist) tabs, carriage returns, newlines, form feeds and last not least, actual spaces.
In order to free the user from details, scanf treats all whitespace the same: It normally skips it until it hits a non-whitespace and then tries to convert the character sequence starting there according to the specified input conversion. E.g. with "%d" it expects a sequence of digits, perhaps preceded by a minus sign.
The input conversion "%s" also starts with skipping whitespace (and that's clearer documented in the opengroup's man page than in the Linux one).
After skipping leading whitespace, "%s" accepts everything until another whitespace is read (and put back in the input, because it isn't made part of the "word" being read). That sequence of non-whitespace chars -- basically a "word" -- is stored in the buffer provided. For example, scanning a string from " a bc " results in skipping 3 spaces and storing "a" in the buffer. (The next scanf would skip the intervening space and put "bc" in the buffer. The next scanf after that would skip the remaining whitespace, encounter the end of file and return EOF.) So if a user is asked to enter three words they could give three words on one line or on three lines or on any number of lines preceded or separated by any number of empty lines, i.e. any number of subsequent newlines. Scanf couldn't care less.
There are a few exceptions to the "skip leading whitespace" strategy. Both concern conversions which usually indicate that the user wants to have more control about the input conversion. One of them is "%c" which just reads the next character. The other one is the "%[" spec which details exactly which characters are considered part of the next "word" to read. The conversion specification you use, "%[^\n]", reads everything except newline. Input from the keyboard is normally passed to a program line by line, and each line is by definition terminated by a newline. The newline of the first line passed to your program will be the first character from the input stream which does not match the conversion specification. Scanf will read it, inspect it and then put it back in the input stream (with ungetc()) for somebody else to consume. Unfortunately, it will itself be the next consumer, in another loop iteration (as I assume). Now the very first character it encounters (the newline) does not match the input conversion (which demands anything but the newline). Scanf therefore gives up immediately, puts the offending character dutifully back in the input for somebody else to consume and returns 0 indicating the failure to even perfom the very first conversion in the format string. But alas, it itself will be the next consumer. Yes, machines are stupid.
First scanf("%s",buf); scan only word or string, but second scanf("%[^\n]", buf); reads a string until a user inputs is new line character.
Let's take a look at these two code snippets :
#include <stdio.h>
int main(void){
char sentence[20] = {'\0'};
scanf("%s", sentence);
printf("\n%s\n", sentence);
return 0;
}
Input : Hello, my name is Claudio.
Output : Hello
#include <stdio.h>
int main(void){
char sentence[20] = {'\0'};
scanf("%[^\n]", sentence);
printf("\n%s\n", sentence);
return 0;
}
Input : Hello, my name is Claudio.
Output : Hello, my name is Claudio.
%[^\n] is an inverted group scan and this is how I personally use it, as it allows me to input a sentece with blank spaces in it.
Common
Both expect buf to be a pointer to a character array. Both append a null character to that array if at least 1 character was saved. Both return 1 if something was saved. Both return EOF if end-of-file detected before saving anything. Both return EOF in input error is detected. Both may save buf with embedded '\0' characters in it.
scanf("%s",buf);
scanf("%[^\n]", buf);
Differences
"%s" 1) consumes and discards leading white-space including '\n', space, tab, etc. 2) then saves non-white-space to buf until 3) a white-space is detected (which is then put back into stdin). buf will not contain any white-space.
"%[^\n]" 1) does not consume and discards leading white-space. 2) it saves non-'\n' characters to buf until 3) a '\n' is detected (which is then put back into stdin). If the first character read is a '\n', then nothing is saved in buf and 0 is returned. The '\n' remains in stdin and explains OP's infinite loop.
Failure to test the return value of scanf() is a common code oversight. Better code checks the return value of scanf().
IMO: code should never use either:
Both fail to limit the number of characters read. Use fgets().
You can think of %s as %[^\n \t\f\r\v], that is, after skipping any leading whitespace, a group a non-whitespace characters.

scanf() behaviour for strings with more than one word

Well I've been programming in C for quite a while now, and there is this question about the function scanf()
here is my problem:
I know that every element in ASCII table is a character and I even know that %s is a data specified for a string which is a collection of characters
My questions:
1.why does scanf() stops scanning after we press enter. If enter is also character why cant it be added as a component of the string that is being scanned.
2.My second question and what I require the most is why does it stops scanning after a space, when space is again a character?
Note: My question is not about how to avoid these but how does this happen
I'd be happy if this is already addressed, I'd gladly delete my question and even if I've presumed something wrong please let me know
"why does scanf() stops scanning after we press enter." is not always true.
The "%s" directs scanf() as follows
char buffer[100];
scanf("%s", buffer);
Scan and consume all white-space including '\n' generated from multiple Enters. This data is not saved.
Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier C11dr §7.21.6.2 8
Scan and save all non-white-space characters. Continue doing so until a white-space is encountered.
Matches a sequence of non-white-space characters §7.21.6.2 12
This white-space is put back into stdin for the next input function. (OP's 2nd question)
A null character is appended to buffer.
Operations may stop short if EOF occurs.
If too much data is save in buffer, it is UB.
If some non-white-space data is saved, return 1. If EOF encountered, return EOF.
Note: stdin is usually line buffered, so no keyboard data is given to stdin until a '\n' occurs.
From my reading of your question, both of your numbered questions are the same:
Why does scanf with a format specifier of %s stop reading after encountering a space or newline.
And the answer to both of your questions is: Because that is what scanf with the %s format specifier is documented to do.
From the documentation:
%s Matches a sequence of bytes that are not white-space characters.
A space and a newline character (generated by the enter key) are white-space characters.
I made miniprogram with scanf for get multiple name without stop on space or ever enter.
i use while
Scanf("%s",text);
While (1)
{
Scanf("%s",text1)
If (text1=='.'){break;}
//here i simple add text1 to text
}
This way i get one line if use the .
Now i use
scanf("%[^\n]",text);
It work great.

Using scanf in for loop

Here is my c code:
int main()
{
int a;
for (int i = 0; i < 3; i++)
scanf("%d ", &a);
return 0;
}
When I input things like 1 2 3, it will ask me to input more, and I need to input something not ' '.
However, when I change it to (or other thing not ' ')
scanf("%d !", &a);
and input 1 ! 2! 3!, it will not ask more input.
The final space in scanf("%d ", &a); instructs scanf to consume all white space following the number. It will keep reading from stdin until you type something that is not white space. Simplify the format this way:
scanf("%d", &a);
scanf will still ignore white space before the numbers.
Conversely, the format "%d !" consumes any white space following the number and a single !. It stops scanning when it gets this character, or another non space character which it leaves in the input stream. You cannot tell from the return value whether it matched the ! or not.
scanf is very clunky, it is very difficult to use it correctly. It is often better to read a line of input with fgets() and parse that with sscanf() or even simpler functions such as strtol(), strspn() or strcspn().
scanf("%d", &a);
This should do the job.
Basically, scanf() consumes stdin input as much as its pattern matches. If you pass "%d" as the pattern, it will stop reading input after a integer is found. However, if you feed it with "%dx" for example, it matches with all integers followed by a character 'x'.
More Details:
Your pattern string could have the following characters:
Whitespace character: the function will read and ignore any whitespace
characters encountered before the next non-whitespace character
(whitespace characters include spaces, newline and tab characters --
see isspace). A single whitespace in the format string validates any
quantity of whitespace characters extracted from the stream (including
none).
Non-whitespace character, except format specifier (%): Any character that is not either a whitespace character (blank, newline or
tab) or part of a format specifier (which begin with a % character)
causes the function to read the next character from the stream,
compare it to this non-whitespace character and if it matches, it is
discarded and the function continues with the next character of
format. If the character does not match, the function fails, returning
and leaving subsequent characters of the stream unread.
Format specifiers: A sequence formed by an initial percentage sign (%) indicates a format specifier, which is used to specify the type
and format of the data to be retrieved from the stream and stored into
the locations pointed by the additional arguments.
Source: http://www.cplusplus.com/reference/cstdio/scanf/

Resources