scanf() behaviour for strings with more than one word - c

Well I've been programming in C for quite a while now, and there is this question about the function scanf()
here is my problem:
I know that every element in ASCII table is a character and I even know that %s is a data specified for a string which is a collection of characters
My questions:
1.why does scanf() stops scanning after we press enter. If enter is also character why cant it be added as a component of the string that is being scanned.
2.My second question and what I require the most is why does it stops scanning after a space, when space is again a character?
Note: My question is not about how to avoid these but how does this happen
I'd be happy if this is already addressed, I'd gladly delete my question and even if I've presumed something wrong please let me know

"why does scanf() stops scanning after we press enter." is not always true.
The "%s" directs scanf() as follows
char buffer[100];
scanf("%s", buffer);
Scan and consume all white-space including '\n' generated from multiple Enters. This data is not saved.
Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier C11dr §7.21.6.2 8
Scan and save all non-white-space characters. Continue doing so until a white-space is encountered.
Matches a sequence of non-white-space characters §7.21.6.2 12
This white-space is put back into stdin for the next input function. (OP's 2nd question)
A null character is appended to buffer.
Operations may stop short if EOF occurs.
If too much data is save in buffer, it is UB.
If some non-white-space data is saved, return 1. If EOF encountered, return EOF.
Note: stdin is usually line buffered, so no keyboard data is given to stdin until a '\n' occurs.

From my reading of your question, both of your numbered questions are the same:
Why does scanf with a format specifier of %s stop reading after encountering a space or newline.
And the answer to both of your questions is: Because that is what scanf with the %s format specifier is documented to do.
From the documentation:
%s Matches a sequence of bytes that are not white-space characters.
A space and a newline character (generated by the enter key) are white-space characters.

I made miniprogram with scanf for get multiple name without stop on space or ever enter.
i use while
Scanf("%s",text);
While (1)
{
Scanf("%s",text1)
If (text1=='.'){break;}
//here i simple add text1 to text
}
This way i get one line if use the .
Now i use
scanf("%[^\n]",text);
It work great.

Related

What does scanf("%*[\n] %[^\n]" do? [duplicate]

I am not able to understand the difference. I use %[^\n]s, for taking phrases input from the user. But this was not working when I needed to add two phrases. But the above one did. Please help me understanding me the difference.
The %[\n] directive tells scanf() to match newline characters, and the * flag signals that no assignment should be made, so %*[\n] skips over any leading newline characters (assuming there is at least one leading \n character: more on this in a moment). There is a space following this first directive, so zero or more whitespace characters are skipped before the final %[^\n] directive, which matches characters until a newline is encountered. These are stored in input_string[], and the newline character is left behind in the input stream. Subsequent calls using this format string will skip over this remaining newline character.
But, there is probably no need for the %*[\n] directive here, since \n is a whitespace character; almost the same thing could be accomplished with a leading space in the format string: " %[^\n]".
One difference between the two: "%*[\n] %[^\n]" expects there to be a newline at the beginning of the input, and without this the match fails and scanf() returns without making any assignments, while " %[^\n]" does not expect a leading newline, or even a leading whitespace character (but skips them if present).
If you used "%[^\n]" instead, as suggested in the body of the question (note that the trailing s is not a part of the scanset directive), the first call to scanf() would match characters until a newline is encountered. The matching characters would be stored in input_string[], and the newline would remain in the input stream. Then, if scanf() is called again with this format string, no characters would be matched before encountering the newline, so the match would fail without assignment.
Please note that you should always specify a maximum width when using %s or %[] in a scanf() format string to avoid buffer overflow. With either of %s or %[], scanf() automatically adds the \0 terminator, so you must be sure to allow room for this. For an array of size 100, the maximum width should be 99, so that at most 99 characters are matched and stored in the array before the null terminator is added. For example: " %99[^\n]".
In scanf function, '*' tells the function to ignore a character from input.
%*[\n]
This tells the function to ignore the first '\n' character and then accept any string
Run the code and first give "ENTER" as input and then give "I am feeling great!!!"
Now print the buffer. You will get I am feeling great!!! as output
Try this code snippet
int main()
{
char buffer[100];
printf("Enter a string:"),scanf("%*[\n] %[^\n]', buffer),printf("buffer:%s\n", buffer);
return 0;
}
%[^\n] is an edit conversion code for scanf() as an alternative of gets(str).
Unlike gets(str), scanf() with %s cannot read more than one word.
Using %[^\n], scanf() can read even the string with whitespace.
It will terminate receiving string input from the user when it encounters a newline character.

What does scanf("%*[\n] %[^\n]", input_string); do?

I am not able to understand the difference. I use %[^\n]s, for taking phrases input from the user. But this was not working when I needed to add two phrases. But the above one did. Please help me understanding me the difference.
The %[\n] directive tells scanf() to match newline characters, and the * flag signals that no assignment should be made, so %*[\n] skips over any leading newline characters (assuming there is at least one leading \n character: more on this in a moment). There is a space following this first directive, so zero or more whitespace characters are skipped before the final %[^\n] directive, which matches characters until a newline is encountered. These are stored in input_string[], and the newline character is left behind in the input stream. Subsequent calls using this format string will skip over this remaining newline character.
But, there is probably no need for the %*[\n] directive here, since \n is a whitespace character; almost the same thing could be accomplished with a leading space in the format string: " %[^\n]".
One difference between the two: "%*[\n] %[^\n]" expects there to be a newline at the beginning of the input, and without this the match fails and scanf() returns without making any assignments, while " %[^\n]" does not expect a leading newline, or even a leading whitespace character (but skips them if present).
If you used "%[^\n]" instead, as suggested in the body of the question (note that the trailing s is not a part of the scanset directive), the first call to scanf() would match characters until a newline is encountered. The matching characters would be stored in input_string[], and the newline would remain in the input stream. Then, if scanf() is called again with this format string, no characters would be matched before encountering the newline, so the match would fail without assignment.
Please note that you should always specify a maximum width when using %s or %[] in a scanf() format string to avoid buffer overflow. With either of %s or %[], scanf() automatically adds the \0 terminator, so you must be sure to allow room for this. For an array of size 100, the maximum width should be 99, so that at most 99 characters are matched and stored in the array before the null terminator is added. For example: " %99[^\n]".
In scanf function, '*' tells the function to ignore a character from input.
%*[\n]
This tells the function to ignore the first '\n' character and then accept any string
Run the code and first give "ENTER" as input and then give "I am feeling great!!!"
Now print the buffer. You will get I am feeling great!!! as output
Try this code snippet
int main()
{
char buffer[100];
printf("Enter a string:"),scanf("%*[\n] %[^\n]', buffer),printf("buffer:%s\n", buffer);
return 0;
}
%[^\n] is an edit conversion code for scanf() as an alternative of gets(str).
Unlike gets(str), scanf() with %s cannot read more than one word.
Using %[^\n], scanf() can read even the string with whitespace.
It will terminate receiving string input from the user when it encounters a newline character.

Why won't this scanf format-string work? "%[^\n]\n"

I've seen a few examples where people give scanf a "%[^\n]\n" format string to read a whole line of user input. If my understanding is correct, this will read every character until a newline character is reached, and then the newline is consumed by scanf (and not included in the resulting input).
But I can't get this to work on my machine. A simple example I've tried:
#include <stdio.h>
int main(void)
{
char input[64];
printf("Enter some input: ");
scanf("%[^\n]\n", input);
printf("You entered %s\n", input);
}
When I run this, I'm prompted for input, I type some characters, I hit Enter, and the cursor goes to the beginning of the next line but the scanf call doesn't finish.
I can hit Enter as many times as I like, and it will never finish.
The only ways I've found to conclude the scanf call are:
enter \n as the first (and only) character at the prompt
enter Ctrl-d as the first (and only) character at the prompt
enter some input, one or more \n, zero or more other characters, and enter Ctrl-d
I don't know if this is machine dependent, but I'm very curious to know what's going on. I'm on OS X, if that's relevant.
According to the documentation for scanf (emphasis mine):
The format string consists of whitespace characters (any single whitespace character in the format string consumes all available consecutive whitespace characters from the input), non-whitespace multibyte characters except % (each such character in the format string consumes exactly one identical character from the input) and conversion specifications.
Thus, your format string %[^\n]\n will first read (and store) an arbitrary number of non-whitespace characters from the input (because of the %[^\n] part) and then, because of the following newline, read (and discard) an arbitrary number of whitespace characters, such as spaces, tabs or newlines.
Thus, to make your scanf stop reading input, you either need to type at least one non-whitespace character after the newline, or else arrange for the input stream to end (e.g. by pressing Ctrl+D on Unix-ish systems).
Instead, to make your code work as you expect, just remove the last \n from the end of your format string (as already suggested by Umamahesh P).
Of course, this will leave the newline still in the input stream. To get rid of it (in case you want to read another line later), you can getc it off the stream, or just append %*c (which means "read one character and discard it") or even %*1[\n] (read one newline and discard it) to the end of your scanf format string.
Ps. Note that your code has a couple of other problems. For example, to avoid buffer overflow bugs, you really should use %63[^\n] instead of %[^\n] to limit the number of characters scanf will read into your buffer. (The limit needs to be one less than the size of your buffer, since scanf will always append a trailing null character.)
Also, the %[ format specifier always expects at least one matching character, and will fail if none is available. Thus, if you press enter immediately without typing anything, your scanf will fail (silently, since you don't check the return value) and will leave your input buffer filled with random garbage. To avoid this, you should a) check the return value of scanf, b) set input[0] = '\0' before calling scanf, or c) preferably both.
Finally, note that, if you just want to read input line by line, it's much easier to just use fgets. Yes, you'll need to strip the trailing newline character (if any) yourself if you don't want it, but that's still a lot easier and safer that trying to use scanf for a job it's not really meant for:
#include <stdio.h>
#include <string.h>
void chomp(char *string) {
int len = strlen(string);
if (len > 0 && string[len-1] == '\n') string[len-1] = '\0';
}
int main(void)
{
char input[64];
printf("Enter some input: ");
fgets(input, sizeof(input), stdin);
chomp(input);
printf("You entered \"%s\".\n", input);
}
Whitespace characters in format of scanf() has an special meaning:
Whitespace character: the function will read and ignore any whitespace
characters encountered before the next non-whitespace character
(whitespace characters include spaces, newline and tab characters --
see isspace). A single whitespace in the format string validates any
quantity of whitespace characters extracted from the stream (including
none).
Thus, "%[^\n]\n" is just equivalent to "%[^\n] ", telling scanf() to ignore all whitespace characters after %[^\n]. This is why all '\n's are ignored until a non-whitespace character is entered, which is happened in your case.
Reference: http://www.cplusplus.com/reference/cstdio/scanf/
Remove the the 2nd new line character and the following is sufficient.
scanf("%[^\n]", input);
To answer the original one,
scanf("%[^\n]\n", input);
This should also work, provided you enter a non white space character after the input. Example:
Enter some input: lkfjdlfkjdlfjdlfjldj
t
You entered lkfjdlfkjdlfjdlfjldj

What is the difference between these two scanf statements?

I am having some doubt. The doubt is
What is the difference between the following two scanf statements.
scanf("%s",buf);
scanf("%[^\n]", buf);
If I am giving the second scanf in the while loop, it is going infinitely. Because the \n is in the stdin.
But in the first statement, reads up to before the \n. It also will not read the \n.
But The first statement does not go in infinitely. Why?
Regarding the properties of the %s format specifier, quoting C11 standrad, chapter §7.21.6.2, fscanf()
s Matches a sequence of non-white-space characters.
The newline is a whitespace character, so only a newlinew won't be a match for %s.
So, in case the newline is left in the buffer, it does not scan the newline alone, and wait for the next non-whitespace input to appear on stdin.
The %s format specifier specifies that scanf() should read all characters in the standard input buffer stdin until it encounters the first whitespace character, and then stop there. The whitespace ('\n') remains in the stdin buffer until consumed by another function, like getchar().
In the second case there is no mention of stopping.
You can think of scanf as extracting words separated by whitespace from a stream of characters. Imagine reading a file which contains a table of numbers, for example, without worrying about the exact number count per line or the exact space count and nature between numbers.
Whitespace, for the record, is horizontal and vertical (these exist) tabs, carriage returns, newlines, form feeds and last not least, actual spaces.
In order to free the user from details, scanf treats all whitespace the same: It normally skips it until it hits a non-whitespace and then tries to convert the character sequence starting there according to the specified input conversion. E.g. with "%d" it expects a sequence of digits, perhaps preceded by a minus sign.
The input conversion "%s" also starts with skipping whitespace (and that's clearer documented in the opengroup's man page than in the Linux one).
After skipping leading whitespace, "%s" accepts everything until another whitespace is read (and put back in the input, because it isn't made part of the "word" being read). That sequence of non-whitespace chars -- basically a "word" -- is stored in the buffer provided. For example, scanning a string from " a bc " results in skipping 3 spaces and storing "a" in the buffer. (The next scanf would skip the intervening space and put "bc" in the buffer. The next scanf after that would skip the remaining whitespace, encounter the end of file and return EOF.) So if a user is asked to enter three words they could give three words on one line or on three lines or on any number of lines preceded or separated by any number of empty lines, i.e. any number of subsequent newlines. Scanf couldn't care less.
There are a few exceptions to the "skip leading whitespace" strategy. Both concern conversions which usually indicate that the user wants to have more control about the input conversion. One of them is "%c" which just reads the next character. The other one is the "%[" spec which details exactly which characters are considered part of the next "word" to read. The conversion specification you use, "%[^\n]", reads everything except newline. Input from the keyboard is normally passed to a program line by line, and each line is by definition terminated by a newline. The newline of the first line passed to your program will be the first character from the input stream which does not match the conversion specification. Scanf will read it, inspect it and then put it back in the input stream (with ungetc()) for somebody else to consume. Unfortunately, it will itself be the next consumer, in another loop iteration (as I assume). Now the very first character it encounters (the newline) does not match the input conversion (which demands anything but the newline). Scanf therefore gives up immediately, puts the offending character dutifully back in the input for somebody else to consume and returns 0 indicating the failure to even perfom the very first conversion in the format string. But alas, it itself will be the next consumer. Yes, machines are stupid.
First scanf("%s",buf); scan only word or string, but second scanf("%[^\n]", buf); reads a string until a user inputs is new line character.
Let's take a look at these two code snippets :
#include <stdio.h>
int main(void){
char sentence[20] = {'\0'};
scanf("%s", sentence);
printf("\n%s\n", sentence);
return 0;
}
Input : Hello, my name is Claudio.
Output : Hello
#include <stdio.h>
int main(void){
char sentence[20] = {'\0'};
scanf("%[^\n]", sentence);
printf("\n%s\n", sentence);
return 0;
}
Input : Hello, my name is Claudio.
Output : Hello, my name is Claudio.
%[^\n] is an inverted group scan and this is how I personally use it, as it allows me to input a sentece with blank spaces in it.
Common
Both expect buf to be a pointer to a character array. Both append a null character to that array if at least 1 character was saved. Both return 1 if something was saved. Both return EOF if end-of-file detected before saving anything. Both return EOF in input error is detected. Both may save buf with embedded '\0' characters in it.
scanf("%s",buf);
scanf("%[^\n]", buf);
Differences
"%s" 1) consumes and discards leading white-space including '\n', space, tab, etc. 2) then saves non-white-space to buf until 3) a white-space is detected (which is then put back into stdin). buf will not contain any white-space.
"%[^\n]" 1) does not consume and discards leading white-space. 2) it saves non-'\n' characters to buf until 3) a '\n' is detected (which is then put back into stdin). If the first character read is a '\n', then nothing is saved in buf and 0 is returned. The '\n' remains in stdin and explains OP's infinite loop.
Failure to test the return value of scanf() is a common code oversight. Better code checks the return value of scanf().
IMO: code should never use either:
Both fail to limit the number of characters read. Use fgets().
You can think of %s as %[^\n \t\f\r\v], that is, after skipping any leading whitespace, a group a non-whitespace characters.

what is the purpose of putting a space in scanf like this scanf(" %c",&ch) in place of scanf("%c",&ch)? [duplicate]

This question already has answers here:
What does space in scanf mean? [duplicate]
(6 answers)
Closed 7 years ago.
what is the purpose of putting a space in scanf like this
scanf(" %c",&ch)
in place of
scanf("%c",&ch)?
Also what is input buffer in fflush(stdin)?
Because the space before %c ignores all whitespace. *scanf family of functions ignore all whitespace before any % by default except for %c, %[ and %n. This is mentioned in C11 at:
7.21.6.2.8
Input white-space characters (as specified by the isspace function) are skipped, unless
the specification includes a [, c, or n specifier.
To be complete, here's the part that says all whitespace will be ignored:
7.21.6.2.5
A directive composed of white-space character(s) is executed by reading input up to the
first non-white-space character (which remains unread), or until no more characters can
be read. The directive never fails.
Regarding your second question, fflush(stdin) causes undefined behavior and must not be used (emphasis mine):
7.21.5.2.2
If stream points to an output stream or an update stream in which the most recent
operation was not input, the fflush function causes any unwritten data for that stream
to be delivered to the host environment to be written to the file; otherwise, the behavior is undefined.
what is the purpose of putting a space in scanf like this scanf(" %c",&ch) in place of scanf("%c",&ch)?
So that scanf would ignore all spaces before the first non-space character is encountered in the stream.
Also what is input buffer in fflush(stdin)?
What you input into the console will exist in the stdin stream.
Don't flush that stream however, it's undefined behavior.
If you want to discard characters entered after scanf is called, you can read and discard them.
The space in the scanf in this case tells scanf to ignore any leading whitespace characters in front of the character you read. Still even if there is no whitespace in front of the character the code will work and read the character successfully.
I am not sure what you are asking in your last question, but stdin is the standard input stream for you program.
scanf(" %c",&ch);
As per the man page,
White space (such as blanks,
tabs, or newlines) in the format string match any amount of white space,
including none, in the input.
Stdin is standard input.The user enters the data for the program, this is first stored in a buffer and then when the program requests data transfers by use of the read operation the data is made available to the program. (using scanf etc).
I had the same problem a while ago in which if I would try to read a variable using scanf ("%c", &ans); it would not read anything. Thus I figured out that the \n character from the last input was being read.
Thus, doing scanf (" %c", &ans); solved my problem.
Although, I could not understand your second question clearly.
Just to give a space from the last object, if not, for example a string, everything will be together with no spaces between them.

Resources