Weird scanf behaviour when reading number and newline - c

I just realize this 'bug' of scanf now after 8 years with C.
Below scanf code will skip the leading whitespace characters from the second line of input.
int x;
char in[100];
scanf("%d\n",&x);
gets(in);
Input:
1
s
x will contain 1, but in will be just "s" not " s"
Is this standard C or just gcc behaviour?

A whitespace character in your scanf format string will cause scanf to consume any (and all) white space till a non-whitespace char occurs.
This seems to be standard scanf behaviour and is not limited to gcc.

Its not a Bug in scanf, the manual of scanf says,
A sequence of white-space characters (space, tab, newline, etc.; see
isspace(3)). This directive matches any amount of white space,
including none, in the input.
Which means any white space characters with directive as %d\n will read a number followed by consuming a sequence of white space characters in the input and only returns until you type a non white space character. That how you are able to see only "s" without a space before it.

The '\n' (and this is true for any whitespace character in the format string) in
scanf("%d\n", &x);
matches any number of whitespace characters in the input (characters for which isspace function returns 1, i.e, true, such as newline, space, tab etc.) and not just the newline character '\n'. This means that scanf will read all whitespace characters in the input and discard them till it encounters a non-whitespace character. This explains the behaviour you observed.
This is a part of the standard definition of the scanf function and not a gcc feature. Also, gets function is deprecated and unsafe. It does not check for buffer overrun and can lead to bugs and even program crash. In fact, gcc emits a warning against the use of gets on my machine. Use of fgets instead is recommended.
To do what you want, you can do the following:
int x;
char in[100];
scanf("%d", &x);
After scanf returns successfully, the input stream can contain any sequence of characters terminated by a newline depending on the input given by the user. Get rid of those extraneous characters before reading a string from the stdin.
char ch;
while((ch = getchar()) != '\n' || ch != EOF); // null statement
fgets(in, 100, stdin);
The above fgets call means that it will read at most 100-1 = 99 (it saves one character space for the terminating null byte which it adds to the buffer being read into before exiting) characters from the stream pointed to by stdin and store them in the buffer pointed to by in. fgets will exit if it encounters EOF, '\n' or it has already read 100-1 characters - whichever of the three condition occurs first. If it reads a newline, it will store it into the buffer.
Is the user enters 100 characters or more in this case, then the extraneous characters would be lying around in the input buffer which can mess up with the subsequent input operation of characters or strings by scanf, fgets, getchar etc. calls. You can check for this checking the length of the string in.
if(strlen(in) > 99) {
// extraneous chars lying around in the input buffer
// read and discard them
char ch;
while((ch = getchar()) != '\n' || ch != EOF); // null statement
}

Related

Difference between `"%[^\n]"` and `"%[^\n]\n"` and `"%[^\n]%*c"` in C how do they work?

I could not find the answer anywhere else.
%[^\n] - When I run this one, scanf is getting input and terminating after I press enter. ( Probably leaving \n in the input system)
%[^\n]\n - this one is getting the input but scanf is NOT terminating immediately after I press enter like the one above. I hit more enter and it makes more newlines. When I give a character and then press enter it finally terminates. Example:
int main(void)
{
char s[100];
scanf("%[^\n]\n", s);
printf("%s", s);
return 0;
}
The results:
Last one:
%[^\n]%*c - When I give some input and press enter. scanf immediately terminates.
How do those 3 work and how do they differ?
All 3 format begin with "%[^\n]".
"%[^\n]" is poor code2 that lacks a width limit and is susceptible to buffer overrun. Use a width limit like "%99[^\n]" with char s[100];.
"%[...]" does not consume leading whitespace like many other specifiers.
This specifier directs reading input until a '\n'1 is encountered. The '\n' is put back into stdin. All other characters are saved in the matching destination's array s.
If no characters were read (not counting the '\n'), the specifier fails and scanf() returns without changing s - rest of the format is not used. No null character appended.
If characters were read, they are saved and a null character is appended to s and scanning continues with the next portion of the format.
"\n" acts just like " ", "\t", "any_white_space_chracter" and reads and tosses 0 or more whitespace characters. It continues to do so until a non-white-space1 is read. That non-whitespace character is put back into stdin.
Given line buffered input, this means a line with non-whitespace following input is needed to see the next non-whitespace and allow scanf() to move on.
With scanf(), a "\n" at the end of a format is bad and caused OP's problem.
"%*c" reads 1 character1 and throws it away - even if it is not a '\n'. This specifier does not contribute to the return count due to the '*'.
A better alternative is to use fgets().
char s[100];
if (fgets(s, sizeof s, stdin)) {
s[strcspn(s, "\n")] = '\0'; // To lop off potential trailing \n
A lesser alternative is
char s[100] = { 0 };
scanf("%99[^\n]", s);
// With a separate scanf ....
scanf("%*1[\n]"); // read the next character if it is a \n and toss it.
1 ... or end-of-file or rare input error.
2 IMO, worse than gets().

Scanf clarification in c language

Is it possible to read an entire string including blank spaces like gets() function in scanf()?
I am able to do it using the gets() function.
char s[30];
gets(s);
This will read a line of characters. Can this be done in scanf()?
You can read a line, including blank spaces, with scanf(), but this function is subtle, and using it is very error-prone. Using the %[^\n] conversion specifier, you can tell scanf() to match characters to form a string, excluding '\n' characters. If you do this, you should specify a maximum field width. This width specifies the maximum number of characters to match, so you must leave room for the '\0' terminator.
It is possible that the first character in the input stream is a '\n'. In this case, scanf() would return a value of 0, since there were no matches before encountering the newline. But, nothing would be stored in s, so you may have undefined behavior. To avoid this, you can call scanf() first using the %*[\n] conversion specifier, discarding any leading '\n' characters.
After the string has been read, there will be additional characters in the input stream. At least a '\n' is present, and possibly more characters if the user entered more than the maximum field width specifies. You might then want to discard these extra characters so that they don't interfere with further inputs. The code below includes a loop to do this operation.
The first call to scanf() will consume all newline characters in the input stream until a non-newline character is encountered. While I believe that the second call to scanf() should always be successful, it is good practice to always check the return value of scanf() (which is the number of successful assignments made). I have stored this value in result, and check it before printing the string. If scanf() returns an unexpected result, an error message is printed.
It is better, and easier, to use fgets() to read entire lines. You must remember that fgets() keeps the trailing newline, so you may want to remove it. There is also a possibility that the user will enter more characters than the buffer will store, leaving the remaining characters in the input stream. You may want to remove these extra characters before prompting for more input.
Again, you should check the return value of fgets(); this function returns a pointer to the first element of the storage buffer, or a NULL pointer in the event of an error. The code below replaces any trailing newline character in the string, discards extra characters from the input stream, and prints the string only if the call to fgets() was successful. Otherwise, an error message is printed.
#include <stdio.h>
int main(void)
{
char s[30];
int result;
printf("Please enter a line of input:\n");
scanf("%*[\n]"); // throw away leading '\n' if present
result = scanf("%29[^\n]", s); // match up to 29 characters, excluding '\n'
/* Clear extra characters from input stream */
int c;
while ((c = getchar()) != '\n' && c != EOF)
continue; // discard extra characters
if (result == 1) {
puts(s);
} else {
fprintf(stderr, "EOF reached or error in scanf()\n");
}
printf("Please enter a line of input:\n");
char *ps = fgets(s, 30, stdin); // keeps '\n' character
if (ps) {
while (*ps && *ps != '\n') {
++ps;
}
if (*ps) { // replace '\n' with '\0'
*ps = '\0';
} else {
while ((c = getchar()) != '\n' && c != EOF)
continue; // discard extra characters
}
puts(s);
} else {
fprintf(stderr, "EOF reached or error in fgets()\n");
}
return 0;
}
Note that these two methods of getting a line of input are not exactly equivalent. The scanf() method, as written here, does not accept an empty line (i.e., a line consisting of only the '\n' character), but does accept lines consisting of other whitespace characters. The fscanf() method will accept an empty line as input.
Also, if it is acceptable to ignore leading whitespace characters, it would be simpler to follow the recommendation given by Jonathan Leffler in the comments to use only a single call to scanf():
result = scanf(" %29[^\n]", s);
This will ignore leading whitespace characters, including newlines.
Do not use scanf() or gets() function — use fgets() instead. But for the above question please find the answer.
int main() {
char a[30];
scanf ("%29[^\n]%*c", name);
printf("%s\n", a);
return 0;
}
Its also highly recommended like I told in the beginning to use fgets() instead. We clearly do not understand the weird requirement. I would have used the fgets() to read the character.
fgets(a, size(a), stdin);

What does %[^\n] mean in C?

What does %[^\n] mean in C?
I saw it in a program which uses scanf for taking multiple word input into a string variable. I don't understand though because I learned that scanf can't take multiple words.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char line[100];
scanf("%[^\n]",line);
printf("Hello,World\n");
printf("%s",line);
return 0;
}
[^\n] is a kind of regular expression.
[...]: it matches a nonempty sequence of characters from the scanset (a set of characters given by ...).
^ means that the scanset is "negated": it is given by its complement.
^\n: the scanset is all characters except \n.
Furthermore fscanf (and scanf) will read the longest sequence of input characters matching the format.
So scanf("%[^\n]", s); will read all characters until you reach \n (or EOF) and put them in s. It is a common idiom to read a whole line in C.
See also §7.21.6.2 The fscanf function.
scanf("%[^\n]",line); is a problematic way to read a line. It is worse than gets().
C defines line as:
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined.
The scanf("%[^\n]", line) has the specifier "%[^\n]". It scans for unlimited number of characters that match the scan-set ^\n. If none are read, the specifier fails and scanf() returns with line unaltered. If at least one character is read, all matching characters are read and saved and a null character is appended.
The scan-set ^\n implies all character that are not (due to the '^') '\n'.
'\n' is not read
scanf("%[^\n]",.... fails to read a new line character '\n'. It remains in stdin. The entire line is not read.
Buffer overflow
The below leads to undefined behavior (UB) should more than 99 characters get read.
char line[100];
scanf("%[^\n]",line); // buffer overflow possible
Does nothing on empty line
When the line consists of only "\n", scanf("%[^\n]",line); returns a 0 without setting line[] - no null character is appended. This can readily lead to undefined behavior should subsequent code use an uninitialized line[]. The '\n' remains in stdin.
Failure to check the return value
scanf("%[^\n]",line); assumes input succeeded. Better code would check the scanf() return value.
Recommendation
Do not use scanf() and instead use fgets() to read a line of input.
#define EXPECTED_INPUT_LENGTH_MAX 49
char line[EXPECTED_INPUT_LENGTH_MAX + 1 + 1 + 1];
// \n + \0 + extra to detect overly long lines
if (fgets(line, sizeof line, stdin)) {
size_t len = strlen(line);
// Lop off potential trailing \n if desired.
if (len > 0 && line[len-1] == '\n') {
line[--len] = '\0';
}
if (len > EXPECTED_INPUT_LENGTH_MAX) {
// Handle error
// Usually includes reading rest of line if \n not found.
}
The fgets() approach has it limitations too. e.g. (reading embedded null characters).
Handling user input, possible hostile, is challenging.
scanf("%[^\n]",line);
means: scan till \n or an enter key.
scanf("%[^\n]",line);
Will read user input until enter is pressed or a newline character is added (\n) and store it into a variable named line.
Question: what is %[^\n] mean in C?
Basically the \n command prints the output in the next line, but in
case of C gives the Null data followed by the above problem only.
Because of that to remove the unwanted data or null data, need to add
Complement/negotiated symbol[^\n]. It gives all characters until the next line
and keeps the data in the defined expression.
Means it is the Complemented data or rewritten data from the trash
EX:
char number[100]; //defined a character ex: StackOverflow
scanf("%[^\n]",number); //defining the number without this statement, the
character number gives the unwanted stuff `���`
printf("HI\n"); //normaly use of printf statement
printf("%s",number); //printing the output
return 0;

Why won't this scanf format-string work? "%[^\n]\n"

I've seen a few examples where people give scanf a "%[^\n]\n" format string to read a whole line of user input. If my understanding is correct, this will read every character until a newline character is reached, and then the newline is consumed by scanf (and not included in the resulting input).
But I can't get this to work on my machine. A simple example I've tried:
#include <stdio.h>
int main(void)
{
char input[64];
printf("Enter some input: ");
scanf("%[^\n]\n", input);
printf("You entered %s\n", input);
}
When I run this, I'm prompted for input, I type some characters, I hit Enter, and the cursor goes to the beginning of the next line but the scanf call doesn't finish.
I can hit Enter as many times as I like, and it will never finish.
The only ways I've found to conclude the scanf call are:
enter \n as the first (and only) character at the prompt
enter Ctrl-d as the first (and only) character at the prompt
enter some input, one or more \n, zero or more other characters, and enter Ctrl-d
I don't know if this is machine dependent, but I'm very curious to know what's going on. I'm on OS X, if that's relevant.
According to the documentation for scanf (emphasis mine):
The format string consists of whitespace characters (any single whitespace character in the format string consumes all available consecutive whitespace characters from the input), non-whitespace multibyte characters except % (each such character in the format string consumes exactly one identical character from the input) and conversion specifications.
Thus, your format string %[^\n]\n will first read (and store) an arbitrary number of non-whitespace characters from the input (because of the %[^\n] part) and then, because of the following newline, read (and discard) an arbitrary number of whitespace characters, such as spaces, tabs or newlines.
Thus, to make your scanf stop reading input, you either need to type at least one non-whitespace character after the newline, or else arrange for the input stream to end (e.g. by pressing Ctrl+D on Unix-ish systems).
Instead, to make your code work as you expect, just remove the last \n from the end of your format string (as already suggested by Umamahesh P).
Of course, this will leave the newline still in the input stream. To get rid of it (in case you want to read another line later), you can getc it off the stream, or just append %*c (which means "read one character and discard it") or even %*1[\n] (read one newline and discard it) to the end of your scanf format string.
Ps. Note that your code has a couple of other problems. For example, to avoid buffer overflow bugs, you really should use %63[^\n] instead of %[^\n] to limit the number of characters scanf will read into your buffer. (The limit needs to be one less than the size of your buffer, since scanf will always append a trailing null character.)
Also, the %[ format specifier always expects at least one matching character, and will fail if none is available. Thus, if you press enter immediately without typing anything, your scanf will fail (silently, since you don't check the return value) and will leave your input buffer filled with random garbage. To avoid this, you should a) check the return value of scanf, b) set input[0] = '\0' before calling scanf, or c) preferably both.
Finally, note that, if you just want to read input line by line, it's much easier to just use fgets. Yes, you'll need to strip the trailing newline character (if any) yourself if you don't want it, but that's still a lot easier and safer that trying to use scanf for a job it's not really meant for:
#include <stdio.h>
#include <string.h>
void chomp(char *string) {
int len = strlen(string);
if (len > 0 && string[len-1] == '\n') string[len-1] = '\0';
}
int main(void)
{
char input[64];
printf("Enter some input: ");
fgets(input, sizeof(input), stdin);
chomp(input);
printf("You entered \"%s\".\n", input);
}
Whitespace characters in format of scanf() has an special meaning:
Whitespace character: the function will read and ignore any whitespace
characters encountered before the next non-whitespace character
(whitespace characters include spaces, newline and tab characters --
see isspace). A single whitespace in the format string validates any
quantity of whitespace characters extracted from the stream (including
none).
Thus, "%[^\n]\n" is just equivalent to "%[^\n] ", telling scanf() to ignore all whitespace characters after %[^\n]. This is why all '\n's are ignored until a non-whitespace character is entered, which is happened in your case.
Reference: http://www.cplusplus.com/reference/cstdio/scanf/
Remove the the 2nd new line character and the following is sufficient.
scanf("%[^\n]", input);
To answer the original one,
scanf("%[^\n]\n", input);
This should also work, provided you enter a non white space character after the input. Example:
Enter some input: lkfjdlfkjdlfjdlfjldj
t
You entered lkfjdlfkjdlfjdlfjldj

scanf in a while loop reads on first iteration only

NOTE: Please notice this is not a duplicate of Why is scanf() causing infinite loop in this code? , I've already seen that question but the issue there is that he checks for ==0 instead of !=EOF. Also, his problem is different, the "infinite loop" there still waits for user input, it just does not exit.
I have the following while loop:
while ((read = scanf(" (%d,%d)\n", &src, &dst)) != EOF) {
if(read != 2 ||
src >= N || src < 0 ||
dst >= N || dst < 0) {
printf("invalid input, should be (N,N)");
} else
matrix[src][dst] = 1;
}
The intention of which is to read input in the format (int,int), to stop reading when EOF is read, and to try again if an invalid input is received.
The probelm is, that scanf works only for the first iteration, after that there is an infinite loop. The program does not wait for user input, it just keeps assuming that the last input is the same.
read, src, and dst are of type int.
I have looked at similar questions, but they seem to fail for checking if scanf returns 0 instead of checking for EOF, and the answers tells them to switch to EOF.
You need to use
int c;
while((c=getchar()) != '\n' && c != EOF);
at the end of the while loop in order to clear/flush the standard input stream(stdin). Why? The answer can be seen below:
The scanf with the format string(" (%d,%d)\n") you have requires the user to type
An opening bracket(()
A number(For the first %d)
A comma(,)
A number(For the last %d)
The space(First character of the format string of your scanf) and the newline character(\n which is the last character of the format string of your scanf) are considered to be whitespace characters. Lets see what the C11 standard has to say about whitespace characters in the format string of fscanf(Yes. I said fscanf because it is equivalent to scanf when the first argument is stdin):
7.21.6.2 The fscanf function
[...]
A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read. The directive never fails
So, all whitespace characters skips/discards all whitespace characters, if any, until the first non-whitespace character as seen in the quote above. This means that the space at the start of the format string of your scanf cleans all leading whitespace until the first non-whitespace character and the \n character does the same.
When you enter the right data as per the format string in the scanf, the execution of the scanf does not end. This is because the \n hadn't found a non-whitespace character in the stdin and will stop scanning only when it finds one. So, you have to remove it.
The next problem lies when the user types something else which is not as per the format string of the scanf. When this happens, scanf fails and returns. The rest of the data which caused the scanf to fail prevails in the stdin. This character is seen by the scanf when it is called the next time. This can also make the scanf fail. This causes an infinite loop.
To fix it, you have to clean/clear/flush the stdin in each iteration of the while loop using the method shown above.
scanf prompts the user for some input. Assuming the user does what's expected of them, they will type some digits, and they will hit the enter key.
The digits will be stored in the input buffer, but so will a newline character, which was added by the fact that they hit the enter key.
scanf will parse the digits to produce an integer, which it stores in the src variable. It stops at the newline character, which remains in the input buffer.
Later, second scanf which looks for a newline character in the input buffer. It finds one immediately, so it doesn't need to prompt the user for any more input.

Resources