What does %[^\n] mean in C? - c

What does %[^\n] mean in C?
I saw it in a program which uses scanf for taking multiple word input into a string variable. I don't understand though because I learned that scanf can't take multiple words.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char line[100];
scanf("%[^\n]",line);
printf("Hello,World\n");
printf("%s",line);
return 0;
}

[^\n] is a kind of regular expression.
[...]: it matches a nonempty sequence of characters from the scanset (a set of characters given by ...).
^ means that the scanset is "negated": it is given by its complement.
^\n: the scanset is all characters except \n.
Furthermore fscanf (and scanf) will read the longest sequence of input characters matching the format.
So scanf("%[^\n]", s); will read all characters until you reach \n (or EOF) and put them in s. It is a common idiom to read a whole line in C.
See also §7.21.6.2 The fscanf function.

scanf("%[^\n]",line); is a problematic way to read a line. It is worse than gets().
C defines line as:
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined.
The scanf("%[^\n]", line) has the specifier "%[^\n]". It scans for unlimited number of characters that match the scan-set ^\n. If none are read, the specifier fails and scanf() returns with line unaltered. If at least one character is read, all matching characters are read and saved and a null character is appended.
The scan-set ^\n implies all character that are not (due to the '^') '\n'.
'\n' is not read
scanf("%[^\n]",.... fails to read a new line character '\n'. It remains in stdin. The entire line is not read.
Buffer overflow
The below leads to undefined behavior (UB) should more than 99 characters get read.
char line[100];
scanf("%[^\n]",line); // buffer overflow possible
Does nothing on empty line
When the line consists of only "\n", scanf("%[^\n]",line); returns a 0 without setting line[] - no null character is appended. This can readily lead to undefined behavior should subsequent code use an uninitialized line[]. The '\n' remains in stdin.
Failure to check the return value
scanf("%[^\n]",line); assumes input succeeded. Better code would check the scanf() return value.
Recommendation
Do not use scanf() and instead use fgets() to read a line of input.
#define EXPECTED_INPUT_LENGTH_MAX 49
char line[EXPECTED_INPUT_LENGTH_MAX + 1 + 1 + 1];
// \n + \0 + extra to detect overly long lines
if (fgets(line, sizeof line, stdin)) {
size_t len = strlen(line);
// Lop off potential trailing \n if desired.
if (len > 0 && line[len-1] == '\n') {
line[--len] = '\0';
}
if (len > EXPECTED_INPUT_LENGTH_MAX) {
// Handle error
// Usually includes reading rest of line if \n not found.
}
The fgets() approach has it limitations too. e.g. (reading embedded null characters).
Handling user input, possible hostile, is challenging.

scanf("%[^\n]",line);
means: scan till \n or an enter key.

scanf("%[^\n]",line);
Will read user input until enter is pressed or a newline character is added (\n) and store it into a variable named line.

Question: what is %[^\n] mean in C?
Basically the \n command prints the output in the next line, but in
case of C gives the Null data followed by the above problem only.
Because of that to remove the unwanted data or null data, need to add
Complement/negotiated symbol[^\n]. It gives all characters until the next line
and keeps the data in the defined expression.
Means it is the Complemented data or rewritten data from the trash
EX:
char number[100]; //defined a character ex: StackOverflow
scanf("%[^\n]",number); //defining the number without this statement, the
character number gives the unwanted stuff `���`
printf("HI\n"); //normaly use of printf statement
printf("%s",number); //printing the output
return 0;

Related

Difference between `"%[^\n]"` and `"%[^\n]\n"` and `"%[^\n]%*c"` in C how do they work?

I could not find the answer anywhere else.
%[^\n] - When I run this one, scanf is getting input and terminating after I press enter. ( Probably leaving \n in the input system)
%[^\n]\n - this one is getting the input but scanf is NOT terminating immediately after I press enter like the one above. I hit more enter and it makes more newlines. When I give a character and then press enter it finally terminates. Example:
int main(void)
{
char s[100];
scanf("%[^\n]\n", s);
printf("%s", s);
return 0;
}
The results:
Last one:
%[^\n]%*c - When I give some input and press enter. scanf immediately terminates.
How do those 3 work and how do they differ?
All 3 format begin with "%[^\n]".
"%[^\n]" is poor code2 that lacks a width limit and is susceptible to buffer overrun. Use a width limit like "%99[^\n]" with char s[100];.
"%[...]" does not consume leading whitespace like many other specifiers.
This specifier directs reading input until a '\n'1 is encountered. The '\n' is put back into stdin. All other characters are saved in the matching destination's array s.
If no characters were read (not counting the '\n'), the specifier fails and scanf() returns without changing s - rest of the format is not used. No null character appended.
If characters were read, they are saved and a null character is appended to s and scanning continues with the next portion of the format.
"\n" acts just like " ", "\t", "any_white_space_chracter" and reads and tosses 0 or more whitespace characters. It continues to do so until a non-white-space1 is read. That non-whitespace character is put back into stdin.
Given line buffered input, this means a line with non-whitespace following input is needed to see the next non-whitespace and allow scanf() to move on.
With scanf(), a "\n" at the end of a format is bad and caused OP's problem.
"%*c" reads 1 character1 and throws it away - even if it is not a '\n'. This specifier does not contribute to the return count due to the '*'.
A better alternative is to use fgets().
char s[100];
if (fgets(s, sizeof s, stdin)) {
s[strcspn(s, "\n")] = '\0'; // To lop off potential trailing \n
A lesser alternative is
char s[100] = { 0 };
scanf("%99[^\n]", s);
// With a separate scanf ....
scanf("%*1[\n]"); // read the next character if it is a \n and toss it.
1 ... or end-of-file or rare input error.
2 IMO, worse than gets().

Scanf clarification in c language

Is it possible to read an entire string including blank spaces like gets() function in scanf()?
I am able to do it using the gets() function.
char s[30];
gets(s);
This will read a line of characters. Can this be done in scanf()?
You can read a line, including blank spaces, with scanf(), but this function is subtle, and using it is very error-prone. Using the %[^\n] conversion specifier, you can tell scanf() to match characters to form a string, excluding '\n' characters. If you do this, you should specify a maximum field width. This width specifies the maximum number of characters to match, so you must leave room for the '\0' terminator.
It is possible that the first character in the input stream is a '\n'. In this case, scanf() would return a value of 0, since there were no matches before encountering the newline. But, nothing would be stored in s, so you may have undefined behavior. To avoid this, you can call scanf() first using the %*[\n] conversion specifier, discarding any leading '\n' characters.
After the string has been read, there will be additional characters in the input stream. At least a '\n' is present, and possibly more characters if the user entered more than the maximum field width specifies. You might then want to discard these extra characters so that they don't interfere with further inputs. The code below includes a loop to do this operation.
The first call to scanf() will consume all newline characters in the input stream until a non-newline character is encountered. While I believe that the second call to scanf() should always be successful, it is good practice to always check the return value of scanf() (which is the number of successful assignments made). I have stored this value in result, and check it before printing the string. If scanf() returns an unexpected result, an error message is printed.
It is better, and easier, to use fgets() to read entire lines. You must remember that fgets() keeps the trailing newline, so you may want to remove it. There is also a possibility that the user will enter more characters than the buffer will store, leaving the remaining characters in the input stream. You may want to remove these extra characters before prompting for more input.
Again, you should check the return value of fgets(); this function returns a pointer to the first element of the storage buffer, or a NULL pointer in the event of an error. The code below replaces any trailing newline character in the string, discards extra characters from the input stream, and prints the string only if the call to fgets() was successful. Otherwise, an error message is printed.
#include <stdio.h>
int main(void)
{
char s[30];
int result;
printf("Please enter a line of input:\n");
scanf("%*[\n]"); // throw away leading '\n' if present
result = scanf("%29[^\n]", s); // match up to 29 characters, excluding '\n'
/* Clear extra characters from input stream */
int c;
while ((c = getchar()) != '\n' && c != EOF)
continue; // discard extra characters
if (result == 1) {
puts(s);
} else {
fprintf(stderr, "EOF reached or error in scanf()\n");
}
printf("Please enter a line of input:\n");
char *ps = fgets(s, 30, stdin); // keeps '\n' character
if (ps) {
while (*ps && *ps != '\n') {
++ps;
}
if (*ps) { // replace '\n' with '\0'
*ps = '\0';
} else {
while ((c = getchar()) != '\n' && c != EOF)
continue; // discard extra characters
}
puts(s);
} else {
fprintf(stderr, "EOF reached or error in fgets()\n");
}
return 0;
}
Note that these two methods of getting a line of input are not exactly equivalent. The scanf() method, as written here, does not accept an empty line (i.e., a line consisting of only the '\n' character), but does accept lines consisting of other whitespace characters. The fscanf() method will accept an empty line as input.
Also, if it is acceptable to ignore leading whitespace characters, it would be simpler to follow the recommendation given by Jonathan Leffler in the comments to use only a single call to scanf():
result = scanf(" %29[^\n]", s);
This will ignore leading whitespace characters, including newlines.
Do not use scanf() or gets() function — use fgets() instead. But for the above question please find the answer.
int main() {
char a[30];
scanf ("%29[^\n]%*c", name);
printf("%s\n", a);
return 0;
}
Its also highly recommended like I told in the beginning to use fgets() instead. We clearly do not understand the weird requirement. I would have used the fgets() to read the character.
fgets(a, size(a), stdin);

Why %[^\n]s does not work in loop?

I was writing a code to take a space-separated string input for a multiple time in between a loop. I saw that %[^\n]s did not work in loop but %[^\n]%*c did. My question is why %[^\n]sdid not work. Here is my code:
#include<stdio.h>
main(){
while(1){
char str[10];
scanf("%[^\n]s",str);
printf("%s\n",str);
}
return 0;
}
The format specifier %[^\n] means "read a string containing any characters until a newline is found". The newline itself is not consumed. When the newline is found, it is left on the input stream for the next conversion.
The format specifier %*c means "read exactly one character and discard it".
So the combination
scanf( "%[^\n]%*c", str );
means "read a string up to the newline character, put the string into the memory that str points to, and then discard the newline".
Given the format %[^\n]s, the s is not part of the conversion specifier. That format says "read characters until a newline is found, and then the next character should be an s". But next character will never be an s, because the next character will always be the newline. So the s in that format serves no purpose.
%[^\n] does not look for space-separated strings. It reads the whole line up until (but not including) the newline.
If your line contains more than 9 characters this causes undefined behaviour by overflowing the buffer.
The buffer overflow could be prevented by writing %9[^\n] but then you have a new issue: on a longer line, %*c would discard the 10th character and the next scan would keep reading from that same line.
Another complicating factor is that if your file contains a blank line, then %[ considers it a matching failure. That means scanf stops, so it does not go on to process %*c. In this case, the newline is not consumed, and the output buffer is not written to at all.
Your code also goes into an infinite loop because you never break out of while(1).
This code shows correct use of ^[:
while (1)
{
str[0] = '\0'; // In case of matching failure
scanf("%9[^\n]", str); // read as much of the line as we can
scanf("%*[^\n]"); // discard the rest of the line
if ( getchar() == EOF ) // discard the newline
break; // exit loop when we finished the input
printf("%s\n", str);
}
If you really do want to read space-separated text then you could use %9s instead of %9[^\n] in my above example. That introduces a new difference though: %s skips over a blank line.
If that's OK then that's OK. If you want to not skip blank lines, then you could use my code above, but add in at the end:
char *p = strchr(str, ' ');
if ( p )
*p = '\0';
Robust string input is difficult in C!

Why fgets takes cursor to next line?

I have taken a string from the keyboard using the fgets() function. However, when I print the string using printf(), the cursor goes to a new line.
Below is the code.
#include<stdio.h>
int main()
{
char name[25];
printf("Enter your name: ");
fgets(name, 24, stdin);
printf("%s",name);
return 0;
}
And below is the output.
-bash-4.1$ ./a.out
Enter your name: NJACK1 HERO
NJACK1 HERO
-bash-4.1$
Why is the cursor going to the next line even though I have not added a \n in the printf()?
However, I have noticed that if I read a string using scanf(), and then print it using printf() (without using \n), the cursor does not go to next line.
Does fgets() append a \n in the string ? If it does, will it append \0 first then \n, or \n first and then \0?
The reason printf is outputting a newline is that you have one in your string.
fgets is not "adding" a newline --- it is simply reading it from the input as well. Reading for fgets stops just after the newline (if any).
Excerpt from the manpage, emphasis mine:
The fgets() function reads at most one less than the number of characters specified by size from the given stream and stores them in the string str. Reading stops when a newline character is found, at end-of-file or error. The newline, if any, is retained. If any characters are read and there is no error, a `\0' character is appended to end the string.
An easy way to check if there's a newline is to use the help of one of my favorite little-known functions --- strcspn():
size_t newline_pos = strcspn(name, "\r\n");
if(name[newline_pos])
{
/* we had a newline, so name is complete; do whatever you want here */
//...
/* if this is the only thing you do
you do *not* need the `if` statement above (just this line) */
name[newline_pos] = 0;
}
else
{
/* `name` was truncated (the line was longer than 24 characters) */
}
Or, as an one-liner:
// WARNING: This means you have no way of knowing if the name was truncated!
name[strcspn(name, "\r\n")] = 0;
Because if there is a '\n' in the read text it will be taken by fgets(), the following was extracted from the 1570 draft §7.21.7.2 ¶ 2
The fgets function reads at most one less than the number of characters specified by n
from the stream pointed to by stream into the array pointed to by s. No additional
characters are read after a new-line character (which is retained) or after end-of-file. A
null character is written immediately after the last character read into the array.
I highlighted by making bold the part which says that the '\n' is kept by fgets().

What is the behavior of %(limit)[^\n] in scanf ? It is safety from overflow?

The format %(limit)[^\n] for scanf function is unsafe ? (where (limit) is the length -1 of the string)
If it is unsafe, why ?
And there is a safe way to implement a function that catch strings just using scanf() ?
On Linux Programmer's Manual, (typing man scanf on terminal), the s format said:
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'),which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
The input string stops at maximum field width always ? Or is just on GCC ?
Thanks.
%(limit)[^\n] for scanf" is usually safe.
In the below example, at most 99 char will be read and saved into buf. If any char are saved, a '\0' will be appended and cnt will be 1.
char buf[100];
int cnt = scanf("%99[^\n]", buf);
This functionality is certainly safe, but what about others?
Problems occur when the input is a lone "\n".
In this case, nothing is saved in buf and 0 is returned. Had the next line of code been the following, the output is Undefined Behavior as buf is not initialized to anything.
puts(buf);
A better following line would be
if (cnt == 1) puts(buf);
else printf("Return count = %d\n", cnt);
Problems because the '\n' was not consumed.
The '\n' is still waiting to be read and another call to scanf("%99[^\n]", buf); will not read the '\n'.
Q: is a safe way to implement a function that catch strings just using scanf()
A: Pedantically: Not easily.
scanf(), fgets(), etc. are best used for reading text, not strings. In C a string is an array of char terminated with a '\0'. Input via scanf(), fgets(), etc. typically have issues reading '\0' and typically that char is not in the input anyways. Usually input is thought of as groups of char terminated by '\n' or other white-space.
If code is reading input terminated with '\n', using fgets() works well and is portable. fgets() too has it weakness that are handled in various ways . getline() is a nice alternative.
A close approximate would be scanf(" %99[^\n]", buf) (note the added " "), but alone that does not solve handing excessive long lines, reading multiple empty lines, embedded '\0' detection, loss of ability to report length read (strlen() does not work due to embedded '\0') and its leaving the trailing '\n' in stdin.
Short of using scanf("%c", &ch) with lots of surrounding code (which is silly, just use fgetc()) , I see no way to use a single scanf() absolutely safely when reading a line of user input.
Q: The input string stops at maximum field width always ?
A: With scanf("%99[^\n]", input stops 1) when a '\n' is encountered - the '\n' is not saved and remains in the file input buffer 2) 99 char have been read 3) EOF occurs or 4) IO error occurs (rare).
The [^\n] is to make scanf read input until it meets a new line character...while the limit is the maximum number of characters scanf should read...

Resources