Get multiple words without getting \n in C language - c

How do you scanf several words (with spaces in between and an arbitrary number) into a string and not get the '\n' character at the end? I know similar questions have been asked, but none of them gave a really satisfying answer. I hope to get an answer that achieves this in one statement.

char buffer[256];
if (scanf(" %255[^\n]", buffer) != 1)
…oops — EOF or something dramatically awry…
The scan set doesn't skip leading white space (neither does %c or %n), so I added the leading blank to skip leading white space. If you want the leading spaces too, drop that space in the format string, but the onus is on you to ensure that the next character in the input is not a newline (which it often will be if you've just read a number, for example). The conversion (scan set) stops when a newline is reached, or at EOF, or when 255 characters have been read. You could add %*[\n] to read the newline if the next character is a newline. You won't ever know whether that matched or not, though. If you must know, you need:
char buffer[256];
char nl[2];
int rc;
if ((rc = scanf(" %255[^\n]%1[\n]", buffer, nl)) <= 0) /* width 1: nl[2] holds one newline plus '\0' */
…oops — EOF or something dramatically awry…
else if (rc == 1)
…no newline — presumably the input line was longer than 255 characters…
else
…data in buffer is a complete line except for the newline, but the newline was read…
Note the use of 255 vs 256 — that is not an accident but is 100% necessary.
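To see the three outcomes without typing input by hand, the same format can be run against in-memory strings with sscanf. This is only a sketch: classify_line is a hypothetical name, and the width 1 on the second scan set is an added precaution so that nl[2] cannot overflow when several newlines follow the text.

```c
#include <stdio.h>

/* Hypothetical helper: applies the format above to a string instead of stdin.
 * Returns 2 for a complete line (newline consumed), 1 for text with no
 * newline after it, and a value <= 0 when there is nothing to read. */
int classify_line(const char *src, char buffer[256])
{
    char nl[2];

    /* 255 = sizeof buffer - 1: the scan set also stores a terminating '\0'. */
    return sscanf(src, " %255[^\n]%1[\n]", buffer, nl);
}
```

For example, classify_line("  hello world\nnext", buffer) should return 2 and leave "hello world" in buffer, because the leading blank in the format skips the two spaces.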

Related

Why is this creating two inputs instead of one

https://i.imgur.com/FLxF9sP.png
As shown in the link above I have to input '<' twice instead of once, why is that? Also it seems that the first input is ignored but the second '<' is the one the program recognizes.
The same thing occurs even without a loop too.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
int randomGen, upper, lower, end, newRandomGen;
char answer;
upper = 100;
lower = 1;
end = 1;
do {
srand(time(0));
randomGen = rand()%(upper + lower);
printf("%d\n", randomGen);
scanf("%s\n", &answer);
}while(answer != '=');
}
Whitespace in scanf format strings, like the \n in "%c\n", tries to match any amount of whitespace, and scanf doesn’t know that there’s no whitespace left to skip until it encounters something that isn’t whitespace (like the second character you type) or the end of input. You provide it with =\n, which fills in the %c and waits until the whitespace is over. Then you provide it with another = and scanf returns. The second time around, the character could be anything and it’d still work.
Skip leading whitespace instead (and use the correct specifier for one character, %c, as has been mentioned):
scanf(" %c", &answer);
Also, it’s good practice to make sure you actually succeeded in reading something, especially when failing to read something means leaving it uninitialized and trying to read it later (another example of undefined behaviour). So check scanf’s return value, which should match the number of conversion specifiers you provided:
if (scanf(" %c", &answer) != 1) {
return EXIT_FAILURE;
}
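As a rough illustration of how " %c" behaves across several lines, the sketch below runs the same directive over an in-memory string with sscanf. The helper name collect_chars and the use of %n to advance through the string are assumptions made for this example, not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: stores up to `max` non-whitespace characters from
 * `src` in `out`, reading each one with " %c". Returns how many were read. */
int collect_chars(const char *src, char *out, int max)
{
    int count = 0;
    int used = 0;

    /* The leading space skips newlines left over from the previous read;
     * %n reports how many characters this call consumed. */
    while (count < max && sscanf(src, " %c%n", &out[count], &used) == 1) {
        src += used;
        count++;
    }
    return count;
}
```

Given the input "<\n<\n=\n", this collects '<', '<' and '=' in turn, one per call, without ever waiting for an extra line.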
As has been commented, you should not use the scanf format %s if you want to read a single character. Indeed, you should never use a bare %s for any purpose, because it will read an arbitrary number of characters into the buffer you supply, so you have no way to ensure that your buffer is large enough. So you should always supply a maximum character count. For example, %1s will read only one character. But note: that will still not work with a char variable, since it reads a string and in C, strings are arrays of char terminated with a NUL. (NUL is the character whose value is 0, also sometimes spelled \0. You could just write it as 0, but don't confuse that with the character '0', whose value is 48 in ASCII-based systems.)
So a string containing a single character actually occupies two bytes: the character itself, and a NUL.
If you just want to read a single character, you could use the format %c. %c has a few differences from %s, and you need to be aware of all of them:
The default maximum length read by %s is "unlimited". The default for %c is 1, so %c is identical to %1c.
%s will put a NUL at the end of the characters read (which you need to leave space for), so the result is a C string. %c does not add the NUL, so you only need to leave enough space for the characters themselves.
%s skips whitespace before storing any characters. %c does not ignore whitespace. Note: a newline character (at the end of each line) is considered whitespace.
So, based on the first two rules, you could use either of the following:
char theShortString[2];
scanf("%1s", theShortString);
char theChar = theShortString[0];
or
char theChar;
scanf("%c", &theChar);
Now, when you used
scanf("%s", &theChar);
you will cause scanf to write a NUL (that is, a zero) in the byte following theChar, which quite possibly is part of a different variable. That's really bad. Don't do that. Ever. Even if you get away with it today, it will get you into serious trouble some time soon.
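The differences between %1s and %c listed above can be checked with a small sketch that reads from a string via sscanf. The function names first_via_s and first_via_c are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: first character as seen by %1s.
 * %1s skips leading whitespace and appends a NUL, so it needs two bytes. */
char first_via_s(const char *src)
{
    char s[2] = "";

    if (sscanf(src, "%1s", s) != 1)
        return '\0';
    return s[0];
}

/* Hypothetical helper: first character as seen by %c.
 * %c does not skip whitespace and stores no NUL, so one byte is enough. */
char first_via_c(const char *src)
{
    char c;

    if (sscanf(src, "%c", &c) != 1)
        return '\0';
    return c;
}
```

With the input "\n<", first_via_s yields '<' while first_via_c yields the newline itself.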
But that's not the problem here. The problem here is with what comes after the %s format code.
Let's take a minute (ok, maybe half an hour) to read the documentation of scanf, by typing man scanf. What we'll see, quite near the beginning, is: (emphasis added)
A directive is one of the following:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
So when you use "%s\n", scanf will do the following:
skip over any white-space characters in the input buffer.
read the following word up to but not including the next white-space character, and store it in the corresponding argument, followed by a NUL.
skip over any white-space following the word which it just read.
It does the last step because \n — a newline — is itself white-space, as noted in the quote from the manpage.
Now, what you actually typed was < followed by a newline, so the word read at step 2 will be just the character <. The newline you typed afterwards is white-space, so it will be ignored by step 3. But that doesn't satisfy step 3, because scanf (as documented) will ignore "any amount of white space". It doesn't know that there isn't more white space coming. You might, for example, be intending to type a blank line (that is, just a newline), in which case scanf must skip over that newline as well. So scanf keeps on reading.
Since the input buffer is now empty, the I/O library must now read the next line, which it does. And now you type another < followed by a newline. Clearly, the < is not white-space, so scanf leaves it in the input buffer and returns, knowing that it has done its duty.
Your program then checks the word read by scanf and realises that it is not an =. So it loops again, and the scanf executes again. Now there is already data in the input buffer (the second < which you typed), so scanf can immediately store that word. But it will again try to skip "any amount of white space" afterwards, which by the same logic as above will cause it to read a third line of input, which it leaves in the input buffer.
The end result is that you always need to type the next line before the previous line is passed back to your program. Obviously that's not what you want.
So what's the solution? Simple. Don't put a \n at the end of your format string.
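The greediness of a trailing \n can be measured rather than observed interactively. The sketch below is only an illustration: it uses sscanf on a string and the %n directive to count consumed characters, and both function names are invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: characters consumed by "%c\n" (trailing whitespace
 * directive), or -1 if no character could be read. */
int consumed_with_trailing_newline(const char *src)
{
    char c;
    int n = 0;

    if (sscanf(src, "%c\n%n", &c, &n) != 1)
        return -1;
    return n;
}

/* Hypothetical helper: characters consumed by a plain "%c". */
int consumed_without_trailing_newline(const char *src)
{
    char c;
    int n = 0;

    if (sscanf(src, "%c%n", &c, &n) != 1)
        return -1;
    return n;
}
```

For the string "<\n\n  =", the first version consumes five characters (the '<' plus every whitespace character up to the '='), while the second consumes only one. On an interactive stream, that search for the end of the whitespace is what makes scanf wait for another line.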
Of course, you do want to skip that newline character. But you don't need to skip it until the next call to scanf. If you used a %1s format code, scanf would automatically skip white-space before returning input, but as we've seen above, %c is far simpler if you only want to read a single character. Since %c does not skip white-space before returning input, you need to insert an explicit directive to do so: a white-space character. It's usual to use an actual space rather than a newline for this purpose, so we would normally write this loop as:
char answer;
srand(time(0)); /* Only call srand once, at the beginning of the program */
do {
randomGen = rand()%(upper + lower); /* This is not right */
printf("%d\n", randomGen);
scanf(" %c", &answer);
} while (answer != '=');
scanf("%s\n", &answer);
Here you used the %s conversion specifier in the format string, which tells scanf to read as many non-whitespace characters as possible into a pre-allocated array of chars, followed by a null terminator to make it a C string.
However, answer is a single char. Just writing the terminator is enough to go out of bounds, causing undefined behaviour and strange mishaps.
Instead, you should have used %c. This reads a single character into a char.

What does %[^\n] mean in C?

I saw it in a program which uses scanf for taking multiple word input into a string variable. I don't understand though because I learned that scanf can't take multiple words.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char line[100];
scanf("%[^\n]",line);
printf("Hello,World\n");
printf("%s",line);
return 0;
}
[^\n] works much like a regular-expression character class (in scanf terms it is a scan set, not a full regular expression).
[...]: it matches a nonempty sequence of characters from the scanset (a set of characters given by ...).
^ means that the scanset is "negated": it is given by its complement.
^\n: the scanset is all characters except \n.
Furthermore fscanf (and scanf) will read the longest sequence of input characters matching the format.
So scanf("%[^\n]", s); will read all characters until you reach \n (or EOF) and put them in s. It is a common idiom to read a whole line in C.
See also §7.21.6.2 The fscanf function.
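A short sketch may make the scan-set rules more concrete. It combines two negated scan sets to split a string at the first comma; the function name split_on_comma, the 32-byte buffers, and the comma-separated input are all assumptions chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: splits "first,rest-of-line" using two scan sets.
 * Returns the number of fields stored (0, 1 or 2). */
int split_on_comma(const char *src, char first[32], char rest[32])
{
    int rc;

    first[0] = '\0';
    rest[0] = '\0';

    /* %31[^,]  : longest non-empty run of characters that are not ','
     * ,        : one literal comma
     * %31[^\n] : longest non-empty run of characters that are not '\n' */
    rc = sscanf(src, "%31[^,],%31[^\n]", first, rest);
    return rc < 0 ? 0 : rc;
}
```

Called on "alpha,beta gamma\n" this stores "alpha" and "beta gamma". Called on ",x" it stores nothing, because a scan set must match at least one character.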
scanf("%[^\n]",line); is a problematic way to read a line. It is worse than gets().
C defines line as:
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined.
The scanf("%[^\n]", line) has the specifier "%[^\n]". It scans for an unlimited number of characters that match the scan-set ^\n. If none are read, the specifier fails and scanf() returns with line unaltered. If at least one character is read, all matching characters are read and saved and a null character is appended.
The scan-set ^\n implies all characters that are not (due to the '^') '\n'.
'\n' is not read
scanf("%[^\n]",.... fails to read a new line character '\n'. It remains in stdin. The entire line is not read.
Buffer overflow
The below leads to undefined behavior (UB) should more than 99 characters get read.
char line[100];
scanf("%[^\n]",line); // buffer overflow possible
Does nothing on empty line
When the line consists of only "\n", scanf("%[^\n]",line); returns a 0 without setting line[] - no null character is appended. This can readily lead to undefined behavior should subsequent code use an uninitialized line[]. The '\n' remains in stdin.
Failure to check the return value
scanf("%[^\n]",line); assumes input succeeded. Better code would check the scanf() return value.
Recommendation
Do not use scanf() and instead use fgets() to read a line of input.
#define EXPECTED_INPUT_LENGTH_MAX 49
char line[EXPECTED_INPUT_LENGTH_MAX + 1 + 1 + 1];
// \n + \0 + extra to detect overly long lines
if (fgets(line, sizeof line, stdin)) {
size_t len = strlen(line);
// Lop off potential trailing \n if desired.
if (len > 0 && line[len-1] == '\n') {
line[--len] = '\0';
}
if (len > EXPECTED_INPUT_LENGTH_MAX) {
// Handle error
// Usually includes reading rest of line if \n not found.
}
}
The fgets() approach has its limitations too (e.g., reading embedded null characters).
Handling user input, possibly hostile, is challenging.
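One way to package the fgets() recommendation is a small helper that strips the newline and discards the tail of an over-long line. The following is a sketch under assumptions: the name read_line and its return convention are invented here, and embedded null characters are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf (size bytes).
 * Returns  1 if the whole line fit (newline removed),
 *          0 if the line was too long (the excess is read and discarded),
 *         -1 on end of file or error with nothing read. */
int read_line(FILE *in, char *buf, size_t size)
{
    size_t len;
    int ch;
    int truncated = 0;

    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    len = strcspn(buf, "\n");          /* index of '\n', or of '\0' if none */
    if (buf[len] == '\n') {
        buf[len] = '\0';
        return 1;
    }

    /* No newline stored: the buffer filled up or the input ended.
     * Consume the remainder of the line so the next read starts cleanly. */
    while ((ch = getc(in)) != '\n' && ch != EOF)
        truncated = 1;
    return truncated ? 0 : 1;
}
```

A caller would typically loop with while ((rc = read_line(stdin, line, sizeof line)) >= 0) and treat rc == 0 as "input too long".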
scanf("%[^\n]",line);
means: scan till \n or an enter key.
scanf("%[^\n]",line);
Will read user input until enter is pressed or a newline character is added (\n) and store it into a variable named line.
Question: what does %[^\n] mean in C?
\n is the newline character. [^\n] is a negated (complemented) scan set: it matches every character up to, but not including, the next newline and stores those characters in the given array.
Without the scanf call the array is never filled in, so printing it shows whatever garbage happened to be in memory.
Example:
char number[100]; // character array to hold the input, e.g. StackOverflow
scanf("%[^\n]",number); // without this statement, number would hold unwanted garbage
printf("HI\n"); // ordinary use of printf
printf("%s",number); // print the input
return 0;

Why won't this scanf format-string work? "%[^\n]\n"

I've seen a few examples where people give scanf a "%[^\n]\n" format string to read a whole line of user input. If my understanding is correct, this will read every character until a newline character is reached, and then the newline is consumed by scanf (and not included in the resulting input).
But I can't get this to work on my machine. A simple example I've tried:
#include <stdio.h>
int main(void)
{
char input[64];
printf("Enter some input: ");
scanf("%[^\n]\n", input);
printf("You entered %s\n", input);
}
When I run this, I'm prompted for input, I type some characters, I hit Enter, and the cursor goes to the beginning of the next line but the scanf call doesn't finish.
I can hit Enter as many times as I like, and it will never finish.
The only ways I've found to conclude the scanf call are:
enter \n as the first (and only) character at the prompt
enter Ctrl-d as the first (and only) character at the prompt
enter some input, one or more \n, zero or more other characters, and enter Ctrl-d
I don't know if this is machine dependent, but I'm very curious to know what's going on. I'm on OS X, if that's relevant.
According to the documentation for scanf (emphasis mine):
The format string consists of whitespace characters (any single whitespace character in the format string consumes all available consecutive whitespace characters from the input), non-whitespace multibyte characters except % (each such character in the format string consumes exactly one identical character from the input) and conversion specifications.
Thus, your format string %[^\n]\n will first read (and store) an arbitrary number of non-newline characters from the input (because of the %[^\n] part) and then, because of the following newline, read (and discard) an arbitrary number of whitespace characters, such as spaces, tabs or newlines.
Thus, to make your scanf stop reading input, you either need to type at least one non-whitespace character after the newline, or else arrange for the input stream to end (e.g. by pressing Ctrl+D on Unix-ish systems).
Instead, to make your code work as you expect, just remove the last \n from the end of your format string (as already suggested by Umamahesh P).
Of course, this will leave the newline still in the input stream. To get rid of it (in case you want to read another line later), you can getc it off the stream, or just append %*c (which means "read one character and discard it") or even %*1[\n] (read one newline and discard it) to the end of your scanf format string.
Ps. Note that your code has a couple of other problems. For example, to avoid buffer overflow bugs, you really should use %63[^\n] instead of %[^\n] to limit the number of characters scanf will read into your buffer. (The limit needs to be one less than the size of your buffer, since scanf will always append a trailing null character.)
Also, the %[ format specifier always expects at least one matching character, and will fail if none is available. Thus, if you press enter immediately without typing anything, your scanf will fail (silently, since you don't check the return value) and will leave your input buffer filled with random garbage. To avoid this, you should a) check the return value of scanf, b) set input[0] = '\0' before calling scanf, or c) preferably both.
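Putting those points together (a width limit, a pre-cleared buffer, a checked return value, and removal of a single newline) might look roughly like the sketch below. The name read_line64 is hypothetical, and the newline is taken off the stream with getc rather than with %*c; a line longer than 63 characters is simply left to be continued by the next call.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to 63 characters of one line from `in`.
 * Returns 1 if a line (possibly empty) was read, 0 at end of input. */
int read_line64(FILE *in, char input[64])
{
    int rc;
    int ch;

    input[0] = '\0';                 /* defined contents even if %[ fails */
    rc = fscanf(in, "%63[^\n]", input);
    if (rc == EOF)
        return 0;                    /* nothing left to read */

    /* rc == 0 means the line was empty: the newline is still unread.
     * Remove exactly one newline, and put back anything else. */
    ch = getc(in);
    if (ch != '\n' && ch != EOF)
        ungetc(ch, in);
    return 1;
}
```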
Finally, note that, if you just want to read input line by line, it's much easier to just use fgets. Yes, you'll need to strip the trailing newline character (if any) yourself if you don't want it, but that's still a lot easier and safer than trying to use scanf for a job it's not really meant for:
#include <stdio.h>
#include <string.h>
void chomp(char *string) {
size_t len = strlen(string); /* strlen returns size_t, not int */
if (len > 0 && string[len-1] == '\n') string[len-1] = '\0';
}
int main(void)
{
char input[64];
printf("Enter some input: ");
if (fgets(input, sizeof(input), stdin) == NULL) return 1; /* EOF or read error: input holds nothing usable */
chomp(input);
printf("You entered \"%s\".\n", input);
}
Whitespace characters in the format of scanf() have a special meaning:
Whitespace character: the function will read and ignore any whitespace
characters encountered before the next non-whitespace character
(whitespace characters include spaces, newline and tab characters --
see isspace). A single whitespace in the format string validates any
quantity of whitespace characters extracted from the stream (including
none).
Thus, "%[^\n]\n" is just equivalent to "%[^\n] ", telling scanf() to ignore all whitespace characters after %[^\n]. This is why all '\n's are ignored until a non-whitespace character is entered, which is what happened in your case.
Reference: http://www.cplusplus.com/reference/cstdio/scanf/
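The claimed equivalence can be checked by counting how many characters each format consumes from the same string. This is an illustrative sketch only: the two function names are invented, a width of 63 is added to keep the buffer safe, and %n is used to report the count.

```c
#include <stdio.h>

/* Hypothetical helper: characters consumed by "%[^\n]\n", or -1 on failure. */
int consumed_newline_format(const char *src)
{
    char input[64];
    int n = 0;

    if (sscanf(src, "%63[^\n]\n%n", input, &n) != 1)
        return -1;
    return n;
}

/* Hypothetical helper: characters consumed by "%[^\n] ", or -1 on failure. */
int consumed_space_format(const char *src)
{
    char input[64];
    int n = 0;

    if (sscanf(src, "%63[^\n] %n", input, &n) != 1)
        return -1;
    return n;
}
```

For "abc\n\n \tdef" both report 7: three characters for "abc" and four for the run of newlines, space and tab that follows.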
Remove the second newline character; the following is sufficient.
scanf("%[^\n]", input);
To answer the original one,
scanf("%[^\n]\n", input);
This should also work, provided you enter a non white space character after the input. Example:
Enter some input: lkfjdlfkjdlfjdlfjldj
t
You entered lkfjdlfkjdlfjdlfjldj

What is the behavior of %(limit)[^\n] in scanf? Is it safe from overflow?

Is the format %(limit)[^\n] for the scanf function unsafe? (Here (limit) is the length of the string minus 1.)
If it is unsafe, why?
And is there a safe way to implement a function that reads strings using only scanf()?
In the Linux Programmer's Manual (man scanf in a terminal), the description of the s format says:
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'),which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
Does the input string always stop at the maximum field width, or is that just GCC?
Thanks.
%(limit)[^\n] for scanf is usually safe.
In the below example, at most 99 char will be read and saved into buf. If any char are saved, a '\0' will be appended and cnt will be 1.
char buf[100];
int cnt = scanf("%99[^\n]", buf);
This functionality is certainly safe, but what about others?
Problems occur when the input is a lone "\n".
In this case, nothing is saved in buf and 0 is returned. Had the next line of code been the following, the output is Undefined Behavior as buf is not initialized to anything.
puts(buf);
A better following line would be
if (cnt == 1) puts(buf);
else printf("Return count = %d\n", cnt);
Problems because the '\n' was not consumed.
The '\n' is still waiting to be read and another call to scanf("%99[^\n]", buf); will not read the '\n'.
Q: Is there a safe way to implement a function that reads strings using only scanf()?
A: Pedantically: Not easily.
scanf(), fgets(), etc. are best used for reading text, not strings. In C a string is an array of char terminated with a '\0'. Input via scanf(), fgets(), etc. typically have issues reading '\0' and typically that char is not in the input anyways. Usually input is thought of as groups of char terminated by '\n' or other white-space.
If code is reading input terminated with '\n', using fgets() works well and is portable. fgets() too has its weaknesses, which are handled in various ways. getline() (POSIX) is a nice alternative.
A close approximation would be scanf(" %99[^\n]", buf) (note the added " "), but that alone does not solve handling excessively long lines, reading multiple empty lines, embedded '\0' detection, the loss of the ability to report the length read (strlen() does not work due to embedded '\0'), or its leaving the trailing '\n' in stdin.
Short of using scanf("%c", &ch) with lots of surrounding code (which is silly, just use fgetc()) , I see no way to use a single scanf() absolutely safely when reading a line of user input.
Q: Does the input string always stop at the maximum field width?
A: With scanf("%99[^\n]", buf), input stops when 1) a '\n' is encountered (the '\n' is not saved and remains in the file input buffer), 2) 99 char have been read, 3) EOF occurs, or 4) an I/O error occurs (rare).
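The return values and the stopping rule described above can be exercised on strings with sscanf. The wrapper name scan99 below is hypothetical, and clearing buf first is an added precaution so that the buffer is defined even when nothing is stored.

```c
#include <stdio.h>

/* Hypothetical helper: applies "%99[^\n]" to a string.
 * Returns 1 if text was stored, 0 if the input starts with '\n',
 * and EOF if the input is empty. buf is always a valid string afterwards. */
int scan99(const char *src, char buf[100])
{
    buf[0] = '\0';
    return sscanf(src, "%99[^\n]", buf);
}
```

Feeding it 150 consecutive 'a' characters stores exactly 99 of them plus the terminating '\0', which is why the width must be one less than the array size.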
The [^\n] is to make scanf read input until it meets a new line character...while the limit is the maximum number of characters scanf should read...

Weird scanf behaviour when reading number and newline

I just realized this 'bug' of scanf now, after 8 years with C.
The scanf code below skips the leading whitespace characters on the second line of input.
int x;
char in[100];
scanf("%d\n",&x);
gets(in);
Input:
1
s
x will contain 1, but in will be just "s" not " s"
Is this standard C or just gcc behaviour?
A whitespace character in your scanf format string will cause scanf to consume any (and all) white space till a non-whitespace char occurs.
This seems to be standard scanf behaviour and is not limited to gcc.
It's not a bug in scanf; the manual of scanf says,
A sequence of white-space characters (space, tab, newline, etc.; see
isspace(3)). This directive matches any amount of white space,
including none, in the input.
Which means that a directive such as %d\n reads a number and then consumes a sequence of white-space characters in the input, returning only once you type a non-white-space character. That is why you see only "s" without a space before it.
The '\n' (and this is true for any whitespace character in the format string) in
scanf("%d\n", &x);
matches any number of whitespace characters in the input (characters for which the isspace function returns a nonzero value, i.e., true, such as newline, space, tab etc.) and not just the newline character '\n'. This means that scanf will read all whitespace characters in the input and discard them till it encounters a non-whitespace character. This explains the behaviour you observed.
This is part of the standard definition of the scanf function and not a gcc feature. Also, the gets function is unsafe and was removed from the C standard in C11. It does not check for buffer overrun and can lead to bugs and even program crashes. In fact, gcc emits a warning against the use of gets on my machine. Use fgets instead.
To do what you want, you can do the following:
int x;
char in[100];
scanf("%d", &x);
After scanf returns successfully, the input stream can contain any sequence of characters terminated by a newline depending on the input given by the user. Get rid of those extraneous characters before reading a string from the stdin.
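int ch; // int rather than char, so that EOF can be distinguished from valid characters
while((ch = getchar()) != '\n' && ch != EOF); // null statement; stops at newline or EOF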
fgets(in, 100, stdin);
The above fgets call means that it will read at most 100-1 = 99 (it saves one character space for the terminating null byte which it adds to the buffer being read into before exiting) characters from the stream pointed to by stdin and store them in the buffer pointed to by in. fgets will exit if it encounters EOF, '\n' or it has already read 100-1 characters - whichever of the three condition occurs first. If it reads a newline, it will store it into the buffer.
If the user enters 100 characters or more in this case, the extraneous characters remain in the input buffer and can interfere with subsequent input operations by scanf, fgets, getchar, etc. Because fgets stores at most 99 characters, strlen(in) can never exceed 99; instead, check whether a newline was stored in in (this needs <string.h>).
if(strchr(in, '\n') == NULL) {
// no newline stored: the line was truncated (or input ended)
// read and discard the rest of the line
int ch;
while((ch = getchar()) != '\n' && ch != EOF); // null statement
}
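The whole sequence (read the number, throw away the rest of that line, then read the next line with its leading spaces intact) can be gathered into one function. What follows is a sketch with invented names (skip_rest_of_line, read_number_then_line); it takes a FILE * so it can be tried on a file as well as on stdin, and it assumes size is at least 1.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: discards input up to and including the next newline. */
static void skip_rest_of_line(FILE *in)
{
    int ch;   /* int, so EOF is distinguishable from every character */

    while ((ch = getc(in)) != '\n' && ch != EOF)
        ;     /* null statement */
}

/* Hypothetical helper: reads an int, then the following line (newline removed).
 * Returns 1 on success, 0 if either read fails. */
int read_number_then_line(FILE *in, int *x, char *line, size_t size)
{
    if (fscanf(in, "%d", x) != 1)      /* no trailing \n in the format */
        return 0;
    skip_rest_of_line(in);             /* removes only the number's own line */
    if (fgets(line, (int)size, in) == NULL)
        return 0;
    line[strcspn(line, "\n")] = '\0';  /* drop the stored newline, if any */
    return 1;
}
```

With the input "1\n  s\n" this yields x == 1 and the line "  s", leading spaces included.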

Resources