how to read scanf with spaces - c

I'm having a weird problem
i'm trying to read a string from a console with scanf()
like this
scanf("%[^\n]",string1);
but it doesnt read anything. it just skips the entire scanf.
I'm trying it in gcc compiler

Trying to use scanf to read strings with spaces can bring unnecessary problems of buffer overflow and stray newlines staying in the input buffer to be read later. gets() is often suggested as a solution to this, however,
From the manpage:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because gets()
will continue to store characters past
the end of the buffer, it is extremely
dangerous to use. It has been used to
break computer security. Use fgets()
instead.
So instead of using gets, use fgets with the STDIN stream to read strings from the keyboard

That should work fine, so something else is going wrong. As hobbs suggests, you might have a newline on the input, in which case this won't match anything. It also won't consume a newline, so if you do this in a loop, the first call will get up to the newline and then the next call will get nothing. If you want to read the newline, you need another call, or use a space in the format string to skip whitespace. Its also a good idea to check the return value of scanf to see if it actually matched any format specifiers.
Also, you probably want to specify a maximum length in order to avoid overflowing the buffer. So you want something like:
char buffer[100];
if (scanf(" %99[^\n]", buffer) == 1) {
/* read something into buffer */
This will skip (ignore) any blank lines and whitespace on the beginning of a line and read up to 99 characters of input up to and not including a newline. Trailing or embedded whitespace will not be skipped, only leading whitespace.

I'll bet your scanf call is inside a loop. I'll bet it works the first time you call it. I'll bet it only fails on the second and later times.
The first time, it will read until it reaches a newline character. The newline character will remain unread. (Odds are that the library internally does read it and calls ungetc to unread it, but that doesn't matter, because from your program's point of view the newline is unread.)
The second time, it will read until it reaches a newline character. That newline character is still waiting at the front of the line and scanf will read all 0 of the characters that are waiting ahead of it.
The third time ... the same.
You probably want this:
if (scanf("%99[^\n]%*c", buffer) == 1) {
Edit: I accidentally copied and pasted from another answer instead of from the question, before inserting the %*c as intended. This resulting line of code will behave strangely if you have a line of input longer than 100 bytes, because the %*c will eat an ordinary byte instead of the newline.
However, notice how dangerous it would be to do this:
scanf("%[^n]%*c", string1);
because there, if you have a line of input longer than your buffer, the input will walk all over your other variables and stack and everything. This is called buffer overflow (even if the overflow goes onto the stack).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *text(int n);
int main()
{
char str[10];
printf("enter username : ");
scanf(text(9),str);
printf("username = %s",str);
return 0;
}
char *text(int n)
{
fflush(stdin);fflush(stdout);
char str[50]="%",buf[50],st2[10]="[^\n]s";
char *s;itoa(n,buf,10);
// n == -1 no buffer protection
if(n != -1) strcat(str,buf);
strcat(str,st2);s=strdup(str);
fflush(stdin);fflush(stdout);
return s;
}

Related

Why is this creating two inputs instead of one

https://i.imgur.com/FLxF9sP.png
As shown in the link above I have to input '<' twice instead of once, why is that? Also it seems that the first input is ignored but the second '<' is the one the program recognizes.
The same thing occurs even without a loop too.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
int randomGen, upper, lower, end, newRandomGen;
char answer;
upper = 100;
lower = 1;
end = 1;
do {
srand(time(0));
randomGen = rand()%(upper + lower);
printf("%d\n", randomGen);
scanf("%s\n", &answer);
}while(answer != '=');
}
Whitespace in scanf format strings, like the \n in "%c\n", tries to match any amount of whitespace, and scanf doesn’t know that there’s no whitespace left to skip until it encounters something that isn’t whitespace (like the second character you type) or the end of input. You provide it with =\n, which fills in the %c and waits until the whitespace is over. Then you provide it with another = and scanf returns. The second time around, the character could be anything and it’d still work.
Skip leading whitespace instead (and use the correct specifier for one character, %c, as has been mentioned):
scanf(" %c", &answer);
Also, it’s good practice to make sure you actually succeeded in reading something, especially when failing to read something means leaving it uninitialized and trying to read it later (another example of undefined behaviour). So check scanf’s return value, which should match the number of conversion specifiers you provided:
if (scanf(" %c", &answer) != 1) {
return EXIT_FAILURE;
}
As has been commented, you should not use the scanf format %s if you want to read a single character. Indeed, you should never use the scanf format %s for any purpose, because it will read an arbitrary number of characters into the buffer you supply, so you have no way to ensure that your buffer is large enough. So you should always supply a maximum character count. For example, %1s will read only one character. But note: that will still not work with a char variable, since it reads a string and in C, strings are arrays of char terminated with a NUL. (NUL is the character whose value is 0, also sometimes spelled \0. You could just write it as 0, but don't confuse that with the character '0' (whose value is 48, in most modern systems).
So a string containing a single character actually occupies two bytes: the character itself, and a NUL.
If you just want to read a single character, you could use the format %c. %c has a few differences from %s, and you need to be aware of all of them:
The default maximum length read by %s is "unlimited". The default for %c is 1, so %c is identical to %1c.
%s will put a NUL at the end of the characters read (which you need to leave space for), so the result is a C string. %c does not add the NUL, so you only need to leave enough space for the characters themselves.
%s skips whitespace before storing any characters. %c does not ignore whitespace. Note: a newline character (at the end of each line) is considered whitespace.
So, based on the first two rules, you could use either of the following:
char theShortString[2];
scanf("%1s", theShortString);
char theChar = theShortString[0];
or
char theChar;
scanf("%c", &theChar);
Now, when you used
scanf("%s", &theChar);
you will cause scanf to write a NUL (that is, a zero) in the byte following theChar, which quite possibly is part of a different variable. That's really bad. Don't do that. Ever. Even if you get away with it today, it will get you into serious trouble some time soon.
But that's not the problem here. The problem here is with what comes after the %s format code.
Let's take a minute (ok, maybe half an hour) to read the documentation of scanf, by typing man scanf. What we'll see, quite near the beginning, is: (emphasis added)
A directive is one of the following:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
So when you use "%s\n", scanf will do the following:
skip over any white-space characters in the input buffer.
read the following word up to but not including the next white-space character, and store it in the corresponding argument, followed by a NUL.
skip over any white-space following the word which it just read.
It does the last step because \n — a newline — is itself white-space, as noted in the quote from the manpage.
Now, what you actually typed was < followed by a newline, so the word read at step 2 will be just he character <. The newline you typed afterwards is white-space, so it will be ignored by step 3. But that doesn't satisfy step 3, because scanf (as documented) will ignore "any amount of white space". It doesn't know that there isn't more white space coming. You might, for example, be intending to type a blank line (that is, just a newline), in which case scanf must skip over that newline as well. So scanf keeps on reading.
Since the input buffer is now empty, the I/O library must now read the next line, which it does. And now you type another < followed by a newline. Clearly, the < is not white-space, so scanf leaves it in the input buffer and returns, knowing that it has done its duty.
Your program then checks the word read by scanf and realises that it is not an =. So it loops again, and the scanf executes again. Now there is already data in the input buffer (the second < which you typed), so scanf can immediately store that word. But it will again try to skip "any amount of white space" afterwards, which by the same logic as above will cause it to read a third line of input, which it leaves in the input buffer.
The end result is that you always need to type the next line before the previous line is passed back to your program. Obviously that's not what you want.
So what's the solution? Simple. Don't put a \n at the end of your format string.
Of course, you do want to skip that newline character. But you don't need to skip it until the next call to scanf. If you used a %1s format code, scanf would automatically skip white-space before returning input, but as we've seen above, %c is far simpler if you only want to read a single character. Since %c does not skip white-space before returning input, you need to insert an explicit directive to do so: a white-space character. It's usual to use an actual space rather than a newline for this purpose, so we would normally write this loop as:
char answer;
srand(time(0)); /* Only call srand once, at the beginning of the program */
do {
randomGen = rand()%(upper + lower); /* This is not right */
printf("%d\n", randomGen);
scanf(" %c", &answer);
} while (answer != '=');
scanf("%s\n", &answer);
Here you used the %s flag in the format string, which tells scanf to read as many characters as possible into a pre-allocated array of chars, then a null terminator to make it a C-string.
However, answer is a single char. Just writing the terminator is enough to go out of bounds, causing undefined behaviour and strange mishaps.
Instead, you should have used %c. This reads a single character into a char.

How fget works?

I am using gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
I am writing a very simple script to take string as input and print the same with the some custom message. First user enters the T(no. of times to take string) and then takes input by fgets.. I used this&this as reference. I am getting a very strange output ie fgets is adding some extra new lines and even loop is not working properly for T=2 it is asking input only once. Can anybody explain me what is wrong with snippet. Thanks in advance!
#include<stdio.h>
#include<string.h>
int main()
{
int T;
scanf("%d",&T);
while(T--){
char str[100]={""};
int i;
printf("Hello\n");
fgets(str,80,stdin);
i = strlen(str)-1;
if(str[i]=='\n')
str[i]='\0';
printf("World\n");
printf("%s\n",str);
}
return 0;
}
Please see the image reference for T=2, even T=2 it is taking string only once and order of printing statement is also not as expected.
This is because your scanf() call
scanf("%d",&T);
does not consume the new line character \n accompanied by your input.
When you input T, your input is 2Enter.
scanf() consumes the 2, sees \n and stops consuming, leaving \n in the input buffer.
When the fgets() gets called, it sees the \n in the buffer and treats this as the input.
To solve this, you have a few choices (in decreasing order of my subjective preference):
Use fgets() to get the first input, then parse it using strtol() to get T.
Add an extra fgets() after the scanf().
Add getchar() to consume one extra character. This works if you are certain that exactly one \n will be present after the input. So for example it won't work if you type 2SpaceEnter. Alternatively, you may use while(getchar() != '\n'); after the scanf() to consume everything up to a new line, but this may cause problems if an empty line is a valid input to your later fgets() calls.
If your implementation supports it, you may use fpurge(stdin) or __fpurge(stdin).
And, very importantly, do not use fflush(stdin) unless your implementation clearly defines its behavior. Otherwise it is undefined behavior. You may refer to this question for more details. Also, note that the fpurge() and fflush() methods may not work correctly if your input can be piped into the program.
This line
scanf("%d",&T);
reads the input until the first non-numeral is found, which is newline. Then fgets() reads that newline for its first input.
I suggest using fgets() to read the number too.
fgets(str,80,stdin);
sscanf(str, "%d", &T);
With your first call to scanf you allow the user to input an integer for the number of loops. They do so, and use a carriage return to signal the end of input. At this point scanf reads the integer, but leaves the end-of-line (\n) in the stream..
So when you call fgets, fgets gets from the stream until it reaches the first newline -- in you code, on the first loop, this is the very first character it encounters.
If you discard the newline before calling fgets, you should get your desired results. You can do this, for example by changing your first three lines to:
int T;
char newline;
scanf("%d%c",&T, &newline);
Although there are probably stylistically superior ways of doing this.
You have a newline character in the input stream when you press "enter" after typing 2 for the first scanf. This is consumed by fgets() initially, hence it prints nothing (You replace the newline by \0).
In the next line, your input is read and echoed after World because you are printing it after it.

What is the behavior of %(limit)[^\n] in scanf ? It is safety from overflow?

The format %(limit)[^\n] for scanf function is unsafe ? (where (limit) is the length -1 of the string)
If it is unsafe, why ?
And there is a safe way to implement a function that catch strings just using scanf() ?
On Linux Programmer's Manual, (typing man scanf on terminal), the s format said:
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'),which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
The input string stops at maximum field width always ? Or is just on GCC ?
Thanks.
%(limit)[^\n] for scanf" is usually safe.
In the below example, at most 99 char will be read and saved into buf. If any char are saved, a '\0' will be appended and cnt will be 1.
char buf[100];
int cnt = scanf("%99[^\n]", buf);
This functionality is certainly safe, but what about others?
Problems occur when the input is a lone "\n".
In this case, nothing is saved in buf and 0 is returned. Had the next line of code been the following, the output is Undefined Behavior as buf is not initialized to anything.
puts(buf);
A better following line would be
if (cnt == 1) puts(buf);
else printf("Return count = %d\n", cnt);
Problems because the '\n' was not consumed.
The '\n' is still waiting to be read and another call to scanf("%99[^\n]", buf); will not read the '\n'.
Q: is a safe way to implement a function that catch strings just using scanf()
A: Pedantically: Not easily.
scanf(), fgets(), etc. are best used for reading text, not strings. In C a string is an array of char terminated with a '\0'. Input via scanf(), fgets(), etc. typically have issues reading '\0' and typically that char is not in the input anyways. Usually input is thought of as groups of char terminated by '\n' or other white-space.
If code is reading input terminated with '\n', using fgets() works well and is portable. fgets() too has it weakness that are handled in various ways . getline() is a nice alternative.
A close approximate would be scanf(" %99[^\n]", buf) (note the added " "), but alone that does not solve handing excessive long lines, reading multiple empty lines, embedded '\0' detection, loss of ability to report length read (strlen() does not work due to embedded '\0') and its leaving the trailing '\n' in stdin.
Short of using scanf("%c", &ch) with lots of surrounding code (which is silly, just use fgetc()) , I see no way to use a single scanf() absolutely safely when reading a line of user input.
Q: The input string stops at maximum field width always ?
A: With scanf("%99[^\n]", input stops 1) when a '\n' is encountered - the '\n' is not saved and remains in the file input buffer 2) 99 char have been read 3) EOF occurs or 4) IO error occurs (rare).
The [^\n] is to make scanf read input until it meets a new line character...while the limit is the maximum number of characters scanf should read...

scanf problem in input

hi I am having problems using scanf when reading two strings with spaces consecutively
char name[50] = {0};
char address[100] = {0};
char name1[50] = {0};
char address1[100] = {0};
int size = 0;
//input = fopen("/dev/tty","r+");
//output = fopen("/dev/tty","w");
printf("\nenter the name:");
scanf("%[^\n]s",name);
//fgets(name,sizeof(name),input); // this works fine
printf("\nenter the address:");
scanf("%[^\n]s",address);
//fgets(address,sizeof(address),input); // this works fine
the input for address is not taken at all.. maybe it takes the return key as an input?
The newline ('\n') character is still on the input stream after the first scanf call, so the second scanf call sees it immediately, and immediately stops reading.
I notice that you mention fgets in comments - why not use that ? It does what you want to do quite well.
You have a couple of problems.
As #sander pointed out, you're not doing anything to clear the newline out of the input buffer.
You've also used %[^\n]s -- but the s isn't needed for (nor it is part of) a scanset conversion. Since it's not part of the conversion, scanf attempts to match that character in the input -- but since you've just read up to a new-line (and not read the newline itself) it's more or less demanding that 's' == '\n' -- which obviously can't be, so the scan fails.
To make this work, you could use something like this:
scanf("%49[^\n]%*c", name);
scanf("%99[^\n]%*c", address);
As to why you'd want to use this instead of fgets, one obvious reason is that it does not include the trailing (and rarely desired) newline character when things work correctly. fgets retaining the newline does give you one result that can be useful though: the newline is present if and only if the entire content of the line has been read. If you're really concerned about that (e.g., you want to resize your buffer and continue reading if the name exceeds the specified size) you can get the same with scanf as well -- instead of using %*c for the second conversion, use %c, and pass the address of a char. You read the entire line if and only if you've read a newline into that char afterwards.

How do I return a printf statement if the user inputs nothing?

I want to execute a statement based on the input of the user:
#include <stdio.h>
#include <string.h>
void main() {
char string_input[40];
int i;
printf("Enter data ==> ");
scanf("%s", string_input);
if (string_input[0] == '\n') {
printf("ERROR - no data\n");
}
else if (strlen(string_input) > 40) {
printf("Hex equivalent is ");
}
else {
printf("Hex equivalent is ");
}
}
When I run it, and just press enter, it goes to a new line instead of saying "ERROR - no data".
What do I do?
CANNOT USE FGETS as we have not gone over this in class.
Use
char enter[1];
int chk = scanf("%39[^\n]%c", string_input, enter);
but string_input will not have a '\n' inside. Your test
if (string_input[0] == '\n') {
printf("ERROR - no data\n");
}
will have to be changed to, for example
if (chk != 2) {
printf("ERROR - bad data\n");
}
use fgets instead of scanf. scanf doesn't check if user enters a string longer than 40 chars in your example above so for your particular case fgets should be simpler(safer).
Can you use a while loop and getch, then test for the <Enter> key on each keystroke?
scanf won't return until it sees something other than whitespace. It also doesn't distinguish between newlines and other whitespace. In practice, using scanf is almost always a mistake; I suggest that you call fgets instead and then (if you need to) use sscanf on the resulting data.
If you do that, you really ought to deal with the possibility that the user enters a line longer than the buffer you pass to fgets; you can tell when this has happened because your entire buffer gets filled and the last character isn't a newline. In that situation, you should reallocate a larger buffer and fgets again onto the end of it, and repeat until either you see a newline or the buffer gets unreasonably large.
You should really be similarly careful when calling scanf or sscanf -- what if the user enters a string 100 characters long? (You can tell scanf or sscanf to accept only a limited length of string.)
On the other hand, if this is just a toy program you can just make your buffer reasonably long and hope the user doesn't do anything nasty.
fgets does what you need. Avoid using scanf or gets. If you can't use fgets try using getchar
The problem is that "%s" attempts to skip white-space, and then read a string -- and according to scanf, a new-line is "whitespace".
The obvious alternative would be to use "%c" instead of "%s". The difference between the two is that "%c" does not attempt to skip leading whitespace.
A somewhat less obvious (or less known, anyway) alternative would be to use "%[^\n]%*[\n]". This reads data until it encounters a new-line, then reads the new-line and doesn't assign it to anything.
Regardless of which conversion you use, you want (need, really) to limit the amount of input entered so it doesn't overflow the buffer you've provided, so you'd want to use "%39c" or "%39[^\n]". Note that when you're specifying the length for scanf, you need to subtract one to leave space for the NUL terminator (in contrast to fgets, for which you specify the full buffer size).
What platform are you running on?
Is the character sent when your press the ENTER key actually '\n', or might it be '\r'? Or even both one after the other (ie. "\r\n").

Resources