What's the difference between scanf("%s"), scanf(" %s") and scanf("%s ")? - c

I am confused by this syntax. At first, I thought it was a printing error in the book. But after programming for quite a long time, I came to know that they have different meanings. Still, I can't get a clear picture of that syntax.
Likewise, what's the difference between:
gets( str);
and
gets(str);
Does whitespace matter? If yes, then how?

When adding a space in the scanf format string, you tell scanf to read and skip whitespace. It can be useful for skipping newlines in the input, for example. Also note that some formats automatically skip whitespace anyway.
See e.g. here for a good reference of the scanf family of functions.
The difference between
gets(str);
and
gets( str );
is none at all. Actual code outside of string literals can be formatted with any amount of whitespace. You could even write the above call as
gets
(
str
)
;
It would still be the same.
Oh, and the gets function has long been deprecated and was removed from the C standard in C11. You should use fgets instead.

White space (such as blanks, tabs, or newlines) in the format string match any amount of white space, including none, in the input.
http://www.manpagez.com/man/3/scanf/
In gets the space does not mean anything. It's ignored at compile time.

A compiler works in several phases. In the first one, lexical analysis, whitespace that only separates tokens is discarded. The space here is of that kind, so there is no difference between gets(a) and gets( a).

There are two important things to learn about scanf here:
All conversion specifiers except %c, %[ and %n skip whitespace before the scanned item.
You can explicitly invoke this behavior of ignoring all whitespace as follows:
scanf(" %c", &mychar)
scanf("\n%c", &mychar)
scanf("\t%c", &mychar)
That is, any whitespace character (including spaces) in your conversion string instructs scanf to ignore any and all whitespace up until the scanned item.
Since all conversion specifiers except %c, %[ and %n do this automatically, the answer to your original question about scanf("%s") versus scanf(" %s") is that there is no difference.
I would recommend reading all the scanf questions at the C FAQ and writing some test programs to get a better grasp of it all:
http://c-faq.com/stdio/scanfprobs.html

Related

why does this part of the program skip the second input?

On entering the first inputs this part of my program terminates.
float p_x,p_y,q_x,q_y,r_x,r_y;
printf("Note :- coordinates are entered as follows (x,y) \n");
printf("Enter the coordinates of 'p' :");
scanf("(%f,%f)",&p_x,&p_y);
printf("Enter the coordinates of 'q' :");
scanf("(%f,%f)",&q_x,&q_y);
Short answer: Change the second scanf call to
scanf(" (%f,%f)", &q_x, &q_y);
Note the extra space in the format string.
Longer explanation: One of scanf's quirks is that it only reads what it needs, and leaves everything else sitting on the input stream for next time. So when you read p_x and p_y, scanf reads the (, the value for p_x, the ,, the value for p_y, and the ), but it doesn't do anything with the invisible final \n on the line, the "newline" character that's there as a result of the Enter key you typed.
So, later, when you try to read q_x and q_y, the first thing scanf wants to see is the ( you said should be there, but the first thing scanf actually sees is that \n character left over from last time. So the second scanf call fails, reading nothing.
In a scanf format string, a space character means "read and ignore whitespace". So if you change the second format string to " (%f,%f)", scanf will read and ignore the stray newline, and then it will be able to read the input you asked for.
The reason you don't have this problem all the time is that most (though not all) of the scanf format characters automatically skip leading whitespace as a part of doing their work. If you had, say, a simpler call to
scanf("%f", &q_x);
followed by a later call to
scanf("%f", &q_y);
it would work just fine, without any extra explicit space characters in the format strings.
General advice: You may be thinking that this is a pretty lame situation. You may have gotten the impression that scanf was supposed to be a nice, simple way to do input in your C programs, and you may be thinking, "But this is not simple! How was I supposed to know to write " (%f,%f)"?" And I would absolutely agree with you: This is a pretty lame situation!
The fact is that scanf only seems to be a nice, simple way to do input. It's actually a terribly complicated, unforgiving, nearly useless mess. It's only simple if you're reading very simple input, such as single numbers or simple strings (not containing spaces), or maybe single characters. For anything more complicated than that, there are so many quirks and foibles and exceptions and special cases that it almost never works the first time, and it's often more trouble than it's worth to even try to get it working.
So the general advice is to only try to use scanf for very simple input, during the first few weeks of your C programming career. You can read what I mean by "very simple input" in this answer to a previous question. Then, once you've gotten a few skills under your belt, or when you need to do something a little more complicated, it's time to learn how to do input using better techniques than scanf.
Check return value to identify problems.
// scanf("(%f,%f)",&q_x,&q_y);
if (scanf("(%f,%f)",&q_x,&q_y) != 2) {
puts("Bad input");
}
Tolerate and consume spaces, newlines, tabs, etc. between portions of the input. OP did not do this: the '\n' of the first entry was not consumed and caused the 2nd scanf() to fail. "%f" already allows optional leading whitespace. Add a " " before each of the fixed characters in the format, namely '(' and ',' and ')', so that optional whitespace gets consumed there too.
// v---v---v add spaces to format to allow any white-space in input.
if (scanf(" (%f ,%f )",&q_x,&q_y) != 2) {
puts("Bad input");
}

What is the difference between %s and %s%*c [duplicate]

Hi I am reading some code and this line has been used:
scanf("%s%*c",dati[i].part);
What does %s%*c do and why not just use %s?
What does %s%*c do
The %s has the same meaning as anywhere else -- skip leading whitespace and scan the next sequence of non-whitespace characters into the specified character array.
The %*c means the same thing as %c -- read the next input character, whatever it is (i.e. without skipping leading whitespace) -- except that the * within means that the result should not be assigned anywhere, and therefore that no corresponding pointer argument should be expected. Also, assignment suppression means that scanf's return value is not affected by whether that field is successfully scanned.
and why not just use %s?
We cannot say for sure why the author of the code in which you saw it used %s%*c, except for the unsatisfying "because that's what the author thought was appropriate." We have no context at all for making any other judgement.
Certainly the actual effect is to consume the next input character after the string, if any. If there is such a character then it will necessarily be a whitespace character, else it would have been scanned by the preceding %s directive. We might therefore speculate that the author's idea was to consume a trailing newline.
There are at least two problems with that:
The next character might not be a newline. For example, there might be trailing space characters before a newline, in which case the first of those space characters would be consumed, but the newline would remain in the stream. If that's a genuine problem then %*c does not reliably solve it.
In practice, it's not very useful. Most scanf directives are like %s in that they automatically skip leading whitespace, including newlines. The %*c serves only to confuse if the next directive that will be processed is any of those. Moreover, it is possible for a scanf format to explicitly express that a run of whitespace at a given position should be skipped, and it is clearer to make use of that in conjunction with the next directive to be processed if that next directive is one of those that don't automatically skip whitespace (and whitespace skipping is in fact desired).
That doesn't mean that assignment suppression generally or %*c specifically is useless, mind. It's just trying to use that technique to attempt to consume trailing newlines that is poorly conceived.
The %* format specifier in a scanf call instructs the function to read data in the following format (c in your case) from the input buffer but not to store it anywhere (i.e. discard it).
In your specific case, the %*c is being used to read and discard the trailing newline character (added when the user hits the Enter key), which will otherwise remain in the input buffer, and likely upset any subsequent calls to scanf.

Confused about getchar() and fflush(stdin) behavior

I thought getchar() or fflush(stdin) was used to take the newline or space left by the previous input, because the gets() function is unable to differentiate between that newline and the input we provide. We did not need those when we use %s on the scanf function. Why do we need it when we use %c or %d on the same function?
The fact is that the C standard does not define the behavior of fflush() on input streams; a few implementations define it as an extension, but it is not portable. Also, BSD systems provide fpurge(), and glibc provides __fpurge() in <stdio_ext.h>, which discards the data buffered for a stream.
When using scanf(), the rule is that %s stops at the first whitespace character, as classified by isspace(), and most conversions skip leading whitespace. The exception is %c, which reads whitespace like any other character unless the format has whitespace in front of it, in which case all leading whitespace is skipped first.
Also, the gets() function is dangerous and was removed from the C standard in C11, so you should not use it or refer to it as an example.
The fgets() function, which is a better version of gets(), stores the final '\n' if there is room for it, i.e. if the line is shorter than the size given as its second parameter, so there is nothing left to flush after it.
You should also consider that scanf() leaves not only the final '\n' in the input buffer but any other whitespace after the converted item too. So a single getchar() is usually not enough; you should getchar() as many whitespace characters as were left there to get the apparent behavior of fflush(stdin).

Force fscanf to Consume Possible Whitespace

I have a multiline TSV file with the following format:
Type\tBasic Name\tAttribute\tA Long Description\n
As you can see, the Basic Name and the Description can both contain some number of spaces. I am trying to read each line in and extract the elements. For now, I've narrowed it down to just extracting the basic name. My fscanf is as follows:
fscanf(file_in, "%*[^ ]s\t%128[^ ]s\t%*[^ ]s\t%[^ ]s\n", name_string, desc_string);
This doesn't work as I have hoped, and I'm having trouble narrowing down the error. Does anyone know how I could read in the lines properly?
I mostly agree with Pablo (that the scanf family don't make great parsers), but it's worth understanding how to write a scanf pattern. The pattern you're looking for is something like this:
fscanf(file_in, " %*[^\t] %128[^\t] %*[^\t] %128[^\n]", name_string, desc_string);
Notes:
%[xyz] is a directive. %[xyz]s is two directives, the second of which matches a literal s
As far as I know, there is no way to match a single literal tab character, since any whitespace in the pattern matches any amount of whitespace (including none) in the input. I used a space in my example, which will match a terminating tab, but it will also match any number of consecutive tabs, so empty fields won't be parsed correctly.
The 128-character limit does not include the terminating NUL character, so each buffer needs room for at least 129 bytes.
Also, if the scan stops because the character limit is exceeded, it won't skip the rest of the field automatically, so you'll end up out of sync with the input.
A better pattern would be:
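fscanf(file_in, " %*[^\t] %128[^\t]%*[^\t] %*[^\t] %128[^\n]%*[^\n]", name_string, desc_string);
which explicitly skips the remaining characters in the field, if necessary. Note, though, that a %[ directive fails when it matches no characters, so the scan stops at the first skip directive whenever the field fits within the limit; to be reliable, the skipping has to be done in a separate call. An even better solution would be to use the m modifier (POSIX.1-2008; older glibc versions used a for this) and get fscanf to malloc memory for you.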
I'd rather use strtok for this. It's more accurate than fscanf, since the scanf family of functions only works when the input matches the format exactly; otherwise you end up missing values.
Take a look at Parallel to PHP's "explode" in C: Split char* into char* using delimiter, where I explain in more detail how to use strtok.
So, read each line with fgets and parse it with strtok.
Firstly, as it has already been noted, the %[] is a conversion specifier by itself. There's no s after the []. The s-es that you have in your format string will not be considered parts of the conversion specifiers. You have to get rid of those s-es.
Secondly, as you said yourself, your file is TAB-separated. Which immediately means that you should extract the continuous portions of the sequence by using the %[^\t] conversion specifier (or the %[^\n] specifier for the last portion). Why did you use %[^ ] and how did you expect it to work? The %[^ ] actually stops parsing at space character, which is the opposite of what you wanted.
In your example the proper combination of specifiers would be
fscanf(file_in, "%*[^\t]\t%128[^\t]\t%*[^\t]\t%[^\n]\n", name_string, desc_string);
This format string assumes that all 4 portions of the string are guaranteed to be present and that the last portion is guaranteed to be terminated by \n.

fscanf usage in C

I have a file like this:
10 15
something
I want to read this into three variables, let's say number1, number2, and mystring. I have doubts about what kind of pattern to give to fscanf. I am thinking something like this:
fscanf(fp,"%i %i\n%s",number1,number2,mystring);
Should this work, and also, is this the correct way of reading this file? If not, what would you suggest?
fscanf(fp,"%i %i\n%s",&number1,&number2,mystring);
fscanf takes pointers.
Read each line with fgets (or getline if you have it), split up the line with strsep (better, if available) or strtok_r (more awkward API but more portable), and then use strtoul to convert strings to numbers as necessary.
*scanf should never be used, because:
Some format strings (e.g. a bare "%s") are just as eager to overflow your buffers as gets is.
Behavior on integer overflow is undefined -- invalid input can potentially crash your program.
They do not report the character position of the first scan error, making it nigh-impossible to recover from a parse error. (This can be somewhat mitigated by using fgets and then sscanf instead of fscanf.)
Besides the problem with pointers, generally using spaces in the scanf format is a mistake -- in most cases scanf skips whitespace automatically. So I would use something like:
int number1, number2;
char mystring[32];
fscanf(fp, "%i%i%31s", &number1, &number2, mystring);
This will read two numbers followed by a string of up to 31 non-whitespace characters, all separated by any whitespace. Note that "whitespace" includes spaces, tabs, and newlines, so it doesn't matter if it's all on one line, or spread out over 3 lines or anything in between.
Note also the limit on the size of the string. Without it, the input might overflow any fixed-size buffer you provide (and standard scanf has no way to take a variable-sized buffer).

Resources