What is really happening on removing and adding white space characters? - c

I'm new to programming and learning the basics of C. I'm learning how scanf() works, but I'm now very confused and don't really know how or what to ask. I will try my best to make my question clear.
Questions
I'm not able to understand the whole concept of whitespace: when is it skipped by scanf and when is it not? And the biggest question: how is it skipped?
Along with the whitespace concept, I also don't understand how the scanf function works. I have read about it in many books and websites, and also on this site, but that confused me more, since each person explains the concept in their own way and the explanations vary from one to another.
Have a look at this short program:
#include<stdio.h>
int main()
{
int num;
char ch;
printf("enter the value of num and ch:\n");
scanf("%d",&num);
scanf("%c",&ch);
printf("num = %d and ch = %c",num,ch);
return 0;
}
I know that in this program the user will be allowed to enter the value of num only, because the newline character stays behind in the input buffer and the next scanf reads that newline character. This can be solved by adding a space before %c in the second scanf call.
But when I replace the char ch variable with int ch, scanf skips the new line. Why?
Why does scanf not skip non-whitespace characters (for example a, b, c, d, #, etc.) the way it skips whitespace?
What is the difference between a space and a newline character in scanf? I mean, there will be some exceptions, right?

First Question
I mean when they are skip by the scanf and when they are not
White-space characters are skipped unless the format specifier is %c, %n or %[. Relevant quote from the C11 standard:
7.21.6.2 The fscanf function
[...]
Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier.
How they are skipped?
Just read and discard them.
Second Question
I'm not able to understand the working of scanf function also?
scanf is a variadic function, meaning that it takes a variable number of arguments, with a minimum of one. scanf parses its first argument, the format string, and reads input according to the directives in it.
Third Question
But when I replace the char ch variable with int ch, scanf skips the new line. Why?
First part of the first answer explains it. %d will skip whitespace characters.
Fourth Question
Why scanf do not skip non-white space character just like whitespace?
For some conversion specifiers, like %c, non-whitespace characters are valid input, so it would make no sense to skip them. For others, like %d, characters that cannot be part of a number (such as letters) are invalid input. scanf stops scanning and returns when it sees invalid input, leaving the offending character unread. It is designed this way.
Fifth Question
What is the difference between space and newline character in scanf?
There is no difference when either of them is placed in the format string of scanf. Both are considered whitespace characters, although they are different characters. When used in the format string, each of them skips any number of whitespace characters in the input, including none, up to the first non-whitespace character. Relevant quote from the C11 standard:
7.21.6.2 The fscanf function
[...]
A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read. The directive never fails.

Related

Why is this creating two inputs instead of one

https://i.imgur.com/FLxF9sP.png
As shown in the link above I have to input '<' twice instead of once, why is that? Also it seems that the first input is ignored but the second '<' is the one the program recognizes.
The same thing occurs even without a loop too.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
int randomGen, upper, lower, end, newRandomGen;
char answer;
upper = 100;
lower = 1;
end = 1;
do {
srand(time(0));
randomGen = rand()%(upper + lower);
printf("%d\n", randomGen);
scanf("%s\n", &answer);
}while(answer != '=');
}
Whitespace in scanf format strings, like the \n in "%c\n", tries to match any amount of whitespace, and scanf doesn’t know that there’s no whitespace left to skip until it encounters something that isn’t whitespace (like the second character you type) or the end of input. You provide it with =\n, which fills in the %c and waits until the whitespace is over. Then you provide it with another = and scanf returns. The second time around, the character could be anything and it’d still work.
Skip leading whitespace instead (and use the correct specifier for one character, %c, as has been mentioned):
scanf(" %c", &answer);
Also, it’s good practice to make sure you actually succeeded in reading something, especially when failing to read something means leaving it uninitialized and trying to read it later (another example of undefined behaviour). So check scanf’s return value, which should match the number of conversion specifiers you provided:
if (scanf(" %c", &answer) != 1) {
return EXIT_FAILURE;
}
As has been commented, you should not use the scanf format %s if you want to read a single character. Indeed, you should never use a bare %s for any purpose, because it will read an arbitrary number of characters into the buffer you supply, so you have no way to ensure that your buffer is large enough. Always supply a maximum character count. For example, %1s will read only one character. But note: that will still not work with a char variable, since it reads a string, and in C, strings are arrays of char terminated with a NUL. (NUL is the character whose value is 0, also sometimes spelled \0. You could just write it as 0, but don't confuse that with the character '0', whose value is 48 in most modern systems.)
So a string containing a single character actually occupies two bytes: the character itself, and a NUL.
If you just want to read a single character, you could use the format %c. %c has a few differences from %s, and you need to be aware of all of them:
The default maximum length read by %s is "unlimited". The default for %c is 1, so %c is identical to %1c.
%s will put a NUL at the end of the characters read (which you need to leave space for), so the result is a C string. %c does not add the NUL, so you only need to leave enough space for the characters themselves.
%s skips whitespace before storing any characters. %c does not ignore whitespace. Note: a newline character (at the end of each line) is considered whitespace.
So, based on the first two rules, you could use either of the following:
char theShortString[2];
scanf("%1s", theShortString);
char theChar = theShortString[0];
or
char theChar;
scanf("%c", &theChar);
Now, when you used
scanf("%s", &theChar);
you will cause scanf to write a NUL (that is, a zero) in the byte following theChar, which quite possibly is part of a different variable. That's really bad. Don't do that. Ever. Even if you get away with it today, it will get you into serious trouble some time soon.
But that's not the problem here. The problem here is with what comes after the %s format code.
Let's take a minute (ok, maybe half an hour) to read the documentation of scanf, by typing man scanf. What we'll see, quite near the beginning, is: (emphasis added)
A directive is one of the following:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
So when you use "%s\n", scanf will do the following:
skip over any white-space characters in the input buffer.
read the following word up to but not including the next white-space character, and store it in the corresponding argument, followed by a NUL.
skip over any white-space following the word which it just read.
It does the last step because \n — a newline — is itself white-space, as noted in the quote from the manpage.
Now, what you actually typed was < followed by a newline, so the word read at step 2 will be just the character <. The newline you typed afterwards is white-space, so it will be ignored by step 3. But that doesn't satisfy step 3, because scanf (as documented) will ignore "any amount of white space". It doesn't know that there isn't more white space coming. You might, for example, be intending to type a blank line (that is, just a newline), in which case scanf must skip over that newline as well. So scanf keeps on reading.
Since the input buffer is now empty, the I/O library must now read the next line, which it does. And now you type another < followed by a newline. Clearly, the < is not white-space, so scanf leaves it in the input buffer and returns, knowing that it has done its duty.
Your program then checks the word read by scanf and realises that it is not an =. So it loops again, and the scanf executes again. Now there is already data in the input buffer (the second < which you typed), so scanf can immediately store that word. But it will again try to skip "any amount of white space" afterwards, which by the same logic as above will cause it to read a third line of input, which it leaves in the input buffer.
The end result is that you always need to type the next line before the previous line is passed back to your program. Obviously that's not what you want.
So what's the solution? Simple. Don't put a \n at the end of your format string.
Of course, you do want to skip that newline character. But you don't need to skip it until the next call to scanf. If you used a %1s format code, scanf would automatically skip white-space before returning input, but as we've seen above, %c is far simpler if you only want to read a single character. Since %c does not skip white-space before returning input, you need to insert an explicit directive to do so: a white-space character. It's usual to use an actual space rather than a newline for this purpose, so we would normally write this loop as:
char answer;
srand(time(0)); /* Only call srand once, at the beginning of the program */
do {
randomGen = rand()%(upper + lower); /* This is not right */
printf("%d\n", randomGen);
scanf(" %c", &answer);
} while (answer != '=');
scanf("%s\n", &answer);
Here you used the %s conversion specifier, which tells scanf to read a sequence of non-whitespace characters into a pre-allocated array of chars, followed by a null terminator to make it a C string.
However, answer is a single char. Just writing the terminator is enough to go out of bounds, causing undefined behaviour and strange mishaps.
Instead, you should have used %c. This reads a single character into a char.

Why don't you need a getchar() while getting integer as an input? [duplicate]

int main()
{
char a,b;
scanf("%c",&a);
getchar();
scanf ("%c",&b);
}
If you don't use getchar() between the two character reads, the second scanf takes the leftover whitespace as its input. But for integers you don't need getchar():
int main()
{
int a,b;
scanf("%d",&a);
scanf ("%d",&b);
}
Why don't you need a getchar() while getting integer as an input?
Interpreting your question in terms of the differences between the two provided examples, you seem to be asking about the difference between scanf's processing of %c directives and its processing of %d directives. At its simplest, the explanation is that scanf's specifications say that when it attempts to match a %d directive, it must skip any leading whitespace. Which, by the way, may comprise any number of characters, and which recognizes more characters than just ASCII 0x20 as whitespace.
The %c is actually the oddball here. Of all the scanf directives that match and convert input, it is one of only two that don't skip leading whitespace. This makes sense, because it allows scanf() to read space characters as input, and because you can instruct it to match (and therefore skip) leading whitespace by inserting a space character into the format string immediately before a %c (or any other) directive. Or you can read and ignore any single character, as your getchar() actually does, by inserting an additional %*c directive into the format.
The whitespace skipping performed for most other directives is a convenience catering to fixed-format tabular data, which may have varying amounts of space between individual items.
Why don't you need a getchar() while getting integer as an input?
"%d" skips leading white-space - including the left-over enter of the prior scanf("%c",&a);.
"%c" does not.
Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier.
C11dr §7.21.6.2 8

Why scanf() reads two values when we put spaces after format specifier even we are passing one reference

#include<stdio.h>
int main(void)
{
int a;
printf("Enter a value \n");
scanf("%d ",&a);
printf("a=%d \n",a);
return 0;
}
In the scanf() call I put a space after the format specifier. When I run this program, scanf() reads two values from the user and only the first value is assigned to 'a'.
Why does scanf() read two values when we use spaces after format specifier even though we are passing one reference in above program?
Why does scanf() read one value when we use space before format specifier even though we are passing one reference?
How will the scanf() function work?
1) scanf needs a pointer to the variable it fills in, and &a already is one, so int a; does not need to change. 2) When you use %d, scanf only expects an optionally signed decimal integer; a space is not part of the number. 3) I think point 2 answers this too.
Same question has been asked on Quora
A space in a scanf format string matches an arbitrary amount of
whitespace in the input.
To match all of that arbitrary amount of input space, it has to see
something to terminate the whitespace--i.e., something other than
whitespace (and a new-line is whitespace, so it isn't sufficient).
That's why you need to enter a second value, which must not be whitespace, to terminate the scanf().
White space includes space, tabs, or newlines.
Hope my answer here helps!
Error While taking the input
Avoid putting trailing spaces or tabs in a scanf format string unless you intend to; they are only required in some scenarios.
Why this behaviour of scanf?
scanf treats spaces, newlines, and tabs as whitespace, which delimits input items. A conversion such as %d stops reading as soon as it meets one of them after the number.
Since you put whitespace in the format string after %d, scanf keeps consuming whitespace after the integer (including the newline from Enter), and it only returns once a non-whitespace character is typed or the input ends.

scanf() behaviour for strings with more than one word

Well I've been programming in C for quite a while now, and there is this question about the function scanf()
here is my problem:
I know that every element in the ASCII table is a character, and I know that %s is the format specifier for a string, which is a collection of characters.
My questions:
1. Why does scanf() stop scanning after we press Enter? If Enter is also a character, why can't it be added as a component of the string that is being scanned?
2. My second question, and the one I care about most: why does it stop scanning after a space, when a space is also a character?
Note: My question is not about how to avoid this, but about how it happens.
I'd be happy if this is already addressed, I'd gladly delete my question and even if I've presumed something wrong please let me know
"why does scanf() stops scanning after we press enter." is not always true.
The "%s" directs scanf() as follows
char buffer[100];
scanf("%s", buffer);
Scan and consume all white-space including '\n' generated from multiple Enters. This data is not saved.
Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier C11dr §7.21.6.2 8
Scan and save all non-white-space characters. Continue doing so until a white-space is encountered.
Matches a sequence of non-white-space characters §7.21.6.2 12
This white-space is put back into stdin for the next input function. (OP's 2nd question)
A null character is appended to buffer.
Operations may stop short if EOF occurs.
If more data is saved than buffer can hold, it is UB.
If some non-white-space data is saved, return 1. If EOF encountered, return EOF.
Note: stdin is usually line buffered, so no keyboard data is given to stdin until a '\n' occurs.
From my reading of your question, both of your numbered questions are the same:
Why does scanf with a format specifier of %s stop reading after encountering a space or newline.
And the answer to both of your questions is: Because that is what scanf with the %s format specifier is documented to do.
From the documentation:
%s Matches a sequence of bytes that are not white-space characters.
A space and a newline character (generated by the enter key) are white-space characters.
I made a mini program with scanf to get multiple names without stopping at a space or even at Enter.
I used a while loop:
scanf("%s",text);
while (1)
{
scanf("%s",text1);
if (strcmp(text1,".")==0){break;} /* needs <string.h> */
//here I simply append text1 to text
}
This way I get one line, ended by typing a lone "." as the last word.
Now I use
scanf("%[^\n]",text);
It works great.

understanding scanf syntax

I have the following scanf code that I am unable to understand:
char board[3][3];
int i;
for(i=0;i<3;i++)
scanf("%s[^\n]%*c", board[i]);
Please help me understand, piece by piece, what the letters in the scanf format mean.
Thank you.
Read a sequence of non-whitespace characters, then the literal characters "[^", a newline (which in a format string matches any amount of whitespace), and "]", then one more character which is not stored anywhere. I don't think this is what is actually needed. You can read the scanf manpage (google it) for the correct syntax.
Explanation:
%s - capture a sequence of non-whitespace characters
%[ - capture a sequence of characters determined by set (ending with ']')
That's why %s[^\n] seems wrong to me. Should be %[^\n] instead.
Afaik,
What this does is, for 3 times (inside for loop), reads a line (with %s) till it encounters a newline char (with [^\n]) and discards the last (newline) char (with %*c).
%*c
Here, "*" will tell scanf to not store the value caught by "c". i.e. the newline char.

Resources