C: difference in behaviour of scanf and getchar

I wanted to write a function in C to read characters until a newline is encountered. I wrote the following code using scanf and getchar:
Code using scanf:
while(scanf("%c",&x)!=EOF&&x!='\n'){....}
Code using getchar:
while(((x=getchar())!=EOF)&&x!='\n'){....}
int x is a local variable declared inside the function. The second code stops after reading a word (e.g. "ADAM\n"), while the scanf code does not break out of the loop and keeps on waiting.
Later I found that after scanf, x's value was (2^7-1)*(2^8) + the ASCII value of the character read (= 32522 for newline), while the character constant '\n' was 10. So the comparison was failing.
My question is: why does scanf assign a value > 32000 to x after reading '\n', while getchar assigns the value 10 (which matches the character constant '\n')?

The key difference here is in the behaviour of scanf:
1) In general, scanf is used to read different data types (not only char). For example, scanf("%d",&num) reads an integer and skips any leading white-space characters (' ' (space), '\t' (tab) and '\n' (newline)).
2) scanf("%c",&x), as well as scanf("%d",&num) when a number was entered, returns 1: the number of items successfully read from stdin. Note: scanf("%d",&num) returns 0 if the next input in stdin is not a number.

The main difference is that scanf skips white-space characters in the input stream for most conversions (although not for %c), while getchar returns every character. Also, the return value of scanf is the number of successfully converted items.
You need to check for scanf(...) == 1 to see whether the variable contains a valid value. When scanf does not convert all of its arguments, the unconverted variables keep whatever value they had before. The strange value of x in your case has a related cause: %c expects a pointer to char and stores a single byte, so only one byte of the int x receives the character (10 for newline) and the remaining bytes still hold some (more or less) random data left over in the memory location the compiler assigned to x. Strictly speaking, passing a pointer to int for %c is undefined behaviour; declare x as char for the scanf version.
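A minimal sketch of both loops with matching types, for comparison. The helper names read_line_scanf and read_line_getc are made up for this illustration, and they take a FILE * (an assumption, so that they can be tried on something other than stdin); pass stdin to get the behaviour asked about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: reads up to a newline using fscanf("%c").
   The target of %c is a char, and the return value is compared with 1. */
size_t read_line_scanf(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    char c;

    assert(in != NULL && buf != NULL && size > 0);
    while (fscanf(in, "%c", &c) == 1 && c != '\n') {
        if (n + 1 < size)
            buf[n++] = c;        /* characters that do not fit are dropped */
    }
    buf[n] = '\0';
    return n;
}

/* Hypothetical helper: the same loop with getc, which returns an int
   so that EOF can be told apart from every valid character. */
size_t read_line_getc(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    assert(in != NULL && buf != NULL && size > 0);
    while ((c = getc(in)) != EOF && c != '\n') {
        if (n + 1 < size)
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```

With the types matched like this, both loops should stop at the newline after "ADAM".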

Related

Scanf stores wrong value inside integer variable

I have written the following code in C (Standard 89):
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int cc,dd;
    scanf("%d/%d",&cc,&dd);
    int ll;
    scanf("%d",&ll);
    printf("Value of ll is: %d",ll);
    return 0;
}
If I submit the following as an input in one line: 4/5h I get the following output: Value of ll is: 67
So I have 2 questions:
1) Where did that value 67 come from? (I tried changing the input to something like 1/2t but got the same result.)
According to what I have read, since there are no integers in the buffer, the application should wait until one is available (for example, wait for new input).
2) When I run my code in debug mode I can see that the value of ll is 65, not 67!
By typing non-digit characters in entries like "5h" or "2t" for dd, you're fouling up the read for ll in the second scanf call.
%d tells scanf to skip any leading whitespace, then to read decimal digit characters up to the first non-digit character. If you type a string like "5h" or "2t", that leading digit will be successfully converted and assigned to dd, but the trailing non-digit character will be left in the input stream, and that's fouling up the read for ll. No new value is being read into ll, you're getting whatever indeterminate value it had when the program started up.
Always check the result of scanf (and fscanf and sscanf) - if it's less than the number of inputs you expect, then you have a matching failure (you're not handling some input correctly). If it's EOF, then you have a failure on the input stream itself.
For this particular case, you can work around the problem by checking the result of scanf - if it's 0, then there's a bad character in the stream. Throw it away and try again:
int r;
while ( ( r = scanf( "%d", &ll ) ) != 1 && r != EOF )
    getchar();
This will call scanf and try to read a value into ll. We expect scanf to return a 1 on a successful input, so we'll loop while the result of scanf isn't 1 (and isn't EOF, either). If the read isn't successful, we assume there's a non-digit character stuck in the input stream, so we read and discard it with the getchar call.
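The matching failure itself can be reproduced without a terminal by running the same conversions over a string with sscanf. This is only a sketch: parse_after_fraction is a hypothetical helper, and %n is used to find out where the first call stopped, which the original program does not do.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses "cc/dd" from `line`, then tries to read a
   third integer from whatever is left over. Returns the result of the
   second sscanf call: 1 on success, 0 on a matching failure, EOF if
   nothing is left to read. *ll is only written on success. */
int parse_after_fraction(const char *line, int *ll)
{
    int cc, dd;
    int used = 0;

    assert(line != NULL && ll != NULL);
    /* %n records how many characters were consumed so far; it does not
       count towards the return value. */
    if (sscanf(line, "%d/%d%n", &cc, &dd, &used) != 2)
        return 0;
    return sscanf(line + used, "%d", ll);
}
```

For the input "4/5h" the second call sees only "h", returns 0 and leaves *ll untouched, which is why the program prints whatever ll happened to contain.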

Understanding scanf behaviour

I am very new to C programming. My teacher gave us this code to find the maximum of n numbers. When I do as he says, everything works: I type one number when the prompt "Type the number of numbers" appears, and then type the numbers in a row, like 7 8 9 10, when "Type the numbers" appears.
#include <stdio.h>
main()
{
    int n, max, number, i;
    printf("Type the number of numbers");
    scanf("%d", &n);
    if(n>0)
    {
        printf("Type the numbers");
        scanf("%d",&number);
        max=number;
        for(i=1; i<n; i++)
        {
            scanf("%d", &number);
            if(number>max)
                max=number;
        }
        printf("MAX=%d \n", max);
    }
}
But suppose I type 5 8 9 10 7 6 all at once. Then the program interprets it like this:
It sets n = 5, then number = 8, then the loop runs and number changes to 9, then to 10, and so on up to 6, and then it gives the max.
So how is scanf working here? Does it take the numbers one at a time even though they are written in a row separated by spaces?
From the horse's mouth:
7.21.6.2 The fscanf function
...7 A directive that is a conversion specification defines a set of matching input sequences, as
described below for each specifier. A conversion specification is executed in the
following steps:
8 Input white-space characters (as specified by the isspace function) are skipped, unless
the specification includes a [, c, or n specifier.284)
9 An input item is read from the stream, unless the specification includes an n specifier. An
input item is defined as the longest sequence of input characters which does not exceed
any specified field width and which is, or is a prefix of, a matching input sequence.285)
The first character, if any, after the input item remains unread. If the length of the input
item is zero, the execution of the directive fails; this condition is a matching failure unless
end-of-file, an encoding error, or a read error prevented input from the stream, in which
case it is an input failure.
10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
...
12 The conversion specifiers and their meanings are:
d Matches an optionally signed decimal integer, whose format is the same as
expected for the subject sequence of the strtol function with the value 10 for
the base argument. The corresponding argument shall be a pointer to
signed integer.
...
284) These white-space characters are not counted against a specified field width.
285) fscanf pushes back at most one input character onto the input stream. Therefore, some sequences
that are acceptable to strtod, strtol, etc., are unacceptable to fscanf.
The processing for scanf is exactly the same; the only difference is that scanf always reads from standard input.
Examples:
Suppose you type three spaces, then 123, then Enter in response to the first prompt; the input stream then contains the sequence {' ', ' ', ' ', '1', '2', '3', '\n'}. When you call scanf( "%d", &n );, scanf reads and discards the leading blank spaces, then reads and matches the sequence {'1', '2', '3'}, converts it to the integer value 123, and assigns the result to n. Since there was a successful conversion and assignment, scanf returns 1.
If the input stream contains the sequence {' ', ' ', ' ', '1', '2', '.', '3', '\n'}, scanf reads and discards the leading blanks, then reads and matches the sequence {'1', '2'}, converts it to the integer value 12, and assigns the result to n. The input stream will still contain {'.', '3', '\n'}. Since there was a successful conversion and assignment, scanf will return 1.
If the input stream contains the sequence {'.', '3', '\n'}, then there is no matching sequence of characters ('.' is not a valid character in a decimal integer). scanf will leave the . unread and leave the value of n unchanged. Since there was not a successful conversion and assignment, scanf returns 0 to indicate a matching failure.
If an end-of-file is signaled on the input stream before any matching characters have been read, or if there's some other input error, scanf does not assign any new value to n and returns EOF to indicate an input failure.
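The four cases above can be checked with sscanf, which applies the same rules to a string instead of a stream. A minimal sketch; try_int is a hypothetical name chosen for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: applies "%d" to `input` and returns what sscanf
   returns: 1 after a successful conversion, 0 on a matching failure,
   EOF on an input failure. *n is only modified when the result is 1. */
int try_int(const char *input, int *n)
{
    assert(input != NULL && n != NULL);
    return sscanf(input, "%d", n);
}
```

Under these rules, try_int("   123\n", &n) returns 1 and stores 123, try_int("   12.3\n", &n) returns 1 and stores 12, try_int(".3\n", &n) returns 0 and leaves n alone, and try_int("", &n) returns EOF.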
The "%d" in scanf("%d",&number); has 3 stages of scanning user text input into an int:
1) 0 or more leading white-space characters, like ' ', '\n', '\t' and some others, are read and discarded.
2) Numeric text like "123", "-123", "+123" is read until a non-numeric character is read (or end-of-file, or a rare input error).
3) That non-numeric character is put back into stdin for subsequent input calls.
If step 2 is successful in reading at least 1 digit, the function returns 1. Good code checks the returned value.
if (scanf("%d",&number) != 1) Handle_UnexpectedInput();
The important thing is that '\n' is not so special with scanf("%d",&number);. It acts as a separator, like any other white-space or non-numeric text.
'\n' does cause the buffered stdin to accept the line of user input for processing by the various scanf() calls.
Here is a simplified explanation (not overly simplified, I hope) of how scanf works from a user's point of view:
The arguments are divided into two parts:
The first part is a “format string”. The string is made of at least one format specifier. In its simplest form a specifier begins with % and is followed by a letter that specifies the type of variable you’re expecting (“%d”: I’m expecting an integer). The number and types of the specifiers must match the number and types of the arguments in the second part.
The second part is made of one or more addresses of memory locations where the data you input will be stored. The pointed-to types must match the specifiers.
When called, the function repeats the following steps, starting with the first specifier and the first pointer, until the end of the format string is reached:
1) Read and discard any white space until a non-white-space character is found (white space: space, tab, NL, at least);
2) Read characters up to the first white space or the first character that does not match the expected input for the current specifier;
3) Convert them to the type of the current specifier; and
4) Store the result in the location pointed to by the current pointer.
There are three typical beginner mistakes which will result in undefined behavior (a crash, most likely):
1) You forget the address-of operator &.
2) The specifier and the type do not match.
3) The number of specifiers does not match the number of pointers.
int d;
scanf( "%d", d ); // no &
scanf_s( "%s", &d ); // s do not match int
scanf_s( "%d%d", &d ); // too many specifiers
When you type on the keyboard, you are filling an input buffer. scanf reads consecutive characters until it hits white space, so with the input "1234 43" and the call scanf("%d", &number) you are saying "read one number"; 1234 is one number, so that is what it reads.
But if you have a loop that executes that scanf again, the number "43" is still in the input buffer, and scanf will read it without stopping to wait.
The manual for scanf doesn't explain that, and it's a bit confusing for a newcomer to understand why the application does not stop there to read a new number.
Let me explain as simply as possible...
scanf reads bytes from the standard input stream stdin.
Let's say the input you give is "23 67 21 99\n" (The \n is from when you pressed Enter).
Then each subsequent call to scanf starts reading from this input buffer and interprets what it sees according to the format you give it ("%d", etc.), with inputs separated by white-space characters: a newline, a space, a tab, etc.
While there are still bytes to be read, scanf will not wait for you to input. That is what is happening here.
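A sketch of the same program as a function that reads from any stream, so the "numbers in a row" case can be tried on a file as well as on stdin. The name max_from_stream and the FILE * parameter are assumptions made for this example; the return-value checks are an addition to the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the program above: reads a count, then
   that many integers, and stores the largest in *max.
   Returns 1 on success, 0 if the input ran out or was malformed. */
int max_from_stream(FILE *in, int *max)
{
    int n, number, i;

    assert(in != NULL && max != NULL);
    if (fscanf(in, "%d", &n) != 1 || n <= 0)
        return 0;
    if (fscanf(in, "%d", &number) != 1)
        return 0;
    *max = number;
    for (i = 1; i < n; i++) {
        /* Each call resumes where the previous one stopped; spaces and
           newlines between the numbers are skipped alike. */
        if (fscanf(in, "%d", &number) != 1)
            return 0;
        if (number > *max)
            *max = number;
    }
    return 1;
}
```

Given the input "5 8 9 10 7 6\n", the first call takes 5 as n and the following five calls pick up 8, 9, 10, 7 and 6 from the same line without waiting, so the stored maximum is 10.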
Let's keep it simple. I assume you don't know anything about buffers or stdin.
scanf is used to take input from the user. Whenever you type a number followed by a space or Enter, the number is passed to the program for further use. Writing scanf("%d",&n); means: take an integer from the user and store it at the address of the variable n.

scanf in a while loop reads on first iteration only

NOTE: Please notice this is not a duplicate of Why is scanf() causing infinite loop in this code? , I've already seen that question but the issue there is that he checks for ==0 instead of !=EOF. Also, his problem is different, the "infinite loop" there still waits for user input, it just does not exit.
I have the following while loop:
while ((read = scanf(" (%d,%d)\n", &src, &dst)) != EOF) {
    if(read != 2 ||
       src >= N || src < 0 ||
       dst >= N || dst < 0) {
        printf("invalid input, should be (N,N)");
    } else
        matrix[src][dst] = 1;
}
The intention of which is to read input in the format (int,int), to stop reading when EOF is read, and to try again if an invalid input is received.
The problem is that scanf works only for the first iteration; after that there is an infinite loop. The program does not wait for user input, it just keeps assuming that the last input is the same.
read, src, and dst are of type int.
I have looked at similar questions, but they seem to fail because they check whether scanf returns 0 instead of checking for EOF, and the answers tell them to switch to EOF.
You need to use
int c;
while((c=getchar()) != '\n' && c != EOF);
at the end of the while loop in order to clear/flush the standard input stream(stdin). Why? The answer can be seen below:
The scanf with the format string " (%d,%d)\n" requires the user to type:
1) An opening bracket: (
2) A number (for the first %d)
3) A comma: ,
4) A number (for the last %d)
5) A closing bracket: )
The space (the first character of the format string) and the newline character (\n, the last character of the format string) are both white-space characters. Let's see what the C11 standard has to say about white-space characters in the format string of fscanf (I say fscanf because scanf is equivalent to fscanf with stdin as the first argument):
7.21.6.2 The fscanf function
[...]
A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read. The directive never fails
So, any white-space character in the format string skips/discards all white-space characters in the input, if any, up to the first non-white-space character, as seen in the quote above. This means that the space at the start of your format string discards all leading white space up to the first non-white-space character, and the \n at the end does the same.
When you enter the right data as per the format string, the scanf call does not return. This is because the \n directive has not yet found a non-white-space character in stdin and will stop scanning only when it finds one. So, you have to remove the \n from the format string.
The next problem arises when the user types something that does not match the format string. When this happens, scanf fails and returns. The data that caused the failure remains in stdin, and it is seen by scanf again the next time it is called. That call can fail too, and so on, which causes the infinite loop.
To fix it, you have to clear stdin in each iteration of the while loop using the method shown above.
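Putting both fixes together, the loop might look like the sketch below. It is an illustration under assumptions rather than the asker's actual program: count_pairs, limit (standing in for N) and bad are hypothetical names, the function reads from a FILE * instead of stdin, and it counts pairs instead of filling matrix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads "(src,dst)" pairs from `in` until end of
   file, one per line. Returns the number of well-formed pairs and stores
   the number of rejected lines in *bad. */
int count_pairs(FILE *in, int limit, int *bad)
{
    int src, dst, read, c, good = 0;

    assert(in != NULL && bad != NULL);
    *bad = 0;
    /* No trailing "\n" in the format: it would make fscanf keep reading
       until the next non-white-space character shows up. */
    while ((read = fscanf(in, " (%d,%d)", &src, &dst)) != EOF) {
        if (read == 2 && src >= 0 && src < limit && dst >= 0 && dst < limit)
            good++;
        else
            (*bad)++;
        /* Discard the rest of the line so that a bad character cannot be
           seen again by the next call. */
        while ((c = getc(in)) != '\n' && c != EOF)
            ;
    }
    return good;
}
```

Because the rest of the line is thrown away after every attempt, a line such as "foo" is reported once and then skipped instead of being rescanned forever.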
scanf waits for the user to provide some input. Assuming the user does what's expected of them, they will type some digits, and they will hit the Enter key.
The digits will be stored in the input buffer, but so will a newline character, which was added when they hit the Enter key.
scanf will parse the digits to produce an integer, which it stores in the src variable. It stops at the newline character, which remains in the input buffer.
Later, a second scanf call looks for a newline character in the input buffer. It finds one immediately, so it doesn't need to wait for any more input from the user.

C fscanf reports EOF if parsed for a digit but not for a string

I have the following line of text left in a file that I wish to parse one digit by one digit:
10001111001000101001010001100000101110000102
however when I use the format specifier %d fscanf returns an EOF indicator and not a digit. This is contrasted to when I use %s where it returns what I expect though it does it all at once and in a string.
So when I change (as well as the type of character from int to char[250])
if (fscanf(f, "%d", &character) != EOF)
to
if (fscanf(f, "%s", &character) != EOF)
If you want to scan digit by digit, you need to use "%1d" as the format.
Otherwise, it reads the whole line (apart from the newline) as the number and invokes undefined behaviour when it converts the large decimal number into a 32-bit int. Note that the mnemonic for d is not 'digit' but 'decimal', as opposed to 'o' for octal, 'x' for hexadecimal, and 'i' for integer in decimal, octal or hexadecimal according to the normal prefixes (leading 0 for octal, 0x for hex, or decimal otherwise).
It is still not entirely clear why it returns EOF, but with undefined behaviour, any response is valid.
The standard (ISO/IEC 9899:2011 Section 7.21.6.2 The fscanf() function, Para 10) says (in the relevant part):
If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.
There is quite a lot of verbiage about the behaviour under error conditions, but this is the crucial one here. Since the behaviour is undefined, getting EOF is a valid and relatively benign response. It would be interesting to investigate whether the file stream is in the state where feof(f) or ferror(f) returns true. There's no obvious reason why it should be except that you do not normally get EOF from fscanf() unless one or the other is true.
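A minimal sketch of the "%1d" approach, shown with sscanf over a string so that it can be tried without a file. split_digits is a hypothetical helper, and the use of %n to advance through the string is an assumption of this example; with fscanf on a file the stream position advances by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: splits a run of decimal digits into single-digit
   values using "%1d". Returns how many digits were stored in out[]. */
size_t split_digits(const char *text, int *out, size_t max)
{
    size_t count = 0;
    int digit, used;

    assert(text != NULL && out != NULL);
    /* The field width 1 limits each conversion to one character;
       %n reports how far to advance in the string. */
    while (count < max && sscanf(text, "%1d%n", &digit, &used) == 1) {
        out[count++] = digit;
        text += used;
    }
    return count;
}
```

Each conversion now produces a value between 0 and 9, so the overflow that makes plain "%d" undefined on a 44-digit line cannot occur.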
The "%d" format specifier in a scanf-family function causes the function to look for a digit sequence, which is terminated by the first non-digit character encountered. In your case, using "%d" will consume all the digits at once.
If "%d" didn't behave like that, you'd only be able to read single-digit numbers with it, which would barely count as formatted input.
The "%s" format specifier causes a character sequence to be consumed, terminated by a white-space character or end of file.
What you are looking for here is to read the input character by character. You can do that with the "%c" format specifier. Afterwards, you can convert the obtained character to its numeric value.
For example, you could read the first '1' character, then subtract the '0' character from it, which would yield '1' - '0' == 1, the thing you seem to want.
fscanf returns the number of input items assigned or EOF in the case of failure. That is only one problem. The other is that it will continue to consume characters from the input string until it encounters a \0 character or a non-digit (in the %d case). Passing it the address of a single character is not a valid thing to do either. You could do something like the following instead:
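/* input_string is assumed to hold the line read from the file; isdigit needs <ctype.h> */
const char *p;
for (p = input_string; isdigit((unsigned char)*p); p++) {
    int digit = *p - '0';
    /* process digit */
}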
If you want to use one of the scanf functions, create a string of two characters - the digit being processed and \0. Then sscanf that.

confused about getchar and scanf

I'm really confused about the usage of getchar() and scanf(). What's the difference between these two?
I know that scanf() [and family] get a character by character from the user [or file] and save it into a variable, but does it do that immediately or after pressing something (Enter)?
and I don't really understand this code, I saw many pieces of code using getchar() and they all let you type whatever you want on the screen and no response happen, but when you press enter it quits.
int j, ch;
printf("please enter a number : \n");
while (scanf("%i", &j) != 1) {
while((ch = getchar()) != '\n') ;
printf("enter an integer: ");
}
Here in this code can't I use scanf() to get a character by character and test it? Also, what does this line mean?
scanf("%i", &j) != 1
because when I pressed 1 it doesn't differ when I pressed 2? what does this piece do?
and when this line is gonna happen?
printf("enter an integer: ");
because it never happens.
Well, scanf is a versatile utility function which can read many types of data, based on the format string, while getchar() only reads one character.
Basically,
char someCharacter = getchar();
is roughly equivalent to (except that getchar returns an int, so that EOF can be detected)
char someCharacter;
scanf("%c", &someCharacter);
I am not 100% sure, but if you only need to read one character, getchar() might be 'cheaper' than scanf(), as the overhead of processing the format string does not exist (this could count to something if you read many characters, like in a huge for loop).
For the second question.
This code:
scanf("%i", &j) != 1
means you want scanf to read an integer in the variable 'j'. If read successfully, that is, the next input in the stream actually is an integer, scanf will return 1, as it correctly read and assigned 1 integer.
See the oldest answer to this SO question for more details on scanf return values.
As far as I understand,
the getchar function will read your input one character at a time.
scanf will read all types of data, and will be more useful to define a data group.
However, as far as strings go, my teacher recommends using gets instead of scanf, because scanf with %s stops reading at the first white space, as in a sentence. (Be aware that gets cannot limit how much it reads and was removed from the language in C11; fgets is the safe replacement.)
while (scanf("%i", &j) != 1) {
while((ch = getchar()) != '\n') ;
printf("enter an integer: ");
}
Here's how this code breaks down.
scanf() consumes individual characters from the input stream until it sees a character that does not match the %i conversion specifier [1], and that non-matching character is left in the input stream;
scanf() attempts to convert the input text into a value of the appropriate type; i.e., if you enter the string "1234\n", it will be converted to the integer value 1234, the converted value will be assigned to the variable j, and the '\n' will be left in the input stream;
if there are no characters in the input string that match the conversion specifier (such as "abcd"), then no conversion is performed and nothing is assigned to j;
scanf() returns the number of successful conversions and assignments;
if the result of the scanf() call is not 1, then the user did not enter a valid integer string;
since non-matching characters are left in the input stream, we need to remove them before we can try another scanf() call, so we use getchar() to consume characters until we see a newline, at which point we prompt the user to try again and perform the scanf() call again.
[1] The %i conversion specifier skips over any leading whitespace and accepts optionally signed integer constants in octal, decimal, or hexadecimal formats. So it will accept strings of the form [+|-]{0[xX][0-9a-fA-F]+ | 0[0-7]* | [1-9][0-9]*}
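The three formats accepted by %i can be tried with a small sscanf wrapper. This is only a sketch; parse_i and its fallback parameter are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: converts `text` with "%i" and returns the value,
   or `fallback` when nothing in the input matches. */
int parse_i(const char *text, int fallback)
{
    int value;

    assert(text != NULL);
    return sscanf(text, "%i", &value) == 1 ? value : fallback;
}
```

For instance, parse_i("42", -1) yields 42, parse_i("0x1A", -1) yields 26, parse_i("010", -1) yields 8 because the leading 0 selects octal, and parse_i("abcd", -1) yields the fallback.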
The scanf can scan arbitrarily formatted data and parse it as multiple types (integers, floating point, strings, etc). The getchar function just gets a single character and returns it.
The expression
scanf("%i", &j) != 1
reads a (possibly signed) integer from the standard input and stores it in the variable j. It then compares the return value of scanf (the number of successful conversions) with 1. That means the expression is "true" if scanf did not read and convert an integer value, so the loop continues for as long as scanf fails.
You might want to check this scanf reference.
That the printf doesn't seem to happen might be either because it really never happens (use a debugger to find out), or because it does happen but the output has not been flushed yet. Output is flushed when a newline is printed (if stdout is line-buffered, as it usually is on a terminal) or when fflush is called:
fflush(stdout);
As far as I know, scanf will read user input until the first whitespace, considering the input format specified. getchar, however, reads only a single character.
scanf will return the number of arguments of the format list that were successfully read, as explained here. You obtain the same result when pressing 1 or 2 because both of them are successfully read by the %i format specifier.
getchar reads one char at a time from the input, whereas scanf can read more, depending on the conversion you specify.
It's not good practice to use scanf(); try using fgets() instead, which is much more efficient and safer than scanf.
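One way the fgets suggestion could look in practice is sketched below. It is not taken from any of the answers: read_int_line is a hypothetical helper, strtol is swapped in for the conversion, and the 64-byte buffer is an arbitrary assumption (a longer line would be split across calls).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line with fgets and converts it with
   strtol. Returns 1 and stores the value on success, 0 if the line is
   not a valid int, EOF when no more input is available. */
int read_int_line(FILE *in, int *out)
{
    char line[64];              /* assumed to be long enough for one line */
    char *end;
    long value;

    assert(in != NULL && out != NULL);
    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;               /* no digits, or out of range for int */
    if (*end != '\n' && *end != '\0')
        return 0;               /* trailing junk such as "12x" */
    *out = (int)value;
    return 1;
}
```

Since fgets always consumes the whole line, a bad entry never stays behind in the stream, so no separate getchar loop is needed before asking again.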
