I'm just asking what does the getchar do in this code and how does it work? I don't understand why the getchar affects the code, to me it seems as if its just getting the value but nothing is being done with the value.
int c=0;
while (c>=0)
{
scanf("%d", &c);
getchar();
}
Some possibilities of why getchar() might have been used there:
1) If it's done to ignore whitespaces (typically used when scanning chars with %c), it's not needed here because %d ignores whitespaces anyway.
2) Other possibility is that after this loop, some further scanning is done where the last \n left by the last call to scanf() might be a problem. So, getchar() might be used to ignore it.
3) In case you enter characters do not match %d, scanf() will fail. In that the characters you entered are left in the input stream and you'll never be able to read an int again (For example, if you input abcdddgdfg without that getchar() call). So, getchar() here will consume all those
chars (one per iteration) and eventually you'll be able to read int (using %d) again.
But this is all really not needed; it's just an attempt to fix flaws of scanf(). Reading inputs using scanf() and getting it correct is really difficult. That's why it's always recommended to use fgets() and parse using sscanf() or using strto*() functions if you are just scanning integers.
See: Why does everyone say not to use scanf? What should I use instead?
In this code, getchar is being called for its side effects: it reads a character from standard input and throws it away.
Probably this is reading input from the user. scanf will consume a number, but leave the newline character after the number untouched. The getchar consumes the newline and throws it away. This isn't strictly necessary in this loop, because the next scanf will skip over whitespace to find the next number, but it might be useful if the code after the loop isn't expecting to have a newline as the first thing on stdin.
This code is buggy, because it doesn't check for EOF, because it doesn't do anything sensible when the input is not a number or when there's more text on the line after the number, and because it uses scanf, which is broken-as-specified (for instance, it's allowed to crash the program if the input overflows the range of an int). Better code would be something like
char *linep = 0;
size_t asize = 0;
char *endp;
long c;
while (getline(&linep, &asize, stdin) > 0) {
errno = 0;
c = strtol(linep, &endp, 10);
if (linep == endp || *endp != '\0' || errno) {
puts("?Redo from start");
continue;
}
if (c == 0) break;
do_something_with(c);
}
free(linep);
Most likely the code is for reading in a list of integers, separated by a new line.
scanf will read in an integer, and put it into variable c.
The getchar is reading in the next character (assuming a new line)
Since it doesn't check, there is some potential that it wasn't a new line, or that scanf failed as the what it tried to read wasn't a number.
getchar(); is simply reading and consuming the character after the number, be it a space, comma, new line or the beginning of another integer or anything else.
IMO, this is not robust code. Good code would 1) at least test the result of scanf() and 2) test or limit the consumption of the following character to prevent "eating" a potential sign of the following number. Remember code cannot control what a user types, but has to cope with whatever is entered.
v space
"123 456"
v comma
"123,456"
v negative sign
"123-456"
I'm working in a Linux terminal and have I'm writing a program that requires the user to hit CTRL-D at the end of their input to signify the end of input. The program hangs on the while loop waiting for another input to scan even though I'm pressing the CTRL-D button. The weirdest part is that if I press it enough times, it will register and store the EOF keystroke to my variable, and finish the program correctly.
My code snippet:
char c;
scanf("%c",&c);
while(index<size && c!=EOF){
A[index] = c;
index++;
scanf("%c",&c);
}
I stepped through my program in gdb and every time I've entered my input and it has scanned it all, it waits for more input, at which point I use the CTRL-D keystroke, but it doesn't even register it. I haven't found anything in researching this issue, and I'm confused as to why it works sometimes (if I press it enough times), and not others. What is going on?
scanf, unlike fgetc, never sets it parameters to EOF. In case scanf is unable to read its parameter(s) due to eof, it simply leaves them unchanged. So the check c!=EOF is obsolete. You may check for feof(stdin). Alternatively, you may check the return code of scanf:
int rc = scanf("%c", &c);
It returns the number of parameters it succesfully reads, or sometimes it returns EOF.
I was able to implement the desired behavior using fgetc()
char c = fgetc(stdin);
while(index<size && c!=EOF){
A[index] = c;
index++;
fgetc(stdin);
}
From the man page of fgetc: "fgetc() reads the next character from stream and returns it as an unsigned char cast to an int, or EOF on end of file or error." I guess scanf fails to properly recognize EOF. (see below)
Alternatively, use the feof function provided by stdio.h:
while(index < size){
c = fgetc(stdin);
if(feof(stdin)) break;
}
It's generally recommended to use the fgets() family of functions for I/O-Operations, since they are (mostly) safer than scanf.
Man pages with further information: fgetc, feof, scanf.
EDIT: After posting my answer and reading the comments under the question: scanf() returns EOF (like fgetc) when encountering an end of file error. This means checking the return value of scanf will fix the problem, e.g.
while(index < size && scanf("%c", &c) != EOF){
//do stuff
}
Trying to understand the behavior of my code. I'm expecting Ctrl-D to lead to the program printing the array and exiting, however it takes 3 presses, and it enters the while loop after the second press.
#include <stdio.h>
#include <stdlib.h>
void unyon(int p, int q);
int connected(int p, int q);
int main(int argc, char *argv[]) {
int c, p, q, i, size, *ptr;
scanf("%d", &size);
ptr = malloc(size * sizeof(int));
while((c = getchar()) != EOF){
scanf("%d", &p);
scanf("%d", &q);
printf("p = %d, q = %d\n", p, q);
}
for(i = 0; i < size; ++i)
printf("%d\n", *ptr + i);
free(ptr);
return 0;
}
I read the post here, but I don't quite understand it.
How to end scanf by entering only one EOF
After reading that, I'm expecting the first Ctrl-D to clear the buffer, and then I'm expecting c = getchar() to pick up the second Ctrl-D and jump out. Instead the second Ctrl-D enters the loop and prints p and q, and it takes a third Ctrl-D to drop out.
This is made more confusing by the fact that the code below drops out on the first Ctrl-D-
#include <stdio.h>
main() {
int c, nl;
nl = 0;
while((c = getchar()) != EOF)
if (c == '\n')
++nl;
printf("%d\n", nl);
}
Let's just strip the program down to the calls which do input:
scanf("%d", &size); // Statement 1
while((c = getchar()) != EOF){ // 2
scanf("%d", &p); // 3
scanf("%d", &q); // 4
}
That is definitely not the way to go; we'll get to the correct usage in a bit. For now, let's just analyze what happens. It's important to understand precisely how scanf works. The %d format code causes it to first skip over any whitespace characters, and then read characters as long as the characters can be made into a decimal integer. Eventually some character will be read which is not part of a decimal integer; most likely a newline character. Because the format string is now finished, the unused character which has just been read will be reinserted into the stream.
So when the call to getchar is made, getchar will read and return the newline character which terminated the integer. Inside the loop, there are then two calls to scanf("%d"), each of which will behave as indicated above: skip whitespace if any, read a decimal integer, and reinsert the unused character back into the input stream.
Now, let's suppose that you run the program, and enter the number 42 followed by the enter key, and then Ctrl-D to close the input stream.
The 42 will be read by statement 1, and (as mentioned above) the newline will be read by statement 2. So when statement 3 is executed, there is no more data to be read. Because end-of-file is signaled before any digit is read, scanf will return EOF. However, the code does not test the return value of scanf; it goes on to statement 4.
What should happen at this point is that the scanf in statement 4 should immediately return EOF without attempting to read more input. That's what the C standard says should happen, and it is what Posix says should happen. Once end-of-file has been signaled on a stream, any input request should immediately return EOF until the end-of-file indicator is manually cleared. (See below for standards quotes.)
But glibc, for reasons we won't go into just yet, does not conform to the standard. It attempts another read. And so the user must enter another Ctrl-D, which will cause the scanf at statement 4 to return EOF. Again, the code does not check the return code, so it continues with the while loop and calls getchar again at statement 2. Because of the same bug, getchar does not immediately return EOF, but instead attempts to read a character from the terminal. So the user must now type a third Ctrl-D to cause getchar to return EOF. Finally, the code checks a return code, and the while loop terminates.
So that is the explanation of what is happening. Now, it is easy to see at least one mistake in the code: the return value of scanf is never checked. Not only does this mean that EOF is missed, it also means that input errors are ignored. (scanf would have returned 0 if the input could not be parsed as an integer.) That's serious, because if scanf cannot succesfully match the format code, the value of the corresponding argument is undefined and must not be used.
In short: Always check return values from *scanf. (And other I/O library functions.)
But there is a more subtle mistake as well, which makes little difference in this case but could, in general, be serious. The character read by getchar in statement 2 is simply discarded, regardless of what it was. Normally it will be whitespace, so it doesn't matter that it is discarded, but you don't actually know that because the character is discarded. Maybe it was a comma. Maybe it was a letter. Maybe it matters what it was.
It is bad style to rely on the assumption that whatever character is read by the getchar at statement 2 is unimportant. If you really need to peek at the next character, you should reinsert it into the input stream, just as scanf does:
while ((c = getchar()) != EOF) {
ungetc(c, stdin); /* Put c back into the input stream */
...
}
But actually, that test is not what you want at all. As we have already seen, it is extremely unlikely that getchar will return EOF at this point. (It's possible, but it's very unlikely). Much more more probable is that getchar will read a newline character, even though the next scanf will encounter the end-of-file. So there was absolutely no point peeking at the next character; the correct solution is to check the return code of scanf, as indicated above.
Putting that together, what you really want here is something more like:
/* No reason to use two scanf calls to read two consecutive numbers */
while ((count = scanf("%d%d", &p, &q)) == 2) {
/* Do something with p and q */
}
if (count != EOF) {
/* Invalid format. Issue an error message, at least */
}
/* Do whatever needs to be done at the end of input. */
Finally, let's examine glibc's behaviour. There is a very long-standing bug report linked to by an answer to the question cited in the OP. If you take the trouble to read through to the most recent post in the bugzilla thread, you'll find a link to a discussion on the glibc developer mailing list.
Let me give the TL;DR version, and save you the trouble of digital archaeology. Since C99, the standard has been clear that EOF is "sticky". §7.21.3/11 states that all input is performed as though successive bytes were read by fgetc:
...The byte input functions read characters from the stream as if by successive calls to the fgetc function.
And §7.21.7.1/3 states that fgetc returns EOF immediately if the stream's end-of-file indicator is set:
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function
returns EOF.
So once the end-of-file indicator is set, because either end of file was detected or some read error occurred, subsequent input operations must immediately return EOF without attempting to read from the stream. Various things can clear the end-of-file indicator, including clearerr, seek, and ungetc; once the end-of-file indicator has been cleared, the next input function call will again attempt to read from the stream.
However, it wasn't always like that. Before C99, the result of reading from a stream which had already returned EOF was unspecified. And different standard libraries chose to handle it in different ways.
So a decision was made to not change glibc to conform to the (then) new standard, but rather to maintain compatibility with certain other C libraries, notably Solaris. (A comment in the glibc source is quoted in the bug report.)
Although there is a compelling argument (at least, compelling to me) that fixing the bug is not likely to break anything important, there is still a certain reluctance to do anything about it. And so, here we are, ten years later, with a still-open bug report, and a non-conforming implementation.
If you run it through the debugger you will get a clearer picture. Here is the sequence of events.
scanf("%d", &size); is called.
A number is input followed by ENTER. The key here is that scanf does not consume the \n that results from the ENTER.
getchar is called. This consumes the \n.
scanf("%d", &p); is called. This consumes the first ctrl-D. If the return value were checked then it would be apparent that an error occured.
scanf("%d", &q); is called. This consumes the second ctrl-D.
Loop goes back to the top and calls getchar. The third ctrl-D then causes EOF to be returned by getchar and hence the loop breaks out at that point.
I'll leave it as an exercise for you to explain why the second program functions as expected.
There are different things messing here.
First of all, when you type Ctrl-D to the input terminal, the tty driver is processing your input, adding each character in a buffer and processing special characters. One of these special characters (Ctrl-D) means take up to the last char and make them all available to the system. This makes two things to happen: first, the Ctrl-D character is eliminated from the data stream and; second, all the characters typed up so far are made available to be read(2) by the process syscall. getchar() is a buffered library call that avoids making one read per character, allowing to store previously read characters in the buffer.
Other thing messing here is the way the system signals the end of file in posix systems (and all unix systems). When you make a read(2) system call, the return value is the actual number of characters read (or -1 in case of failure, but this has nothing to do with EOF, as will be explained soon). And the system marks the end of file condition by returning 0 characters. So, the operating system marks the end of file making read(2) return 0 bytes as a result (if you only hit the return key, that will make a \n to appear in the data stream).
The third thing messing up here is the type of return value from getchar(3) function. It doesn't return a char value. As all possible byte values are posible to be returned for getchar(3), there's no possibility to reserve a special value for signalling a EOF. The solution adopted a long, long, time ago (when getchar(3) was designed, that is in the first version of the C language, (see The C programming language by Brian Kernighan and Denis Ritchie, first ed.) was to use an int as return value to be able to return all the possible byte values (0..255) plus one extra value, called EOF. The exact value of EOF is implementation dependant, but normally defined as -1 (I think even the standard specifies now it must be defined as -1, but not sure)
So, making all things work together, EOF is an int constant defined to allow programers to write while ((c = getchar()) != EOF). You will never get -1 as a data value from the terminal. The system always marks the end of file condition by making read(2) to return 0. And the terminal driver on receiving Ctrl-D just eliminates it from the stream and makes data up to, but not including (as different from Ctrl-J or Ctrl-M, line feed and carry return, respectivelly, that are also interpreted and are input as \n in the data stream)
So, next the question is: Why there are needed normally two (or more) Ctrl-D chars to signal eof?
Right, as I've said, one only makes all thata up to the Ctrl-D (but not including it) available to the kernel, so the result from read(2) can be a number different than 0 for the first time. But what is sure is that if you enter the Ctrl-D char twice in sequence, after the first there were not be more chars in between the two chars, assuring a read() of zero chars. Normally, programs are in a loop, doing multiple reads
while ((n_read = read(fd, buffer, sizeof buffer)) > 0) {
/* NORMAL INPUT PROCESSING GOES HERE, for up to n_read bytes
* stored in buffer */
} /* while */
if (n_read < 0) {
/* ERROR PROCESSING GOES HERE */
} else {
/* EOF PROCESSING GOES HERE */
} /* if */
In the case of files, the behaviour is different, as Ctrl-D is not interpreted by any driver (it's stored in the disk file) so you'll get Ctrl-D as a normal character (it's value is \004)
When you read a file, normally this deals to reading a lot of complete buffers, then make a partial read (with less than the buffer size bytes input) and a final read of zero bytes, signalling that the file has ended.
Note
Depending on the configuration of the tty driver in some unices, the eof character can be changed and have different mean. Also happens to the return character and linefeed character. Se termios(3) manual page for a detailed documentation on this.
For my homework we are supposed to write a checksum that continues to display checksums until a carriage return is detected (Enter key is pressed). My checksum works fine and continues to prompt me for a string to convert it to a checksum, but my program is supposed to end after I press the enter key. In my code, I set up a while loop that continues to calculate the checksums. In particular I put:
while(gets(s) != "\r\n")
Where s in a string that the user has to input. I've also tried this with scanf, and my loop looked like:
while(scanf("%s",s) != '\n')
That did not work as well. Could anyone please give me some insight onto how to get this to work? Thanks
The gets(s) returns s. Your condition compares this pointer to the adress of a constant string litteral. It will never be equal. You have to use strcmp() to compare two strings.
You should also take care of special circumstances, such as end of file by checking for !feof(stdin) and of other reading errors in which case gets() returns NULL.
Please note that gets() will read a full line until '\n' is encountered. The '\n' is not part of the string that is returned. So strcmp(s,"\r\n")!=0 will always be true.
Try:
while (!feof(stdin) && gets(s) && strcmp(s,""))
In most cases the stdin stream inserts '\n' (newline or line-feed) when enter is pressed rather than carriage-return or LF+CR.
char ch ;
while( (ch = getchar()) != '\n` )
{
// update checksum using ch here
}
However also be aware that normally functions operating on stdin do not return until a newline is inserted into the stream, so displaying an updating checksum while entering characters is not possible. In the loop above, an entire line will be buffered before getchar() first returns, and then all buffered characters will be updated at once.
To compare a string in C, which is actually a pointer to an array of characters, you need to individually compare the values of each character of the string.
I have read so many questions about getchar() and its behaviour, but still I don't understand this simple code..
while (scanf("%d", &z) != 1)
{
while (getchar() != '\n');
printf ("Try again: ");}
This code is to validate for characters.. From what I infer from this code is that if I input
Stackoverflow
Then the whole line is pushed to the buffer with the newline '\n' also..
Then the getchar() reads each character from the buffer and returns an integer, cleaning the buffer.. In this case the while loop should loop 12 times (number of characters in Stackoverflow) until it reaches the '\n' character.. but actually it just loops once and the output is
Try again:
means its out of loop and asking for scanf again.. It goes against my understanding of looping.. Maybe I misunderstood looping.. One additional question,, if getchar() returns integers then how it could be compared to characters like '\n'?
Reformatting the code to assist with my explanation:
while (scanf("%d", &z) < 1) {
int c; // for getchar()'s return value
while ((c = getchar()) != EOF && c != '\n')
;
printf ("Try again: ");
}
Note that scanf returns the number of items it read successfully. (I changed the outer while condition.)
Edit: It is very important that getchar() be checked for EOF. The loop could spin forever otherwise.
The inner while is followed by a semicolon. This means it does not have a loop body, i.e. does nothing except evaluate its condition until said condition is false.
But what does the code do?
Say I type in Stackoverflow. scanf sees %d in its format string, looks for an integer (any number of digits, optionally prefixed by a sign), finds nothing of the sort, and returns failure (i.e. 0) without changing z.
Execution enters the loop, where getchar is called repeatedly until it finds a newline; this reads and discards all erroneous input. The program prints Try again: (no newline), and the outer loop then evaluates its condition again, trying to read an integer.
Rinse and repeat until an integer is entered, at which point scanf stuffs it in z and the loop finishes.
And what about comparing getchar() (int) to '\n' (char int)?
In C, char is just a smaller int. C does not have a concept of characters separate from a concept of the integers that represent them. '\n' is an int, not a char, so no problems here. (It's for historical reasons - something to do with K&R C.)