In this code:
#include<stdio.h>
int main()
{
int i,p=0;
while(i!=EOF)
{
i=getchar();
putchar(i);
printf("\n");
}
return 0;
}
When I enter hello as input in one go, the output is h, then e on the next line, and so on. But after h is printed and before e is printed, why doesn't getchar() pause to take input from me, as it did the first time?
getchar() returns either any successfully read character from stdin or some error, so which function is demanding terminal input and then sending it to stdin?
Input from a terminal is generally buffered. This means it is held in memory waiting for your program to read it.
This buffering is performed by multiple pieces of software. The software that actually reads your input in the terminal window generally accumulates the characters you type until you press Enter, or press certain other keys or combinations that end the current input. Then the line that has been read is made available to your program.
Inside your program, the C standard library, of which getchar is a part, reads the data that has been sent to it and holds it in a buffer of its own. The getchar routine reads the next character from this buffer. (If the buffer is empty when getchar wants another character, getchar will block, waiting for new data to arrive from the terminal software.)
It's because of the loop condition. You are continuing to loop until EOF is received. When you type "hello", it works exactly as you expect except STDIN has more characters in the buffer and none of them are EOF. The program prints out "h", then a newline, and goes back to check the loop condition. EOF has not been found, so then it gets the next character from STDIN (which you have already provided) and the cycle repeats.
If you remove the loop it will only print one character.
I am working on the infamous book "Prentice Hall Software Series" and trying out the code they write, modifying it to learn more about C.
I am working with Vim on Fedora 25 in the console. The following code is quoted from the book; I know "int" is missing, as well as argc and argv, etc.
Kernighan and Ritchie - The C Programming Language: Page 20
#include <stdio.h>
/* copy input to output; 1st version */
main(){
int c;
c = getchar();
while (c != EOF) {
putchar(c);
c = getchar();
}
}
With this code I couldn't get "EOF" to work. I am not sure if "Ctrl+Z" is really the right thing to do, since it quits any console program in the console.
Well, since I was unsure, I changed the condition to
...
while (c != 'a') {
...
So normally, if I enter 'a', the while condition should become false and the program should terminate. Well, it does not when I run it and enter 'a'. What is the problem here?
Thank you guys!
There's nothing wrong with the code (except the archaic declaration of main).
Usually, on Unixes end of file is signalled to the program by ctrl-D. If you hit ctrl-D either straight away or after hitting new-line, your program will read EOF.
However, the above short explanation hides a lot of subtleties.
In Unix, terminal input can operate in one of two modes (called IIRC raw and cooked). In cooked mode - the default - the OS will buffer input from the terminal until it either reads a new line or a ctrl-D character. It then sends the buffered input to your program.
Ultimately, your program will use the read system call to read the input. read will return the number of characters read but normally will block until it has some characters to read. getchar then passes them one by one to its caller. So getchar will block until a whole line of text has been received before it processes any of the characters in that line. (This is why it still didn't work when you used a).
By convention, the read system call returns 0 when it gets end of file. This is how getchar knows to return EOF to the caller. ctrl-D has the effect of forcing read to read an empty buffer (if it is sent immediately after a new line), which makes it look to getchar like it has reached EOF even though nobody has closed the input stream. This is why ctrl-D works if it is pressed straight after a new line but not if it is pressed after entering some characters.
Trying to understand the behavior of my code. I'm expecting Ctrl-D to lead to the program printing the array and exiting, however it takes 3 presses, and it enters the while loop after the second press.
#include <stdio.h>
#include <stdlib.h>
void unyon(int p, int q);
int connected(int p, int q);
int main(int argc, char *argv[]) {
int c, p, q, i, size, *ptr;
scanf("%d", &size);
ptr = malloc(size * sizeof(int));
while((c = getchar()) != EOF){
scanf("%d", &p);
scanf("%d", &q);
printf("p = %d, q = %d\n", p, q);
}
for(i = 0; i < size; ++i)
printf("%d\n", *ptr + i);
free(ptr);
return 0;
}
I read the post here, but I don't quite understand it.
How to end scanf by entering only one EOF
After reading that, I'm expecting the first Ctrl-D to clear the buffer, and then I'm expecting c = getchar() to pick up the second Ctrl-D and jump out. Instead the second Ctrl-D enters the loop and prints p and q, and it takes a third Ctrl-D to drop out.
This is made more confusing by the fact that the code below drops out on the first Ctrl-D:
#include <stdio.h>
main() {
int c, nl;
nl = 0;
while((c = getchar()) != EOF)
if (c == '\n')
++nl;
printf("%d\n", nl);
}
Let's just strip the program down to the calls which do input:
scanf("%d", &size); // Statement 1
while((c = getchar()) != EOF){ // 2
scanf("%d", &p); // 3
scanf("%d", &q); // 4
}
That is definitely not the way to go; we'll get to the correct usage in a bit. For now, let's just analyze what happens. It's important to understand precisely how scanf works. The %d format code causes it to first skip over any whitespace characters, and then read characters as long as the characters can be made into a decimal integer. Eventually some character will be read which is not part of a decimal integer; most likely a newline character. Because the format string is now finished, the unused character which has just been read will be reinserted into the stream.
So when the call to getchar is made, getchar will read and return the newline character which terminated the integer. Inside the loop, there are then two calls to scanf("%d"), each of which will behave as indicated above: skip whitespace if any, read a decimal integer, and reinsert the unused character back into the input stream.
Now, let's suppose that you run the program, and enter the number 42 followed by the enter key, and then Ctrl-D to close the input stream.
The 42 will be read by statement 1, and (as mentioned above) the newline will be read by statement 2. So when statement 3 is executed, there is no more data to be read. Because end-of-file is signaled before any digit is read, scanf will return EOF. However, the code does not test the return value of scanf; it goes on to statement 4.
What should happen at this point is that the scanf in statement 4 should immediately return EOF without attempting to read more input. That's what the C standard says should happen, and it is what Posix says should happen. Once end-of-file has been signaled on a stream, any input request should immediately return EOF until the end-of-file indicator is manually cleared. (See below for standards quotes.)
But glibc, for reasons we won't go into just yet, does not conform to the standard. It attempts another read. And so the user must enter another Ctrl-D, which will cause the scanf at statement 4 to return EOF. Again, the code does not check the return code, so it continues with the while loop and calls getchar again at statement 2. Because of the same bug, getchar does not immediately return EOF, but instead attempts to read a character from the terminal. So the user must now type a third Ctrl-D to cause getchar to return EOF. Finally, the code checks a return code, and the while loop terminates.
So that is the explanation of what is happening. Now, it is easy to see at least one mistake in the code: the return value of scanf is never checked. Not only does this mean that EOF is missed, it also means that input errors are ignored. (scanf would have returned 0 if the input could not be parsed as an integer.) That's serious, because if scanf cannot successfully match the format code, the value of the corresponding argument is undefined and must not be used.
In short: Always check return values from *scanf. (And other I/O library functions.)
But there is a more subtle mistake as well, which makes little difference in this case but could, in general, be serious. The character read by getchar in statement 2 is simply discarded, regardless of what it was. Normally it will be whitespace, so it doesn't matter that it is discarded, but you don't actually know that because the character is discarded. Maybe it was a comma. Maybe it was a letter. Maybe it matters what it was.
It is bad style to rely on the assumption that whatever character is read by the getchar at statement 2 is unimportant. If you really need to peek at the next character, you should reinsert it into the input stream, just as scanf does:
while ((c = getchar()) != EOF) {
ungetc(c, stdin); /* Put c back into the input stream */
...
}
But actually, that test is not what you want at all. As we have already seen, it is extremely unlikely that getchar will return EOF at this point. (It's possible, but it's very unlikely.) Much more probable is that getchar will read a newline character, even though the next scanf will encounter the end-of-file. So there was absolutely no point peeking at the next character; the correct solution is to check the return code of scanf, as indicated above.
Putting that together, what you really want here is something more like:
/* No reason to use two scanf calls to read two consecutive numbers */
while ((count = scanf("%d%d", &p, &q)) == 2) {
/* Do something with p and q */
}
if (count != EOF) {
/* Invalid format. Issue an error message, at least */
}
/* Do whatever needs to be done at the end of input. */
Finally, let's examine glibc's behaviour. There is a very long-standing bug report linked to by an answer to the question cited in the OP. If you take the trouble to read through to the most recent post in the bugzilla thread, you'll find a link to a discussion on the glibc developer mailing list.
Let me give the TL;DR version, and save you the trouble of digital archaeology. Since C99, the standard has been clear that EOF is "sticky". §7.21.3/11 states that all input is performed as though successive bytes were read by fgetc:
...The byte input functions read characters from the stream as if by successive calls to the fgetc function.
And §7.21.7.1/3 states that fgetc returns EOF immediately if the stream's end-of-file indicator is set:
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.
So once the end-of-file indicator is set, because either end of file was detected or some read error occurred, subsequent input operations must immediately return EOF without attempting to read from the stream. Various things can clear the end-of-file indicator, including clearerr, seek, and ungetc; once the end-of-file indicator has been cleared, the next input function call will again attempt to read from the stream.
However, it wasn't always like that. Before C99, the result of reading from a stream which had already returned EOF was unspecified. And different standard libraries chose to handle it in different ways.
So a decision was made to not change glibc to conform to the (then) new standard, but rather to maintain compatibility with certain other C libraries, notably Solaris. (A comment in the glibc source is quoted in the bug report.)
Although there is a compelling argument (at least, compelling to me) that fixing the bug is not likely to break anything important, there is still a certain reluctance to do anything about it. And so, here we are, ten years later, with a still-open bug report, and a non-conforming implementation.
If you run it through the debugger you will get a clearer picture. Here is the sequence of events.
scanf("%d", &size); is called.
A number is input followed by ENTER. The key here is that scanf does not consume the \n that results from the ENTER.
getchar is called. This consumes the \n.
scanf("%d", &p); is called. This consumes the first ctrl-D. If the return value were checked then it would be apparent that an error occurred.
scanf("%d", &q); is called. This consumes the second ctrl-D.
Loop goes back to the top and calls getchar. The third ctrl-D then causes EOF to be returned by getchar and hence the loop breaks out at that point.
I'll leave it as an exercise for you to explain why the second program functions as expected.
There are several different things at play here.
First of all, when you type at a terminal, the tty driver processes your input, adding each character to a buffer and handling special characters. One of these special characters, Ctrl-D, means: take everything typed so far and make it available to the system. This causes two things to happen: first, the Ctrl-D character itself is removed from the data stream; second, all the characters typed so far are made available to the process through the read(2) system call. getchar() is a buffered library call that avoids making one read per character by storing previously read characters in a buffer.
The second thing at play is the way the system signals end of file on POSIX systems (and all Unix systems). When you make a read(2) system call, the return value is the actual number of characters read (or -1 in case of failure, but this has nothing to do with EOF, as will be explained soon). The system marks the end-of-file condition by making read(2) return 0 bytes. (If you only hit the return key, a \n appears in the data stream, so read(2) returns 1, not 0.)
The third thing at play is the return type of the getchar(3) function. It doesn't return a char value. As every possible byte value can be returned by getchar(3), no char value can be reserved to signal EOF. The solution adopted a long, long time ago (when getchar(3) was designed, in the first version of the C language; see The C Programming Language by Brian Kernighan and Dennis Ritchie, first ed.) was to use an int as the return value, so that it can return all the possible byte values (0..255) plus one extra value, called EOF. The exact value of EOF is implementation-dependent, but it is normally defined as -1 (the standard only requires it to be a negative int).
So, putting it all together, EOF is an int constant defined to allow programmers to write while ((c = getchar()) != EOF). You will never get -1 as a data value from the terminal. The system always marks the end-of-file condition by making read(2) return 0. And the terminal driver, on receiving Ctrl-D, just removes it from the stream and makes the data up to, but not including, it available (unlike Ctrl-J and Ctrl-M, line feed and carriage return respectively, which are also interpreted but are input as \n in the data stream).
So the next question is: why are two (or more) Ctrl-D characters normally needed to signal EOF?
Right, as I've said, a single Ctrl-D only makes all the data up to the Ctrl-D (but not including it) available to be read, so the first read(2) can return a number other than 0. But what is certain is that if you enter Ctrl-D twice in a row, there are no characters between the two, which guarantees a read() of zero characters. Normally, programs sit in a loop doing multiple reads:
while ((n_read = read(fd, buffer, sizeof buffer)) > 0) {
/* NORMAL INPUT PROCESSING GOES HERE, for up to n_read bytes
* stored in buffer */
} /* while */
if (n_read < 0) {
/* ERROR PROCESSING GOES HERE */
} else {
/* EOF PROCESSING GOES HERE */
} /* if */
In the case of files, the behaviour is different, as Ctrl-D is not interpreted by any driver (it's stored in the disk file), so you'll get Ctrl-D as a normal character (its value is \004).
When you read a file, this normally leads to reading a lot of complete buffers, then a partial read (with fewer bytes than the buffer size), and a final read of zero bytes, signalling that the file has ended.
Note
Depending on the configuration of the tty driver, in some Unices the EOF character can be changed and have a different meaning. The same goes for the return and linefeed characters. See the termios(3) manual page for detailed documentation on this.
As a Linux system programming exercise I've written my own version of the tree command, which is to read from stdin and write to stdout using only the basic read() and write() C library functions. I've done it so that when an asterisk (*) is entered, the program is terminated. I have managed to get it to work properly; my problem is that I don't really understand why it works the way it does. What confuses me is the buffer. First of all, here is the code portion in question:
char buf[1];
...
do {
read(STDIN_FILENO, buf, 1);
if( buf[0] == '*') break;
write(STDOUT_FILENO, buf, 1);
} while( buf[0] != '*');
...
My idea was to read from stdin char by char, thereby storing the char in buf, check if it was an asterisk, then write the char from buf to stdout.
The behaviour is the following: I type a string of any number of chars, press ENTER, that string gets output to stdout, at which point I can type a new char string. If the string ends with an asterisk, the string is output up until the asterisk, then the program is terminated.
My problems are:
1) buf is supposed to contain only one char. How is it possible that I enter any number of chars and upon pressing ENTER all of them are output to stdout? I would expect one char at a time to be output, or only the last one. How does a one-char buffer store all of those chars? Or do many one-char buffers get created? By whom?
2) What is so special about the newline character that prompts the string to be output? Why is it not just another char within the string? Is it just a matter of definition within the function read()?
Thank you for any help in understanding the working of the buffer!
This is based upon the way the IO calls - read and write will work on most OS's.
You are reading only 1 byte, so while you are typing, stuff will be held by an io buffer (not yours), until your loop reads it. Since you have no sleeps, it will be reading, or waiting to read faster than you can humanly type.
Also as R Sahu suggests - the input buffer may not be presented to your program until you press enter on the console you are typing at. This depends on the console and its config - but most will buffer lines and wait for enter too. This would be different if you were piping into stdin.
The last parameter to read, the '1', is what instructs it to read one byte here.
The second part is about output. Output written through stdio (printf, putchar) to a console is commonly line-buffered, and a newline flushes it. Your code, however, calls write directly, which bypasses the stdio buffers, so each byte is output as soon as it is written and an fflush call would change nothing. The whole line appears at once because read does not return any of it until you press Enter.
When you type your input at a console, the input characters are not immediately fed to stdin. After you press the Enter key, the entire line you typed, including the newline character, is fed to stdin by the runtime environment.
In section 1.5.2 of the 2nd ed. K&R introduce getchar() and putchar() and give an example of character counting, then line counting, and others throughout the chapter.
Here is the character counting program
#include <stdio.h>
main() {
long nc;
nc = 0;
while (getchar() != EOF)
++nc;
printf("%ld\n",nc);
}
Where should the input come from? Typing into the terminal command window and hitting Enter worked for the file copying program but not for this. I am using Xcode for Mac.
It seems like the easiest way would be to read a text file with pathway "pathway/folder/read.txt" but I am having trouble with that as well.
From the interactive command line, press ctrl-D after a newline, or ctrl-D twice not after newline, to terminate the input. Then the program will see EOF and show you the results.
To pass a file by path, and avoid the interactive part, use the < redirection operator of the shell, ./count_characters < path/to/file.txt.
Standard C input functions only start processing what you type when you press the Enter key. In other words, every key you press adds a character to the system's keyboard buffer. When the line is complete (i.e., you press Enter), these characters are moved to the C standard library's buffer. getchar() reads the first character in the buffer, which also removes it from the buffer. Each successive call to getchar() reads and removes the next character, and so on. If you don't read every character you typed, but instead enter another line of text, the next call to getchar() will continue reading the characters left over from the previous line; you will usually witness this as the program blowing past your second input. BTW, the newline from the Enter key is also a character and is also stored in the buffer, so if you have new input to read, you first need to clear out the leftover characters.