How to read anything after EOF has occurred - C

I was studying the C programming book by K&R. It has this program to count the number of characters in the input:
#include <stdio.h>
main()
{
    long nc;
    nc = 0;
    while (getchar() != EOF)
        ++nc;
    printf("%ld\n", nc);
}
I was wondering how nc can still be printed after EOF has occurred. And is there any way to read anything after EOF?

The end-of-file condition only affects stdin, not stdout. Note that there are no uses of stdin after the EOF is found, just printouts to stdout.

I think you're getting two different things mixed up. EOF is with regard to input. printf is an output function.

getchar() reads from stdin. printf() writes to stdout. They are different streams that usually map to the same physical device (console or terminal).

You should not count on Ctrl-Z or any other terminator character.
If you were counting on that and were running under a traditional *nix shell, you would suspend your process rather than terminate the input (read up on JOB CONTROL in man bash, for example).
(I know this answer comes a bit late, but I see you keep mentioning Ctrl-Z in your responses to other answers.)
If you are on a *nix system you can use Ctrl-D, but don't expect that to end up in your input stream (it's just used as a signaling mechanism). You can also test this with file input, which should give you more consistent results than typing, i.e.
a.out < prog.c
to count the characters in your C program.

Related

When does feof(stdin) next to fgets(stdin) return true?

int main(void){
char cmdline[MAXLINE];
while(1){
printf("> ");
fgets(cmdline, MAXLINE, stdin);
if(feof(stdin)){
exit(0);
}
eval(cmdline);
}
}
This is the main part of the myShell program that my professor gave me.
But there is one thing I don't understand in the code.
It says if(feof(stdin)) exit(0);
What is the end of the standard input?
fgets accepts all characters until the Enter key is pressed. The end of a typical "file" (e.g. a .txt file) is intuitively understandable, but what does the end of standard input mean?
In what practical situations does feof(stdin) actually return true?
Even if I enter a blank line without typing anything, the if condition is not met.
feof tests the stream’s end-of-file indicator and returns true (non-zero) iff the end-of-file indicator is set.
For regular files, attempting to read past the end of the file sets the end-of-file indicator. For terminals, a typical behavior is that when a program attempts to read from the terminal and gets no data, the end-of-file indicator is set. In Unix systems with default settings, a way to trigger this “no data, end-of-file behavior” is to press control-D at the beginning of a line or immediately after a prior control-D.
The reason this works is because control-D is used to mean “send pending data to the program immediately.” That is described further in this answer.
Thus, if you want to end input for a program, press control-D (and, if not at the beginning of a line, press it a second time).
For input from terminals, while this does cause an end-of-file indication, it does not actually end the input or close the stream. The program can clear the end-of-file indicator and keep reading. Even for regular files, the program could clear the end-of-file indicator, reset the file context to a different position, and continue reading.
The confusion is to assume stdin = terminal. It is not necessarily true.
What stdin is depends on how you run your program.
For example, assuming your executable is named a.out, if you run it like this:
echo "foo" | ./a.out
Stdin is the output of a different process. In this example that process simply outputs the word "foo", so stdin will contain "foo" followed by a newline, and then EOF.
Another example is:
./a.out < file.txt
In this case, stdin is "file.txt". When the file is read to the end, stdin gets EOF.
Stdin can also be a special device, for example:
./a.out < /dev/random
In this specific case it is infinite.
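Last, when you simply run your program and stdin is a terminal, you can generate EOF too: just press CTRL-D. On Unix-like systems this does not put a special character into the input; it tells the terminal driver to hand over whatever is pending, and when nothing is pending the program's read returns zero bytes, which is treated as EOF.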
P.S.
There are other ways to execute a process. Here I only gave examples of processes executed from the command line shell. But process can be executed by a different process, not necessarily from the shell. In this case the creator of the process can decide what stdin will be - terminal, pipe, socket, file or any other object.

Clarification regarding working of EOF when data is not read from any external file

In this code from K&R
#include <stdio.h>
main()
{
int c;
while ((c = getchar()) != EOF)
putchar(c);
}
It has been mentioned that
When the end of the input is finally reached, the while terminates and so
does main.
The above program terminates when EOF is encountered but even when the data in the stream ends, the EOF condition is not met and the looping continues.
Since there is no file involved from where the text can be read, I believe that the EOF must be signalled externally by pressing CTRL+SHIFT+D - Am I correct that it should be signalled externally (manually) and it wouldn't happen on its own even when the data in the stream will finish? The book says it will terminate when the input will end.
NOTE: The OS is CentOS running in VirtualBox, hence the key combinations need SHIFT as well.
EOF ("end-of-file") doesn't actually have any direct correlation to a "file" - it's not something that's present at the end of a file. It's just a state that says "there's no more to read from this stream", and it is typically reported by a number of I/O functions.
So whether you are reading from a file, pipe, or any other I/O device, its (EOF) meaning is the same: nothing more to read from that stream.
When you are reading from the terminal via the stdin stream, no such EOF state arises on its own (pressing the Return key is NOT the "end-of-file" condition - it is end of line and delivers a \n character to your program) because you could keep on entering data. So you generate EOF with the key sequence CTRL + D on a Unix-like system (or CTRL + Z on Windows). So to answer your question: yes, you have to generate it yourself in your (K&R) program.
Yes, when the input comes from a terminal, EOF has to be signaled manually.
The main reason is that there is no way for the system to know whether there is nothing more in the stream because the user has finished or because the user is a slow typist.

K&R book 1.5.1 File Copying

I have looked around the site regarding this K&R example, and the answers seem to revolve around questions like 'why is this of type int?' or 'what is EOF?'. I believe that I understand those.
It's the results that I don't understand. I had expected this code to take a single character, print it and then wait for another character or EOF.
The results that I see are the input waiting until I press return, then everything that I typed shows up, and then more waiting for input.
Is the while loop just 'looping' until I end the text stream with the carriage return and then showing what putchar(c) has been hiding somewhere?
The code is:
#include <stdio.h>
/* copy input to output: 1st version */
main()
{
int c;
c = getchar();
while(c != EOF) {
putchar(c);
c = getchar();
}
}
Now, if I sneak a putchar(c) in on the line just before the while, I sort of get what I expected. I still must enter a text stream and press return. The result is the first character of the stream, and the program exits.
Evidently there is a big picture gap for me going on.
Thank you for your help
By default, stdin and stdout are buffered. That means that they save up batches of characters and send them at once for efficiency. Typically, the batch is saved up until there's no more room in the buffer or until there's a newline or EOF in the stream.
When you call getchar(), you're asking for characters from stdin. Suppose you type A: that character is saved in the buffer and then the system waits for more input. If you type B, that character goes into the buffer next. Perhaps after that, you hit Enter, and a newline is put in the buffer. But the newline also interrupts the buffering process, so the original call to getchar() returns the first character in the buffer (A). On the next iteration, you call getchar() again, and it immediately returns the next character in the buffer (B). And so on.
So it's not that your while loop is running until you end the line, it's that the first call to getchar() (when the buffer is empty) is waiting until it has either a full buffer or it has seen a newline.
When you interleave output functions, like putchar(), many C runtime libraries will flush stdout when you do something that requests input from stdin. (The intent is to make sure the user sees a prompt before the program waits for input.) That's why you started seeing different behavior when you added the putchar() calls.
You can manually flush an output buffer using the fflush() function. You can also control the buffering mode and buffer size of the standard streams using setvbuf().
As Hans Passant pointed out in the comments, a newline doesn't "terminate the stream." To get an EOF on stdin, you have to type Ctrl+D (or, on some systems, Ctrl+Z). An EOF will also flush the buffer. If you've redirected a file or the output from another program to stdin, the EOF will happen once that input is exhausted.
While it's true that K&R C is very old, and even ANSI C isn't as common today as it was, everything about buffering with stdin and stdout is effectively the same in the current standards and even in C++. I think the only significant change is that the C standards now explicitly call out the desirability of having stdin and stdout cause the other to flush.
I appreciate your answer, and the buffering as you describe is very helpful and interesting.
Evidently, I must also have misread or misunderstood K&R. They define a text stream as ". . . consists of zero or more characters followed by a new line character," which I took to mean that the return/enter key ends it and then allows output.
Also, I would like to thank all of you who offered helpful comments.
By the way, I clearly understood that I had to enter ^D to generate EOF, which terminates the program. I appreciate that you are all top level programmers, and thank you for your time. I guess that I will need to find another place to discuss what the text that K&R wrote regarding this exercise is all about.

Recover stdin from eof in C

I am using the C code below to read user input from a terminal. If the user inputs EOF, e.g. by pressing ^C, stdin is closed and subsequent attempts to read from it, e.g. via getchar() or scanf(), will cause an exception.
Is there anything I can do in C to "recover" my program, in the sense that if some user accidentally inputs EOF, this will be ignored, so I can read from stdin again?
#include <stdio.h>
int main(void)
{
int res_getchar=getchar();
getchar();
return 0;
}
If I understand the situation correctly - you're reading from a terminal via stdin, the user types ^D, you want to discard that and ask again for input - you have two options, one more portable (and quite simple) but less likely to work, and one less portable (and considerably more programming) but certain to work.
The clearerr function is standard C, and is documented to clear both the sticky error and sticky EOF flags on a FILE object; if your problem is that the C library isn't bothering to call read again once it's indicated EOF once, this may help.
If this solves your immediate problem, make sure that if you get some number of EOFs in a row (four to ten, say) you give up and quit, because if stdin is not a terminal, or if the terminal has genuinely been closed down, that EOF condition is never going to go away, and you don't want your program to get stuck in an infinite loop when that happens.
On POSIX-compliant systems only (i.e. "not Windows"), you can use cfmakeraw to disable the input preprocessing that turns ^D into an EOF indication.
Doing this means you also have to handle a whole lot of other stuff yourself; you may instead want to use a third-party library that handles it for you, e.g. readline (GPL) or editline (BSD). If your program is any sort of nontrivial interactive command interpreter, using one of these libraries is strongly encouraged, as it will provide a much nicer user experience.
Using ungetc() to push back a character can clear the EOF indicator for a stream.
C99 §7.19.7.11 The ungetc function
int ungetc(int c, FILE *stream);
A successful call to the ungetc function clears the end-of-file indicator for the stream. The value of the file position indicator for the stream after reading or discarding all pushed-back characters shall be the same as it was before the characters were pushed back.
In a word, no. You read EOF when the OS has closed stdin.
I am sure there are platform-dependent ways to preserve some info that would let you reconstruct stdin after it was closed -- i.e., open a new stream connected to the keyboard and assign it to stdin -- but there's definitely no portable way.

Why does getchar() recognize EOF only in the beginning of a line?

This example is from the K&R book
#include<stdio.h>
main()
{
long nc;
nc = 0;
while(getchar() != EOF)
++nc;
printf("%ld\n", nc);
}
Could you explain to me why it works that way? Thanks.
^Z^Z doesn't work either (unless it's in the beginning of a line)
The traditional UNIX interpretation of the tty EOF character is to make a blocking read return after reading whatever is buffered in the cooked tty line buffer. At the start of a new line, that means read returns 0 (zero bytes read), and, incidentally, a 0-sized read is how the end-of-file condition on ordinary files is detected.
That's why the first EOF in the middle of a line just forces the beginning of the line to be read, without making the C runtime library detect an end of file. Two EOF characters in a row produce a 0-sized read, because the second one forces an empty buffer to be delivered to the application.
$ cat
foo[press ^D]foo <=== after ^D, input printed back before EOL, despite cooked mode. No EOF detected
foo[press ^D]foo[press ^D] <=== after first ^D, input printed back, and on second ^D, cat detects EOF
$ cat
Some first line<CR> <=== input
Some first line <=== the line is read and printed
[press ^D] <=== at line start, ^D forces 0-sized read to happen, cat detects EOF
I assume that your C runtime library imitates the semantics described above (there is no special handling of ^Z at the level of kernel32 calls, let alone system calls, on Windows). That's why it would probably detect EOF after ^Z^Z even in the middle of an input line.
The program will read EOF only at the actual end of the input. If your terminal/OS/whatever only permits files to end at the start of a line then that's where you'll find them. I believe this is a throw-back to old-fashioned terminals where data was only transmitted a line at a time (for all I know it goes back to punched card readers).
Try reading your data from a file that you've prepared in advance so that it ends mid-line. You may even find that some editors make this difficult! Your program should work fine with that as input.
EOF indicates "end of file". A newline (which is what happens when you press enter) isn't the end of a file, it's the end of a line, so a newline doesn't terminate this loop.
Depending on the operating system, the EOF character will only work if it's the first character on a line, i.e. the first character after an Enter. Since console input is often line-oriented, the system may also not recognize the EOF character until after you've followed it up with an Enter.
I happened to have the same question as you. When I want getchar() to stop, I have to enter EOF twice, or press <ENTER> and then enter EOF once.
And here's a simpler answer I found for this question:
If characters have already been typed on the current line, entering EOF just ends that round of input and hands those characters to the program, which then starts waiting for a new round of input. If nothing is pending, in other words when getchar() is waiting for fresh input (for example right after you pressed Enter or right after a previous EOF), the EOF you enter really does mean "end of file", and getchar() returns EOF.
PS: this question comes up when you are using getchar(). I think this answer is easier to understand, but maybe not for you, since it is translated from Chinese...

Resources