K&R book 1.5.1 File Copying - c

I have looked around the site regarding this K&R example, and the answers seem to revolve around questions like 'why is this type int?' or 'what is EOF?'. I believe that I understand those.
It's the results that I don't understand. I had expected this code to take a single character, print it and then wait for another character or EOF.
What I see instead is that the program waits until I press return, then everything that I typed shows up, and then it waits for more input.
Is the while loop just 'looping' until I end the text stream with the carriage return, and then showing what putchar(c) has been hiding somewhere?
The code is:
#include <stdio.h>
/* copy input to output: 1st version */
main()
{
    int c;
    c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
}
Now, if I sneak a putchar(c) in on the line just before the while, I sort of get what I expected. I still must enter a text stream and press return. The result is the first character of the stream and the program exits.
Evidently there is a big picture gap for me going on.
Thank you for your help

By default, stdin and stdout are buffered. That means that they save up batches of characters and send them at once for efficiency. Typically, the batch is saved up until there's no more room in the buffer or until there's a newline or EOF in the stream.
When you call getchar(), you're asking for characters from stdin. Suppose you type A: that character is saved in the buffer and then the system waits for more input. If you type B, that character goes into the buffer next. Perhaps after that, you hit Enter, and a newline is put in the buffer. But the newline also interrupts the buffering process, so the original call to getchar() returns the first character in the buffer (A). On the next iteration, you call getchar() again, and it immediately returns the next character in the buffer (B). And so on.
So it's not that your while loop is running until you end the line, it's that the first call to getchar() (when the buffer is empty) is waiting until it has either a full buffer or it has seen a newline.
When you interleave output functions, like putchar(), most C runtime libraries will flush stdout when you do something that requests input from stdin. (The intent is to make sure the user sees a prompt before the program waits for input.) That's why you started seeing different behavior when you added the putchar() calls.
You can manually flush an output buffer using the fflush() function. You can also control the size of the buffer used by the standard streams using setvbuf().
As Han Passant pointed out in the comments, a newline doesn't "terminate the stream." To get an EOF on stdin, you have to type Ctrl+D (or, on some systems, Ctrl+Z). An EOF will also flush the buffer. If you've redirected a file or the output from another program to stdin, the EOF will happen once that input is exhausted.
While it's true that K&R C is very old, and even ANSI C isn't as common today as it was, everything about buffering with stdin and stdout is effectively the same in the current standards and even in C++. I think the only significant change is that the C standards now explicitly call out the desirability of having stdin and stdout cause the other to flush.

I appreciate your answer, and the buffering as you describe is very helpful and interesting.
Evidently, I also must have misread or misunderstood K&R. They define a text stream as ". . . consists of zero or more characters followed by a new line character," which I took to mean the return/enter key ending it, and then allowing output.
Also, I would like to thank all of you who offered helpful comments.
By the way, I clearly understood that I had to enter ^D to generate EOF, which terminates the program. I appreciate that you are all top level programmers, and thank you for your time. I guess that I will need to find another place to discuss what the text that K&R wrote regarding this exercise is all about.

Related

How getchar() works when it is used as condition in while loop

I can't understand how the following code really works.
int main() {
    char ch;
    while((ch=getchar())!='\n')
    {
        printf("test\n");
    }
    return 0;
}
Let's say we give "aaa" as input. Then we get the word "test" as output on 3 separate lines.
Now my question is, for the first letter that we type, 'a', does the program go inside the while loop and remember that it has to print something when the '\n' character is entered? Does it store the characters somewhere and then traverse them and execute the body of the while loop? I'm lost.
There are many layers between the user writing input into a terminal, and your program receiving that input.
Typically the terminal itself has a buffer, which is flushed and sent to the operating system when the user presses the Enter key (together with a newline from the Enter key itself).
The operating system will have some internal buffers where the input is stored until the application reads it.
Then in your program the getchar function itself reads from stdin which is usually also buffered, and the characters returned by getchar are taken one by one from that stdin buffer.
And as mentioned in a comment to your question, note that getchar returns an int, which is really important if you ever want to compare what it returns against EOF (which is an int constant).
And you really should compare against EOF, otherwise you won't detect if there's an error or the user presses the "end-of-file" key sequence (Ctrl-D on POSIX systems like Linux or macOS, or Ctrl-Z on Windows).
What you see is due to the I/O line buffering.
The getchar() function doesn't receive any input until you press Enter, which adds the \n that completes the line.
Only at that point does the OS start feeding characters to getchar(), and the loop prints the test message for each character other than \n.
That is why the whole printout appears at once after you press Enter.
You can change the C library's buffering mode with the function setvbuf(). Setting the mode to _IONBF makes the stream unbuffered. Note, however, that on most systems the terminal itself still delivers input a line at a time unless its mode is changed as well, so this alone may not give you each character as it is pressed on the keyboard.

Pause the execution of exe of a C code

I have developed a simple console utility in C which parses various text files.
IDE - Code Blocks
OS - windows
I intend to distribute its executable.
The executable works fine, however unlike when executed from the IDE, the execution does not pause/wait for keystroke at the end of execution.
I tried using getchar()/system("pause"), but the execution doesn't pause there.
Is there an alternative to wait for keystroke before ending execution, so that the user can view the output?
You can use
getchar();
twice, because it's very likely that a leftover '\n' newline character will be consumed by the first getchar().
Or use
char ch; scanf(" %c", &ch);
with that extra space before %c,
at the end of your program.
It depends on how other parts of your code receive input from the user (i.e. reading from stdin).
The getchar() approach will work fine if your program is not reading anything from the user, or is reading using getchar().
A general guideline, however, is to be consistent in style of input from every stream. Style of input refers to character-oriented (functions like getchar()), line-oriented (like fgets()), formatted (functions like scanf()), or unformatted (like fread()). Each one of those functions does different things depending on input - for example getchar() will read a newline as an integral value, fgets() will leave a newline on the end of the string read if the buffer is long enough, scanf() will often stop when it encounters a newline but leave the newline in the stream to be read next.
The net effect is that different styles of input will interact, and can produce strange effects (e.g. data being ignored, not waiting for input as you are seeing).
For example, if you are using scanf(), you should probably also use scanf() to make your program wait at the end. Not getchar() - because, in practice, there may well be a newline waiting to be read, so getchar() will return immediately, and your program will not pause before terminating.
There are exceptions to the above (e.g. depending on what format string is used, and what the user inputs). But as a rule of thumb: be consistent in the manner you are reading from stdin, and the user will have to work pretty hard to stop your program pausing before terminating.
An easier alternative, of course, is to run the program from the command line (e.g. the CMD.EXE command shell). Then the shell will take over when your program terminates, the program output will be visible to the user, so your program does not need to pause.
Don't use system("pause") since it's not portable. getchar should work on the other hand. Can you post some code? Maybe there's something on the keyboard buffer that's being consumed by your one and only getchar call.

Recover stdin from eof in C

I am using the C code below to read user input from a terminal. If the user inputs EOF, e.g. by pressing ^C, stdin is closed and subsequent attempts to read from it, e.g. via getchar() or scanf(), will cause an exception.
Is there anything I can do in C to "recover" my program, in the sense that if some user accidentally inputs EOF, this will be ignored, so I can read from stdin again?
#include <stdio.h>
int main(void)
{
    int res_getchar=getchar();
    getchar();
    return 0;
}
If I understand the situation correctly - you're reading from a terminal via stdin, the user types ^D, you want to discard that and ask again for input - you have two options, one more portable (and quite simple) but less likely to work, and one less portable (and considerably more programming) but certain to work.
The clearerr function is standard C, and is documented to clear both the sticky error and sticky EOF flags on a FILE object; if your problem is that the C library isn't bothering to call read again once it's indicated EOF once, this may help.
If this solves your immediate problem, make sure that if you get some number of EOFs in a row (four to ten, say) you give up and quit, because if stdin is not a terminal, or if the terminal has genuinely been closed down, that EOF condition is never going to go away, and you don't want your program to get stuck in an infinite loop when that happens.
On POSIX-compliant systems only (i.e. "not Windows"), you can use cfmakeraw to disable the input preprocessing that turns ^D into an EOF indication.
Doing this means you also have to handle a whole lot of other stuff yourself; you may instead want to use a third-party library that handles it for you, e.g. readline (GPL) or editline (BSD). If your program is any sort of nontrivial interactive command interpreter, using one of these libraries is strongly encouraged, as it will provide a much nicer user experience.
Using ungetc() to push back a character can clear the EOF indicator for a stream.
C99 §7.19.7.11 The ungetc function
int ungetc(int c, FILE *stream);
A successful call to the ungetc function clears the end-of-file indicator for the stream. The value of the file position indicator for the stream after reading or discarding all pushed-back characters shall be the same as it was before the characters were pushed back.
In a word, no. You read EOF when the OS has closed stdin.
I am sure there are platform-dependent ways to preserve some info that would let you reconstruct stdin after it was closed (i.e., open a new stream connected to the keyboard and assign it to stdin), but there's definitely no portable way.

Why does getchar() recognize EOF only in the beginning of a line?

This example is from the K&R book
#include<stdio.h>
main()
{
long nc;
nc = 0;
while(getchar() != EOF)
++nc;
printf("%ld\n", nc);
}
Could you explain to me why it works that way? Thanks.
^Z^Z doesn't work either (unless it's in the beginning of a line)
The traditional UNIX interpretation of the tty EOF character is to make a blocking read return with whatever is buffered inside the cooked tty line buffer. At the start of a new line, that means read returns 0 (zero bytes read), and incidentally, a 0-sized read is how the end-of-file condition on ordinary files is detected.
That's why the first EOF in the middle of a line just forces the beginning of the line to be read, without making the C runtime library detect an end of file. Two EOF characters in a row produce a 0-sized read, because the second one forces an empty buffer to be read by the application.
$ cat
foo[press ^D]foo <=== after ^D, input printed back before EOL, despite cooked mode. No EOF detected
foo[press ^D]foo[press ^D] <=== after first ^D, input printed back, and on second ^D, cat detects EOF
$ cat
Some first line<CR> <=== input
Some first line <=== the line is read and printed
[press ^D] <=== at line start, ^D forces 0-sized read to happen, cat detects EOF
I assume that your C runtime library imitates the semantics described above (there is no special handling of ^Z at the level of kernel32 calls, let alone system calls, on Windows). That's why it would probably detect EOF after ^Z^Z even in the middle of an input line.
The program will read EOF only at the actual end of the input. If your terminal/OS/whatever only permits files to end at the start of a line then that's where you'll find them. I believe this is a throwback to old-fashioned terminals where data was only transmitted a line at a time (for all I know it goes back to punched card readers).
Try reading your data from a file that you've prepared with an EOF mid-line. You may even find that some editors make this difficult! Your program should work fine with that as input.
EOF indicates "end of file". A newline (which is what happens when you press enter) isn't the end of a file, it's the end of a line, so a newline doesn't terminate this loop.
Depending on the operating system, EOF character will only work if it's the first character on a line, i.e. the first character after an Enter. Since console input is often line-oriented, the system may also not recognize the EOF character until after you've followed it up with an Enter.
I happened to have the same question as you. When I want to end the getchar() loop, I have to enter two EOFs, or press <ENTER> and then enter an EOF.
And here's an easier explanation I found for this question:
If characters have already been typed on the current line, EOF only ends that round of input: the pending characters are handed to the program, and a new round of input begins. If nothing is pending, in other words getchar() is waiting at the start of fresh input (for example right after Enter or after a previous EOF), then the EOF you type really does mean "end of file", and getchar() returns EOF to the program.
PS: the question happens when you are using getchar(). I think this answer is easier to understand, but maybe not for you since it is translated from Chinese...

how to read anything after EOF has occurred

I was studying the C programming book by K&R. There is this program to count the number of characters in the input:
#include<stdio.h>
main()
{
    long nc;
    nc=0;
    while(getchar()!=EOF)
        ++nc;
    printf("%ld\n",nc);
}
I was wondering how nc can be printed after EOF has occurred. Is there any way to do it?
The end-of-file condition only affects stdin, not stdout. Note that there are no uses of stdin after the EOF is found, just printouts to stdout.
I think you're getting two different things mixed up. EOF is with regard to input. printf is an output function.
getchar() reads from stdin. printf() writes to stdout. They are different streams that usually map to the same physical device (console or terminal).
You should not count on a Ctrl-Z or any terminator.
If you were counting on that and were running on traditional *nix shells, you would suspend your process rather than terminate the input (read up on JOB CONTROL, in man bash, for example).
(I know this answer comes a bit late, but I see you keep mentioning Ctrl-Z in your responses to other answers.)
If you are on a *nix system you can use Ctrl-D, but don't expect that to end up in your input stream (it's just used as a signaling mechanism). You can also test this with file input, which should give you more consistent results than typing, i.e.
a.out < prog.c
to count the characters in your C program

Resources