I wrote a simple program to test the scanf() function in C. It basically reads from the terminal, char by char, reprinting the return value and the char read; and to terminate if EOF/EOT is met or if a \n newline is read.
#include <stdio.h>
#include <stdbool.h>
int main(void) {
char c; int ret;
printf("Enter the chars to test: ");
//LOOP (scan & print) only when return is not EOF and char is not newline
while ( ((ret = scanf("%c", &c)) != EOF) && c!='\n' ) {
printf("%i %c\n", ret, c);
}
return 0;
}
It terminates correctly, if newline (Enter) is pressed. But the it won't just terminate with a single Ctrl-D. A single Ctrl-D will flush the typed 'chars' and printing them. Then after that it will wait again for input, although an EOF has been sent with the Ctrl-D. If we press Ctrl-D again the 2nd time directly after the 1st (2x) or just Enter it will terminate. So you will need two consecutive Ctrl-D to terminate the program (or the loop in this case).
Example:
If you input 987 on the terminal, then press Enter; then 1 9, 1 8, 1 7 will be printed on newline each.
If you input 987 on the terminal, then press Ctrl-D; then 1 9 will be printed on the same line (because there is no Enter typed after inputing the 987 input), 1 8, 1 7 will be printed on newline. Then it will still wait for more inputs, unless it is terminated by directly entering a 2nd consecutive Ctrl-D or with a newline (Enter). So it (the program) will only stop (exit the loop) after a newline or 2nd consecutive Ctrl-D.
I am confused. Shouldn't a single Ctrl-D sent stop the loop here? What should I do to stop the program (scanf loop) after receiving just a single Ctrl-D?
I tested the code on Lubuntu 19.10 with gcc 9.2.1.
The issue is scanf() does not return EOF until there is no more input waiting and EOF is encountered. (your "%c" conversion specifier will otherwise accept any character, including the '\n' character, as valid and consider it a successful conversion)
When you type a line of characters, e.g. "abcdefg" and attempt to press Ctrl+d (indicating end-of-input), the input ("abcdefg") is processed and when 'g' is reached, then scanf() blocks waiting on your next input. (because a successful conversion took place and no matching or input failure occurred)
Once Ctrl+d is typed a 2nd time (indicating end-of-input), when there is no input to process, EOF is reached before the first successful conversion and an input-failure occurs,scanf() then returns EOF.
See: man 3 scanf -- RETURN VALUE section and C11 Standard - 7.21.6.2 The fscanf function(p16)
Test the returned value for one, not EOF. When eof is reached first time, zero is returned. The next scanf after zero returns EOF.
while ( ((ret = scanf("%c", &c)) == 1) && c!='\n' ) {
printf("%i %c\n", ret, c);
}
This behavior is specified for the standard C library.
On success, the function returns the number of items of the argument list successfully filled. This count can match the expected number of items or be less (even zero) due to a matching failure, a reading error, or the reach of the end-of-file.
If a reading error happens or the end-of-file is reached while reading, the proper indicator is set (feof or ferror).
You can add if (feof(stdin) != 0) break; into the loop body, if you use more complex input format.
If you read chars only, is it not better use fgetc?
Related
I have been trying to understand how EOF works. In my code (on Windows) invoking EOF (Ctrl+Z and Enter) doesn't work the first time and I have to provide two EOF for it to actually stop reading input. Also, the first EOF gets read as some garbage character which gets displayed when I print the input. (We can see the garbage characters being display at the end in the output provided).
This is my code:-
#include<stdio.h>
#define Max 1000
int main()
{
char c, text[Max];
int i = 0;
while((c = getchar()) != EOF)
{
text[i] = c;
i++;
}
printf("\nEntered Text: \n");
puts(text);
return 0;
}
My Output:
I have this doubt:-
Why are two EOFs being required? and how do I prevent the first one from being read (as some garbage) and stored as part of my input?
Control-Z is only recognized as EOF when at the start of a new line. Therefore, if you want to detect it in the middle of a line, you'll need to do so yourself.
So change this line:
while((c = getchar()) != EOF)
to this:
while((c = getchar()) != EOF && c != CTRL_Z)
and then add:
#define CTRL_Z ('Z' & 0x1f)
at the top of your program.
You may still need to type a return after the Ctrl-z to get the buffered input to be read by the program, but it should discard everything after the ^Z.
The following solution fixes the Ctrl+Z problem and the garbage output and also blocks a buffer overrun. I have commented the changes:
#include <stdio.h>
#define Max 1000
#define CTRL_Z 26 // Ctrl+Z is ASCII/ANSI 26
int main()
{
int c ; // getchar() returns int
char text[Max + 1] ; // +1 to acommodate terminating nul
int i = 0;
while( i < Max && // Bounds check
(c = getchar()) != EOF &&
c != CTRL_Z ) // Check for ^Z when not start of input buffer
{
text[i] = c;
i++;
}
text[i] = 0 ; // Terminate string after last added character
printf( "\nEntered Text:\n" );
puts( text );
return 0;
}
The reason for this behavior is somewhat arcane, but end-of-file is not the same as Ctrl-Z. The console generates an end-of-file causing getchar() to return EOF (-1) if and only if the console input buffer is empty, otherwise it inserts the ASCII SUB (26) character into the stream. The use of SUB was originally to do with MS-DOS compatibility with the even earlier CP/M operating system. In particular CP/M files were composed of fixed length records, so a ^Z in the middle of a record, was used to indicate the end of valid data for files that were not an exact multiple of the record length. In the console, the SUB is readable rather than generating an EOF if it is not at the start of the input buffer and all characters after the SUB are discarded. It is all a messy hangover from way-back.
Try changing the type of c to int as EOF can be a negative number and commonly it is defined as -1. char might or might not be able to store -1. Also, do not forget to end the string with \0 before passing it to puts.
The logic that Windows terminals follow with regard to ^Z in keyboard input (at least in their default configuration) is as follows:
The Ctrl-Z combination itself does not cause the input line buffer to get pushed to the waiting application. This key combination simply generates ^Z character in the input buffer. You have to press Enter to finish that line buffer and send it to the application.
You can actually keep entering additional characters after ^Z and before pressing Enter.
If the input line does not begin with ^Z, but contains ^Z inside, then the application will receive that line up to and including the first ^Z character (read as \x1A character). The rest of the input is discarded.
E.g. if you type in
Hello^Z World^Z123
and press Enter your C program will actually read Hello\x1A sequence. EOF condition will not arise.
If the input line begins with ^Z, the whole line is discarded and EOF condition is set.
E.g. if you input
^ZHello World
and press Enter your program will read nothing and immediately detect EOF.
This is the behavior you observe in your experiments. Just keep in mind that the result of getchar() should be received into an int variable, not a char variable.
I am following the exercises in the C language book. I am in the first chapter where he introduces loops. In this code:
#include <stdio.h>
/* copy input to output; 1st version */
int main() {
int c, n1;
n1 = 0;
while ((c = getchar()) != EOF) {
if (c == '\n') {
++n1;
}
printf("%d\n", n1);
}
}
In here I am counting the number of lines. When I just hit enter without entering anything else I get the right number of lines but when I enter a character then hit enter key the loop runs twice without asking for an input the second time. I get two outputs.
this how the output looks like:
// I only hit enter
1
// I only hit enter
2
// I only hit enter
3
g // I put char 'g' then hit enter
3
4
3 and 4 print at the same time. why is 4 printing after the loop has been iterated already? I thought the loop would restart and ask me for input before printing 4.
The getchar function reads one character at a time. The number of lines will be printed for every character in the input read by getchar, whether that character is newline or not, but the counter will only be incremented when there is a newline character in the input.
When you enter g then the actual input that goes to the standard input is g\n, and getchar will read this input in two iterations and that's the reason it is printing number of lines twice.
If you put the print statement inside the if block then it will print only for newline characters. If you put the print statement outside the loop, then it will only print the count of the number of lines at the end of the input.
To be clear this is the terminal that you are dealing with.
By default, the terminal will not get input from the user \n is entered. Then the whole line is placed in the stdin.
Now as I said earlier here the program is not affected by the buffering of stdin. And then the characters will be taken as input and it is processed as you expect it to be. The only hitch was the terminals buffering - line buffering.
And here from standard you will see how getchar behaves:-
The getchar function returns the next character from the input stream pointed to by stdin. If the stream is at end-of-file, the end-of-file indicator for the stream is set and getchar returns EOF. If a read error occurs, the error indicator for the stream is set and getchar returns EOF.
Now what are those characters - those charaacters include \n - the \n is what you put in the terminal and then to stdin via pressing the ENTER. Here earlier you were entering the characters earlier which were \n. This time you entered two characters. That's why the behavior you saw.
I am new to c programming, so hope you guys can help me out with such questions.
1. I thought putchar() only print 1 char each time, while when I enter several char like 'hello' it print 'hello' before allow me to enter a next input? I thought that it should print only 'h' and then allow me to enter other input because getchar() only return one character each time.
2. how to make the loop stops? I know EOF has value of -1, but when I enter -1, the loop still runs.
#include <stdio.h>
main()
{
int c = getchar();
while(c != EOF){
putchar(c);
c = getchar();
}
}
After the first getchar() has completed reading one character, the next getchar(); is inside the while() loop, so as per the logic, it will keep reading the input one-by-one, until in encounters EOF.
Following the same logic, putchar(c); is under the while loop, so it will print all the characters [one character per loop basis] read by getchar() and stored in c.
In linux, EOF is produced by pressing CTRL+D. When waiting for input, if you press this key combination, the terminal driver will transform this to EOF and while loop will break.
I'm not very sure about windows, but the key combination should be CTRL+Z.
Note: even if it seems entering -1 should work in accordance with EOF, actually it won't. getchar() cannot read -1 all at a time. It will be read as - and 1, in two consecutive iterations. Also worthy to mention, a character 1 is not equal to an integer 1. A character 1, once read, will be encoded accordingly [mostly ASCII] and the corresponding value will be stored.
getchar() gets the input from the console. In a while loop, it will read all the characters from the input including the return key.
-1 is "-1". It's not a value but just another combination of characters. EOF occurs when there is no more char in the buffer. i.e. when you press Enter (or Ctrl-Z or Ctrl-D depending on your OS)
I'm starting to learn about EOF and I've written the following simple program :
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void main()
{
int i=0;
while(getchar()!=EOF)
{
i++;
}
printf("number of characters : %d \n",i);
}
The thing is, when I write a string, press enter and then press Ctrl+Z the output is the number of the characters I wrote plus 1 (for the EOF). However, if I write a string and, without changing line, press Ctrl+Z the while loop does not terminate. Why is that?
First things first, EOF is signalled only when Ctrl + Z is at the very beginning of a line. With that on mind:
On your first try with your input (10 characters) and then Enter, you actually push your input\n to the input stream, which gets read character by character through getchar, and there are 11 characters now with the addition of that new line at the end thanks to your Enter.
On the new line, you then use the Ctrl + Z combination to signal the EOF, and you indeed do that properly there; signal the EOF and get 11 as the result.
It's strange that you were expecting to see 10 here. What if you were to have an input of multiple lines? Would you like it to not count for new lines? Then you could use something like:
int onechar;
while ((onechar = getchar( )) != EOF)
{
if (onechar != '\n')
i++;
}
Or even more further, are you always expecting a single line of input? Then you might want to consider changing your loop condition into following:
while(getchar( ) != '\n')
{
i++;
}
Oooor, would you like it to be capable of getting multi-line input, as well as it to count the \n characters, and on top of all that, just want it to be able to stop at Ctrl + Z combinations that are not necessarily at the beginning of a line? Well then, here have this:
// 26 is the ASCII value of the Substitute character
for (int onechar = getchar( ); onechar != EOF && onechar != 26; onechar = getchar( ))
{
i++;
}
26, as commented, is the Substitute character, which, at least on my machine, is what the programme gets when I use Ctrl + Z inappropriately. Using this, if you were to input:
// loop terminated by the result of (onechar != 26) comparison
your input^Z
You would get 10 as the result and if you were to input:
// loop terminated by the result of (onechar != EOF) comparison
your input
^Z
You would get 11, counting that new-line which you did input along with all the other 10 characters before that. Here, ^Z has been used to display the Ctrl + Z key combination as an input.
Input uses buffers. The first getchar requests a system-level read. When you press enter or ctrl-z the read returns the buffer to the program. When you press enter the system also adds a newline character to the buffer before returning it. Eof is not an actual character but results from reading an empty buffer.
After the control is returned to the program, getchar sequentially reads each character in the returned buffer and when it's finished it requests another read.
In the first case, getchar reads the buffer including the newline character. Then since the buffer is empty getchar requests another read which is interrupt by pressing ctrl-z, returning an empty buffer and resulting in EOF.
In the second case, pressing ctrl-z simply returns the buffer and after getchar is finished reading it, it requests another read which isn't finished since you never press ctrl-z or enter again.
It's not your while loop that never finishes but merely the read call. Try to press ctrl-z twice in the second case.
I saw here topic related to this. What you need to do is to set pts/tty into non-cannonical mode and do it with somekind of TCSANOW(do attr changes immediately).
You do it using functions from termios.h , operating on struct termios;
I have a litte problem with EOF. I need to write numbers in one line with spaces between them and sum them together. Scanf must be ended with one EOF (Ctrl+D)
I have this little program
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv){
double numbers=0, sum=0;
printf("Enter numbers: ");
while(scanf("%lf", &numbers) != EOF){
sum=sum+numbers;
}
printf("\n %.2lf", sum);
}
Problem with this program is that I need to press Ctrl+d two times until it prints sum.
Example input/output:
Enter numbers: 1 3 5 6 <'Ctrl+d'>
15.00
EOF must be preceded with a newline else it won't work. However it depends on the OS. The EOF you enter at the end of of the line containing the input is not recognized. From the man page of scanf -
scanf returns the number of items successfully matched and assigned
which can be fewer than provided for, or even zero in the event of an
early matching failure. The value EOF is returned if the end of input
is reached before either the first successful conversion or a matching
failure occurs.
Therefore you should check the return value of scanf for 1, not EOF.
#include <stdio.h>
int main(void) {
double numbers = 0, sum = 0;
printf("Enter numbers: ");
while(scanf("%lf", &numbers) == 1) {
sum += numbers;
}
printf("\n %.2lf", sum);
}
scanf returns the number of values it assigned; so if you want to use that, you want
while (scanf("%lf", &numbers) == 1)
Alternatively, you could use more of a C++ idiom (assuming you meant to tag the question with that language as well as C):
while (std::cin >> numbers)
The program is showing this behavior because, when you give input
4 4 4 scanf() will not start reading input. scanf() starts reading input when you press [enter]. so if you give input as
4
4
4
then the result will come on first
otherwise you will need 2 's since first one starts reading input and it wont be read as EOF as it just makes scanf() start reading input. therefore one more will be read by scanf() as EOF.
The stdin EOF flag won't be turned on until the second ctrl+d in succession, the first ctrl+d pushes whatever is on your screen to the stdin buffer, the second ctrl+d sets the EOF flag.
In another case, when you enter some numbers and press enter, the numbers on your screen is pushed to the stdin input buffer, and ctrl+d soon after will also set the EOF flag.
On your system this is normal behaviour, the EOF only occurs on the second ctrl+d.