How does scanf determine whether to block? - c

When I redirect a file to stdin using the MyProgram < cl.txt command from the command line, scanf doesn't wait for me to press Enter.
But when I use scanf in my program without doing so, it does block until the Enter key is pressed.
How exactly does it determine that? Does it keep reading the stream until \n is encountered, or does it really wait for me to press a key?
When I don't write anything and press Enter it doesn't stop blocking either and keeps asking. I'm really confused.

Does it keep reading the stream until '\n' is encountered?
Normally stdin is in line buffering mode (_IOLBF, see setvbuf). Whenever the buffer is empty, stdin waits for a whole new line to be entered, i.e. waits until you press Enter and \n is inserted into the buffer:
On Input, the buffer is filled up to the next newline character when an input operation is requested and the buffer is empty.
Note: the console (terminal) most often implements its own buffering and does not send any data to the stream until you press Enter. This lets you edit the data (for example with the Delete and Backspace keys) before it is sent to the application. Therefore, even with no buffering on the stdin side (as when you call setvbuf(stdin, NULL, _IONBF, 0)), scanf may still wait until Enter is pressed.

scanf is just reading from its input stream. If the input stream is a pipe, and the other end of that pipe is associated with a tty (which is usually the case if you are interactively entering data by pressing keys on a keyboard), scanf will return as soon as it reads data that completes its format string (or fails to match it). The tty, however, if it is in cooked mode (which is the default, and unless you make some effort to put the tty into raw mode, you should assume it is cooking what you type), will not write any data into the pipe until you hit return.
In other words, it's not your scanf that is blocking. (Well, it is blocking, but it's not the source of the experienced delay.) Rather, the tty driver is waiting for you to hit return before it passes any data to your program.
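As a rough illustration of the cooked-mode point above, here is a minimal sketch, assuming a POSIX system with termios. The helper names disable_canonical and restore_terminal are made up for this example. It turns off canonical mode so that the tty driver passes bytes on without waiting for Return, and reports failure when the descriptor is not a terminal at all (as with MyProgram < cl.txt).

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: turn off canonical ("cooked") mode on fd so that
 * read() can see each byte as it is typed. The previous settings are
 * stored in *saved. Returns 0 on success, -1 if fd is not a terminal. */
int disable_canonical(int fd, struct termios *saved)
{
    struct termios raw;

    assert(saved != NULL);
    if (tcgetattr(fd, saved) == -1)
        return -1;                     /* fd is a file or pipe, not a tty */

    raw = *saved;
    raw.c_lflag &= ~(tcflag_t)ICANON;  /* stop holding input until Return */
    raw.c_cc[VMIN] = 1;                /* read() returns once one byte is there */
    raw.c_cc[VTIME] = 0;               /* no timeout */
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Hypothetical helper: put back the settings saved by disable_canonical(). */
int restore_terminal(int fd, const struct termios *saved)
{
    assert(saved != NULL);
    return tcsetattr(fd, TCSANOW, saved);
}
```

On a real terminal you would call disable_canonical(STDIN_FILENO, &saved) before reading and restore_terminal(STDIN_FILENO, &saved) before exiting. Echo is left on in this sketch, and stdio may still buffer on its own side.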

When you call scanf it immediately waits for input. In your first example, input is provided in the form of "cl.txt". In your second example, no input is provided until you press a key. Synchronous I/O blocks its executing thread until it receives input.

Related

When does feof(stdin) next to fgets(stdin) return true?

int main(void){
    char cmdline[MAXLINE];
    while(1){
        printf("> ");
        fgets(cmdline, MAXLINE, stdin);
        if(feof(stdin)){
            exit(0);
        }
        eval(cmdline);
    }
}
This is the main part of the myShell program that my professor gave me.
But there is one thing I don't understand in the code.
It says if(feof(stdin)) exit(0);
What is the end of the standard input?
fgets accepts all characters until the Enter key is pressed. The end of a typical "file" (e.g. a .txt file) is intuitively understandable, but what does the end of standard input mean?
In what practical situations does feof(stdin) actually return true?
Even if you enter a space without entering anything else, the if condition is not satisfied.
feof tests the stream’s end-of-file indicator and returns true (non-zero) iff the end-of-file indicator is set.
For regular files, attempting to read past the end of the file sets the end-of-file indicator. For terminals, a typical behavior is that when a program attempts to read from the terminal and gets no data, the end-of-file indicator is set. In Unix systems with default settings, a way to trigger this “no data, end-of-file behavior” is to press control-D at the beginning of a line or immediately after a prior control-D.
This works because control-D is used to mean “send pending data to the program immediately.” When there is no pending data, the program's read receives zero bytes, which it treats as end-of-file. That is described further in this answer.
Thus, if you want to end input for a program, press control-D (and, if not at the beginning of a line, press it a second time).
For input from terminals, while this does cause an end-of-file indication, it does not actually end the input or close the stream. The program can clear the end-of-file indicator and keep reading. Even for regular files, the program could clear the end-of-file indicator, reset the file context to a different position, and continue reading.
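To make the indicator behaviour concrete, here is a small sketch in standard C. The helper name count_lines is hypothetical. It loops on the return value of fgets, and only afterwards uses ferror and feof to see why reading stopped. This is an illustration of the idea, not the professor's shell.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read from `in` until fgets() fails.
 * Returns the number of successful fgets() calls (a line longer than
 * the buffer is counted more than once), or -1 on a read error. */
long count_lines(FILE *in)
{
    char line[256];
    long n = 0;

    assert(in != NULL);
    while (fgets(line, sizeof line, in) != NULL)
        n++;

    if (ferror(in))
        return -1;   /* stopped because of an error, not end-of-file */
    return n;        /* otherwise the end-of-file indicator is now set */
}
```

After count_lines(stdin) returns on a terminal, calling clearerr(stdin) resets the indicator, and further reads may succeed again, as described above.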
The confusion is to assume stdin = terminal. It is not necessarily true.
What stdin is depends on how you run your program.
For example, assuming your executable is named a.out, if you run it like this:
echo "foo" | ./a.out
Stdin is an output of a different process, in this example this process simply outputs the word "foo", so stdin will contain "foo" and then EOF.
Another example is:
./a.out < file.txt
In this case, stdin is "file.txt". When the file is read to the end, stdin gets EOF.
Stdin can also be a special device, for example:
./a.out < /dev/random
In this specific case it is infinite.
Last, when you simply run your program and stdin is a terminal, you can generate EOF too: just press CTRL-D at the start of a line, and the terminal reports end-of-file to the program.
P.S.
There are other ways to execute a process. Here I only gave examples of processes executed from the command line shell. But process can be executed by a different process, not necessarily from the shell. In this case the creator of the process can decide what stdin will be - terminal, pipe, socket, file or any other object.
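If you want to see which of these cases applies at run time, something like the following sketch may help, assuming a POSIX system. The enum and the function classify_fd are invented for this example; isatty and fstat are the real calls doing the work.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical classification of what a descriptor is connected to. */
enum input_kind {
    INPUT_TERMINAL,
    INPUT_FILE,
    INPUT_PIPE,
    INPUT_OTHER,     /* socket, device such as /dev/random, ... */
    INPUT_UNKNOWN    /* fstat() failed */
};

enum input_kind classify_fd(int fd)
{
    struct stat st;

    assert(fd >= 0);
    if (isatty(fd))
        return INPUT_TERMINAL;   /* ./a.out run interactively */
    if (fstat(fd, &st) == -1)
        return INPUT_UNKNOWN;
    if (S_ISREG(st.st_mode))
        return INPUT_FILE;       /* ./a.out < file.txt */
    if (S_ISFIFO(st.st_mode))
        return INPUT_PIPE;       /* echo "foo" | ./a.out */
    return INPUT_OTHER;
}
```

Calling classify_fd(STDIN_FILENO) at the start of main would tell the program how it was launched.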

How getchar() works when it is used as condition in while loop

I can't understand how the following code really works.
int main() {
    char ch;
    while((ch=getchar())!='\n')
    {
        printf("test\n");
    }
    return 0;
}
Let's say we give "aaa" as input. Then we get the word "test" as output on 3 separate lines.
Now my question is: for the first letter that we type, 'a', does the program go inside the while loop and remember that it has to print something when the '\n' character is entered? Does it store the characters somewhere and then traverse them and execute the body of the while loop? I'm lost.
There are many layers between the user writing input into a terminal, and your program receiving that input.
Typically the terminal itself has a buffer, which is flushed and sent to the operating system when the user presses the Enter key (together with a newline from the Enter key itself).
The operating system will have some internal buffers where the input is stored until the application reads it.
Then in your program the getchar function itself reads from stdin which is usually also buffered, and the characters returned by getchar are taken one by one from that stdin buffer.
And as mentioned in a comment to your question, note that getchar returns an int, which is really important if you ever want to compare what it returns against EOF (which is an int constant).
And you really should compare against EOF, otherwise you won't detect if there's an error or the user presses the "end-of-file" key sequence (Ctrl-D on POSIX systems like Linux or macOS, or Ctrl-Z on Windows).
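A version of the loop that follows this advice could look like the sketch below. It is written against a FILE * (the function name count_until_newline is made up) so that it can be tried on a file as well as on stdin. The result of getc is kept in an int and checked against EOF before anything else.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the characters before the first newline.
 * Also stops at end-of-file or on a read error, so it cannot spin
 * forever when the input ends without a '\n'. */
long count_until_newline(FILE *in)
{
    int ch;      /* int, not char, so EOF stays distinguishable */
    long n = 0;

    assert(in != NULL);
    while ((ch = getc(in)) != EOF && ch != '\n')
        n++;     /* the original program prints "test" here */
    return n;
}
```

With stdin and the input "aaa" followed by Enter, this would return 3. The characters are taken one by one from the stdin buffer once the whole line has arrived.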
What you see is due to I/O line buffering.
The getchar() function doesn't receive any input until you press Enter, which adds the \n that completes the line.
Only at that point does the OS start feeding characters to getchar(), and the loop prints the test message for each character other than \n.
That is why all the output appears at once after you press Enter.
You can change the stream's buffering mode with the function setvbuf(). Setting the mode to _IONBF makes the stream itself unbuffered, but on a terminal the characters typically still arrive only after Enter is pressed, because the terminal does its own line buffering; getting each character as it is typed also requires changing the terminal mode.

Using EOF in the middle of an input line?

For my program, I have a prompt to stdout
>
and then my program reads from stdin. The prompt loops if EOF has not been reached. I have noticed if I enter something, such as:
> bee
When I press CTRL-D once, nothing happens. When I press CTRL-D again, my prompt comes up again. And only when I press it a third time, does my program terminate due to EOF. Does this mean there is a problem in my code? Or is this normal behavior?
Here's a simplified version of my code:
(fopen used)
(print prompt)
while((fgets(tester, 1026, input)) != NULL) {
if(there is a # in tester) {
(print prompt)
continue;
}
}
In a Unix terminal, CTRL-D does nothing more or less than immediately send all bytes pending in the terminal's input buffer.
Background:
Normally, when you enter stuff into your terminal, that stuff is line buffered, so you can keep editing a line until you are satisfied with it, and then send it to the running process by entering a newline (or CTRL-D, the difference is only that CTRL-D does not add a newline character at the end).
Now, processes detect the end of an input stream by checking whether the read() call returned anything. So, if you press CTRL-D on an empty input buffer, the read() call returns with nothing, and the process thinks "no more bytes coming out of this stream, I'd better not try again". Afaik, there is no other way to check for the end of an input stream, so all programs that recognize EOF on stdin do this, either directly or via the standard C library. The latter is what you did when you called fgets().
Your case:
The first CTRL-D simply sends the three characters "bee" to your process. The read() call within your fgets() call returns these three characters, and your fgets() implementation checks for a newline character. As it finds none, and as its own output buffer is not full yet, it immediately proceeds to fetch more characters with another read() call.
The second CTRL-D sends nothing, as you have not entered any other characters since your last CTRL-D. The read() call returns zero bytes, fgets() sees that it received zero characters and calls it an EOF condition. So it returns the string buffered so far, "bee", to you.
Your program may check whether that string contains a # character. But its loop cannot terminate until a fgets() call returns NULL (there is no break statement to leave the loop early).
The third CTRL-D again sends zero bytes to your process. This causes the first read() call of the second fgets() call to return zero bytes (the loop is about to be reentered after a successful first iteration). The fgets() implementation sees the empty result, and since it has not yet received any bytes, it returns NULL. Your loop condition sees the NULL and terminates the loop, which in turn causes your main() to return, exiting the process.
TL;DR:
Yes, this is totally expected behavior, even though it seems rather counter-intuitive. That's UNIX: It's KISS, not necessarily intuitive.
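The "read() returned nothing" rule described above can be sketched directly at the system-call level. This assumes a POSIX system, and the helper name drain_fd is invented for the example. A return value of 0 from read() is the only end-of-input signal the loop gets.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until read() returns 0.
 * Returns the total number of bytes seen, or -1 on a real error. */
long drain_fd(int fd)
{
    char buf[64];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;      /* e.g. the 3 bytes of "bee" */
            continue;        /* no EOF yet, so ask again */
        }
        if (n == 0)
            return total;    /* zero bytes: treated as end of input */
        if (errno == EINTR)
            continue;        /* interrupted by a signal, retry */
        return -1;
    }
}
```

On a terminal, typing bee followed by two CTRL-Ds would make this return 3: the first read() delivers the three bytes, the second delivers zero.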

End of File in stdin

A question about this has been asked here
End of File (EOF) in C
but it still doesn't completely solve my problem.
EOF makes sense to me in any datastream which is not stdin, for example if I have some data.txt file, fgetc() will read all the chars and come to the end of file and return -1.
What I don't understand is the concept of EOF in stdin. If I use getchar(), it will wait for me to enter something, so if there is NOTHING written, End of File, (EOF) is not returned automatically?
So is it that only the user can invoke EOF in stdin by pressing Ctrl+Z?
If so then what are some of the uses of EOF in stdin? I guess it tells the program to continue reading until the user invokes end of file? is this it?
Thank you
so if there is NOTHING written, End of File, (EOF) is not returned automatically?
No, it's not. It should be sent by the user.
So is it that only the user can invoke EOF in stdin by pressing Ctrl+Z?
Yes, you can set the EOF indicator for stdin with a special key combination entered in the console; for a Linux console that is Ctrl+D and for Windows it's Ctrl+Z.
If so then what are some of the uses of EOF in stdin? I guess it tells the program to continue reading until the user invokes end of file? is this it?
How it is used depends on whether you instruct the user to input EOF explicitly or not. For example, I think the Python console will tell you something like Press Ctrl+D or type quit() to exit.
EOF is not necessarily -1; it's a macro, and you should always use it to test for the EOF indicator. More importantly, EOF is not a character; it's a special value that indicates that the End Of File indicator is set.
Also, getchar() is equivalent to fgetc(stdin).
In Linux bash, if you press CTRL+D, it will generate EOF.
In Windows, the equivalent is CTRL+Z.
So, no, if nothing is written to the terminal, that does not generate EOF automatically. The scanning function is in a wait state then. So, without any other input, in the wait state, if CTRL+D is pressed, the key press is translated [by the terminal driver] to EOF.
Usually, once you key in some value and press the ENTER key, the scanning function starts scanning. To feed an input that produces EOF, you need to press CTRL+D.
Related: Please read the wiki entry for EOF
Note: With thanks to Mr Drew for the clarification.
stdin is a stream, data is not available until the user presses some keys. A file on the disk already has (a fixed amount of) content.
When reading from stdin, if getchar() doesn't wait for the user to input something then the program will always get EOF. That will make it impossible to use stdin as an input file.
Because getchar() waits for the user to input something, it has no way of its own to tell that the input is complete; that's why operating systems provide a key combination that has this special meaning when pressed on the console.
Windows uses Ctrl+Z and Unix-like OSes (including OS X) use Ctrl+D for this purpose.
The file stdin is not always the user typing on the keyboard. If you redirect input to your program, it can be just a normal file.
program.exe <input-from-file.txt
What may be confusing you is that not giving input in a console window does not mark the end of the input. But think of it the other way round: if the console did not wait for the user, no user could respond quickly enough to type anything before the program saw the end of input and terminated. By pressing Enter, the user says "this is a line of input". In other words: a program running in a console window always waits for the next input to come.
Most programs define a special phrase to end a console session. You probably know exit.

How to clear the contents of scanf of a stopped process?

I am using fork, and the child process reads data from the user ten times using a scanf inside a for loop. The parent process, however, sends the SIGSTOP signal to the child after 4 seconds of sleep, then reads a value from the user and prints it. But if the user has entered data without pressing Enter for the scanf in the child process, the parent process reads and prints the data that was typed for the child process. How do I stop this from happening?
ch=fork();
if(ch==0)
{
    for(i=0;i<10;i++)
    {
        fflush(stdin);
        scanf("%s",buf);
        printf("%d: %s\n",i,buf);
    }
}
else
{
    char buf2[100],cha;
    sleep(4);
    kill(ch,SIGSTOP);
    write(STDOUT_FILENO,"\nchld stopped\n",14);
    memset(stdin,0,sizeof(stdin));
    read(STDIN_FILENO,buf2,2);
    write(STDOUT_FILENO,buf2,2);
    kill(ch,SIGCONT);
    wait(SIGCHLD);
}
So the output for example comes like this:
a
0: a
b
1: b
ac (I dont press enter here and wait for SIGSTOP)
chld stopped
tr (Entered data for parent and pressed enter)
ac2: tr
c
3: c
... and so on
So after entering tr why does my parent display ac?
fflush(stdin) is never the right thing to do, although lots of people have tried it. fflush is only for output. Your memset is pure insanity. The argument to wait isn't supposed to be a signal number - it's not even the right type, so you should have got a warning which you apparently ignored.
Those are the easy errors. There is a deeper conceptual problem with what you're trying to do.
Since you didn't modify any tty settings, the tty is in canonical mode while your program is running. That means the tty handles line editing (backspace, Ctrl-U, Ctrl-W, etc.) and doesn't send anything to the program until the line is terminated.
When you have typed a partial line, that partial line is not in the stdin buffer. It's in the tty buffer. It doesn't belong to any process yet. That's why your parent process can read a line that was partially typed while the child was attempting to read. The child never got any of it.
To empty the tty buffer, this should work: turn off canonical mode; set non-blocking mode; read into a buffer until an error occurs (EAGAIN/EWOULDBLOCK will happen in non-blocking mode when the tty has nothing left to give you); turn blocking and canonical mode back on. The idea is to consume whatever is currently available without waiting for more. Code to perform the individual steps should be easy to find.
In general, I question the wisdom of an interface that offers the user an opportunity to enter information, then spontaneously interrupts the reading of that information to read something else. It's going to cause users to shout at the secondary input prompt: HEY! I'M TYPING HERE!
Okay I found the solution.
I used tcflush(): flush non-transmitted output data, non-read input data, or both.
tcflush(STDIN_FILENO,TCIFLUSH); line just before the read in the parent did it.
I found the solution here...
How can I flush unread data from a tty input queue on a UNIX system?
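For reference, the tcflush approach could be wrapped up roughly as follows, assuming a POSIX system. The function name read_fresh is hypothetical. It discards whatever has been typed but not yet read from the terminal, then performs the read; when the descriptor is not a terminal there is no tty input queue, so it just reads.

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: drop unread terminal input on fd, then read up
 * to len bytes into buf. Returns what read() returns, or -1 if the
 * flush fails. */
ssize_t read_fresh(int fd, char *buf, size_t len)
{
    assert(buf != NULL);
    /* Only a terminal has a typed-but-unread input queue to discard. */
    if (isatty(fd) && tcflush(fd, TCIFLUSH) == -1)
        return -1;
    return read(fd, buf, len);
}
```

In the parent branch of the program above, read_fresh(STDIN_FILENO, buf2, 2) would stand in for the plain read call. Note that this only affects data still held by the terminal, not anything already copied into a process's stdio buffer.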
It is because of the read() and write() calls. When you type something without pressing Enter, it stays pending on standard input. Whatever is pending on standard input is read by read() and stored in buf2, as in your example, and whatever is in buf2 is then sent to standard output by write(). You can see the difference if you comment out the read() and write() calls, or just print buf2.

Resources