How to restart stdin after Ctrl+D? - c

When running a program that expects input from a terminal, I can "close" stdin with Ctrl+D. Is there any way to reopen stdin after that?

In Linux and on POSIXy systems in general, the standard input descriptor is not closed when you press Ctrl+D in the terminal; it just causes the pseudoterminal layer to become readable, with read() returning 0. This is how POSIXy systems indicate end of input.
It does not mean the file descriptor (or even the stream handle provided on top of it by the C library) gets closed. As Steve Summit mentioned in a comment, you only need to clear the end-of-input status of the stream using clearerr(), to be able to read further data; this tells the C library that you noticed the status change, but want to try further reading anyway.
A similar situation can occur when a process is writing to a file, and another reads it. When the reader gets to the end of the file, a read() returns 0, which the C library understands as end-of-input; it sets an internal flag, so that unless you call clearerr(), feof() will return true for that stream. Now, if the writer writes more data, and the reader does a clearerr(), the reader can read the newly written additional data.
This is perfectly normal, and expected behaviour.
In summary:
End of input is indicated by a read() operation returning 0, but the file descriptor status does not change, and can be used as normal.
Ctrl+D on a terminal causes only that to happen; the file descriptors open to the terminal are not affected in any other way, and it is up to the foreground process reading the terminal input to decide what it does. It is allowed to simply go on reading more data.
Most programs do exit when that happens, but that is a convention, not a technical requirement at all.
The C library detects read() returning 0, and sets its internal "end of input seen" flag for that stream. That causes feof() to return true, fgets() to return NULL, fgetc() to return EOF, and so on, for that stream.
Calling clearerr() on the stream handle clears the flag, so that the next read attempt will actually try to read further data from the descriptor.
This is described in the very first sentence in the Description section of the man 3 clearerr man page.

Related

What would happen with c code if you try to close stdin or stdout instead of a file?

I am a bit confused about this question:
What would happen with c code if you would try to close stdin or stdout instead of a file?
My guess is that the buffer will be depleted, am I right?
The C standard does not say there is any special treatment for closing stdin or stdout or any special restrictions on closing them.
The documentation for fclose in C 2018 7.21.5.1 2 says:
A successful call to the fclose function causes the stream pointed to by stream to be flushed and the associated file to be closed. Any unwritten buffered data for the stream are delivered to the host environment to be written to the file; any unread buffered data are discarded. Whether or not the call succeeds, the stream is disassociated from the file and any buffer set by the setbuf or setvbuf function is disassociated from the stream (and deallocated if it was automatically allocated).
“Deplete” is not a term used in the C standard. When used about an input buffer, it refers to the program drawing data from the buffer that has been previously filled with input (such as from a user typing a line of text in a terminal) to the point where there is no data left in the buffer. Given the behavior of fclose, the data in the buffer is discarded, not depleted.
Closing the Unix or other operating system file that is used to implement the C stream, as with close instead of fclose, might result in the C buffer remaining, so that further C library calls such as getchar() will draw data from the buffer until it is depleted, after which such routines are likely to report an I/O error.
When a program starts up, the identifiers stdin, stdout, and stderr are guaranteed to hold the addresses of file objects associated with those streams. When a program terminates, an implementation will do whatever is necessary to close the streams, but there is no guarantee that it will use the current values of stdin, stdout, and stderr for that purpose.
An implementation may document that if a program does something like:
void swap_stdout(FILE *file_to_use_instead)
{
    FILE *old_stdout = stdout;
    stdout = file_to_use_instead;
    fclose(old_stdout);
}
then the FILE whose address was stored into stdout will be closed when the program terminates, but absent such a specification it is possible that closing stdout would result in the storage that had been assigned to that FILE being reused to hold something else, and the program trying to interpret that other object as a FILE when the program terminates.
The stdin and stdout (and stderr) objects declared by stdio.h are specified to have type FILE * and to initially refer to the program's standard input, standard output, and standard error streams, respectively. Roughly speaking, you can do anything with them that you can do with any other stream, including close them. After closure, they are no longer valid streams for I/O, including for those I/O functions that target them implicitly (printf, scanf, ...).
Therefore, if you attempt to use one of them for I/O after having closed it, you can expect that your I/O operation will fail. The specific manifestation of that depends in part on the specific function with which you attempt the I/O operation.
Note also, by the way, that the C naming of type FILE reflects the Unix model of the world that all I/O endpoints are modeled as "files". Where you mean a persistent, named chunk of data accessible via a filesystem on a local storage device, you can use the term "regular file". Even then, it is important to distinguish between a C stream, represented by a C object of type FILE, and the underlying data on the storage device.

When does feof(stdin) next to fgets(stdin) return true?

int main(void){
    char cmdline[MAXLINE];
    while(1){
        printf("> ");
        fgets(cmdline, MAXLINE, stdin);
        if(feof(stdin)){
            exit(0);
        }
        eval(cmdline);
    }
}
This is the main part of a myShell program that my professor gave me.
But there is one thing I don't understand in the code.
It says if(feof(stdin)) exit(0);
What is the end of the standard input?
fgets accepts characters until the Enter key is pressed. The end of a typical "file" (e.g. a .txt file) is intuitively understandable, but what does the end of a standard input mean?
In what practical situations does feof(stdin) actually return true?
Even if I enter only a space, or nothing at all, the if condition is not satisfied.
feof tests the stream’s end-of-file indicator and returns true (non-zero) iff the end-of-file indicator is set.
For regular files, attempting to read past the end of the file sets the end-of-file indicator. For terminals, a typical behavior is that when a program attempts to read from the terminal and gets no data, the end-of-file indicator is set. In Unix systems with default settings, a way to trigger this “no data, end-of-file behavior” is to press control-D at the beginning of a line or immediately after a prior control-D.
This works because control-D is used to mean “send pending data to the program immediately.” That is described further in this answer.
Thus, if you want to end input for a program, press control-D (and, if not at the beginning of a line, press it a second time).
For input from terminals, while this does cause an end-of-file indication, it does not actually end the input or close the stream. The program can clear the end-of-file indicator and keep reading. Even for regular files, the program could clear the end-of-file indicator, reset the file context to a different position, and continue reading.
The confusion comes from assuming that stdin is always a terminal. That is not necessarily true.
What stdin is depends on how you run your program.
For example, assuming your executable is named a.out, if you run it like this:
echo "foo" | ./a.out
Stdin is the output of a different process. In this example that process simply outputs the word "foo", so stdin will contain "foo" (followed by a newline) and then EOF.
Another example is:
./a.out < file.txt
In this case, stdin is "file.txt". When the file is read to the end, stdin gets EOF.
Stdin can also be a special device, for example:
./a.out < /dev/random
In this specific case it is infinite.
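Last, when you simply run your program and stdin is a terminal, you can generate EOF too: just press Ctrl+D at the start of a line. No special EOF character is delivered to the program; the terminal driver hands over its pending (empty) input, so the read returns 0 bytes, which the program sees as end of file.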
P.S.
There are other ways to execute a process. Here I only gave examples of processes executed from the command-line shell. But a process can be executed by a different process, not necessarily from the shell. In this case the creator of the process can decide what stdin will be: a terminal, pipe, socket, file or any other object.

What are the particular cases `getchar()` returns error?

So I know getchar() returns EOF when the input has ended or an error has occurred. I also know that I can check which of these cases occurred with ferror(stdin) and feof(stdin).
I want to know in which particular cases an error can occur.
I checked the man pages of both functions, but there is nothing about it there.
getchar() can return EOF for multiple system-specific I/O errors. getchar() is defined to be equivalent to getc(stdin), itself equivalent to fgetc(stdin) except for the fact that it may be implemented as a macro. Here is a list of possible causes on Linux systems, from the Linux man page:
RETURN VALUE
    Upon successful completion, fgetc() shall return the next byte
    from the input stream pointed to by stream. If the end-of-file
    indicator for the stream is set, or if the stream is at end-of-
    file, the end-of-file indicator for the stream shall be set and
    fgetc() shall return EOF. If a read error occurs, the error
    indicator for the stream shall be set, fgetc() shall return EOF,
    and shall set errno to indicate the error.
ERRORS
    The fgetc() function shall fail if data needs to be read and:
    EAGAIN  The O_NONBLOCK flag is set for the file descriptor
            underlying stream and the thread would be delayed in the
            fgetc() operation.
    EBADF   The file descriptor underlying stream is not a valid file
            descriptor open for reading.
    EINTR   The read operation was terminated due to the receipt of a
            signal, and no data was transferred.
    EIO     A physical I/O error has occurred, or the process is in a
            background process group attempting to read from its
            controlling terminal, and either the calling thread is
            blocking SIGTTIN or the process is ignoring SIGTTIN or the
            process group of the process is orphaned. This error may
            also be generated for implementation-defined reasons.
    EOVERFLOW
            The file is a regular file and an attempt was made to read
            at or beyond the offset maximum associated with the
            corresponding stream.
    The fgetc() function may fail if:
    ENOMEM  Insufficient storage space is available.
    ENXIO   A request was made of a nonexistent device, or the request
            was outside the capabilities of the device.
All of stdio.h points at something called the error indicator, which is an internal variable supposedly located in an opaque FILE object that the application programmer has no access to (see C17 7.21.1).
The documentation for getchar is found in C17 7.21.7.6:
The getchar function returns the next character from the input stream pointed to by
stdin. If the stream is at end-of-file, the end-of-file indicator for the stream is set and
getchar returns EOF. If a read error occurs, the error indicator for the stream is set and
getchar returns EOF.
So we don't know if getchar returned EOF because it reached the end of the stream, or because there was a read error. In order to know, we'd have to check the error indicator.
This is where ferror(stdin) comes in. It's a mildly useful function, because it only does this (C17 7.21.10.3):
The ferror function returns nonzero if and only if the error indicator is set for
stream.
And that's all there is to it - this is a standardized, portable abstraction layer and we can't really know what's going on underneath the hood. Which is nice, because most of the time we simply don't care.
There will be an OS-specific API underneath these standard C functions, in case of POSIX likely read(), in case of Windows likely ReadFile() etc. These functions in turn can fail for a number of reasons: incorrect file handles, file is taken by another process, no read access to the file given to the user by the OS and so on.
In theory getchar could just as well be hooked up to a serial bus on an embedded system, in which case the reasons for failure would be entirely different from those on a hosted system. Now we are suddenly talking about wrong baud rates, buffer overruns, framing errors or whatever applies.

Can reading line buffered streams yield multiple lines?

Given the following setup:
child process has been opened for reading (for example via popen())
child's stdout has been set to line buffered (_IOLBF)
parent monitors for data on child's stdout (select(), poll() or epoll())
parent reads from child's stdout once data is available (fgets() or getline())
Seeing how the stream has been set to line buffered, can we safely assume that there will only ever be one single line to read, or are there possible scenarios where there are multiple lines waiting in the buffer, meaning we have to call fgets() or getline() until they hit EOF (or EAGAIN / EWOULDBLOCK for non-blocking streams)?
can we safely assume that there will only ever be one single line to read, or are there possible scenarios where there are multiple lines waiting in the buffer, meaning we have to call fgets() or getline() until they hit EOF (or EAGAIN / EWOULDBLOCK for non-blocking streams)?
I steal the relevant part from this answer (that is in turn linked to C11 standard):
When a stream is line buffered, characters are intended to be transmitted to or from the host environment as a block when a new-line character is encountered.
So, line buffered doesn't mean that all data is buffered in a single line, but instead that the data is sent as soon as there's a new line.
As a consequence, the answer to your question is that there are possible scenarios in which multiple lines are waiting to be read (depending, of course, on what is actually sent from your child process).

File Descriptor 0

Drawing from this thread discussing file descriptors and tables;
I want to know how stdin (that is, file descriptor 0, not C's stdin FILE structure) is handled within shells.
When I run a piece of code like read(0, buffer, 1024) in C, where by default file descriptor 0 is connected to the keyboard, the shell allows me to type text in, because, we assume, read is waiting to read the contents of the character device 'standard input', aka the keyboard. But wouldn't standard input simply be empty and produce that as its result? Alright, so let's say that the 'connected to keyboard' path is the way of explaining it; if that's the case, then that must mean shells line buffer their commands, right? Calling a read on file descriptor 0 would mean that file descriptor 0 in a shell is connected to this line-buffered buffer output of standard input, and not directly to the keyboard, so what's making C wait around? Furthermore, why can we not use lseek() on standard input - does said 'file' always get overwritten every 'write' that's made to it and therefore there is nothing to seek around in as standard input (being the keyboard) is not really a file on a storage device per se?
read(0, buffer, 1024)
is a system call, a call into kernel code. The kernel's implementation of read will dispatch to the terminal (or pseudo-terminal) device driver, which, in the default (canonical) mode, blocks until a complete line is available, that is, until you have typed a newline or the EOF character, Ctrl+D. The call then returns at most 1024 bytes of that line.
then that must mean shells line buffer their commands, right?
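The buffering is performed in the terminal driver, if the terminal is in canonical mode. Otherwise (in non-canonical mode) the driver does no line assembly, and read returns according to the terminal's VMIN and VTIME settings, typically as soon as at least one byte is available rather than only after 1024 bytes have been entered.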
Furthermore, why can we not use lseek() on standard input
You can if stdin is a regular file. You just can't seek on a terminal, because that would require the terminal driver to remember all data that passed through the terminal device since it was created.

Resources