Drawing from this thread discussing file descriptors and tables;
I want to know how stdin (that is, file descriptor 0, not C's stdin FILE structure) is handled within shells.
When I run something like read(0, buffer, 1024) in C, where file descriptor 0 is by default connected to the keyboard, the shell lets me type text in, presumably because read is waiting to read the contents of the character device 'standard input', aka the keyboard. But wouldn't standard input simply be empty and produce that as its result? Alright, so let's say the 'connected to keyboard' explanation is right; if so, that must mean shells line buffer their commands, right? Calling read on file descriptor 0 would then mean that descriptor 0 in a shell is connected to this line-buffered output of standard input, and not directly to the keyboard, so what's making C wait around? Furthermore, why can we not use lseek() on standard input? Does said 'file' get overwritten on every 'write' made to it, so that there is nothing to seek around in, since standard input (being the keyboard) is not really a file on a storage device per se?
read(0, buffer, 1024)
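is a system call, a call into kernel code. The kernel's implementation of read dispatches to the terminal (or pseudo-terminal) device driver, which, in its default line-oriented (canonical) mode, waits until you have typed a newline or the EOF character, Ctrl+D, and then returns at most 1024 bytes.
then that must mean shells line buffer their commands, right?
The buffering is performed in the terminal driver, not the shell, provided the terminal is in canonical (line) mode, which is the default. In non-canonical mode, the point at which read returns is governed by the terminal's VMIN and VTIME settings; with the common setting VMIN = 1 it returns as soon as a single byte is available rather than waiting for all 1024.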
Furthermore, why can we not use lseek() on standard input
You can if stdin is a regular file. You just can't seek on a terminal, because that would require the terminal driver to remember all data that passed through the terminal device since it was created.
Related
I am a bit confused about this question:
What would happen in C code if you tried to close stdin or stdout instead of a file?
My guess is that the buffer will be depleted, am I right?
The C standard does not say there is any special treatment for closing stdin or stdout or any special restrictions on closing them.
The documentation for fclose in C 2018 7.21.5.1 2 says:
A successful call to the fclose function causes the stream pointed to by stream to be flushed and the associated file to be closed. Any unwritten buffered data for the stream are delivered to the host environment to be written to the file; any unread buffered data are discarded. Whether or not the call succeeds, the stream is disassociated from the file and any buffer set by the setbuf or setvbuf function is disassociated from the stream (and deallocated if it was automatically allocated).
“Deplete” is not a term used in the C standard. When used about an input buffer, it refers to the program drawing data from the buffer that has been previously filled with input (such as from a user typing a line of text in a terminal) to the point where there is no data left in the buffer. Given the behavior of fclose, the data in the buffer is discarded, not depleted.
Closing the Unix or other operating system file that is used to implement the C stream, as with close instead of fclose, might result in the C buffer remaining, so that further C library calls such as getchar() will draw data from the buffer until it is depleted, after which such routines are likely to report an I/O error.
When a program starts up, the identifiers stdin, stdout, and stderr are guaranteed to hold the addresses of FILE objects associated with those streams. When a program terminates, an implementation will do whatever is necessary to close the streams, but there is no guarantee that it will use the current values of stdin, stdout, and stderr for that purpose.
An implementation may document that if a program does something like:
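#include <stdio.h>

/* Assumes the implementation makes stdout an assignable lvalue;
   ISO C does not require that. */
void swap_stdout(FILE *file_to_use_instead)
{
    FILE *old_stdout = stdout;
    stdout = file_to_use_instead;
    fclose(old_stdout);
}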
then the FILE whose address was stored into stdout will be closed when the program terminates, but absent such a specification it is possible that closing stdout would result in the storage that had been assigned to that FILE being reused to hold something else, and the program trying to interpret that other object as a FILE when the program terminates.
The stdin and stdout (and stderr) objects declared by stdio.h are specified to have type FILE * and to initially refer to the program's standard input, standard output, and standard error streams, respectively. Roughly speaking, you can do anything with them that you can do with any other stream, including close them. After closure, they are no longer valid streams for I/O, including for those I/O functions that target them implicitly (printf, scanf, ...).
Therefore, if you attempt to use one of them for I/O after having closed it, you can expect that your I/O operation will fail. The specific manifestation of that depends in part on the specific function with which you attempt the I/O operation.
Note also, by the way, that the C naming of type FILE reflects the Unix model of the world that all I/O endpoints are modeled as "files". Where you mean a persistent, named chunk of data accessible via a filesystem on a local storage device, you can use the term "regular file". Even then, it is important to distinguish between a C stream, represented by a C object of type FILE, and the underlying data on the storage device.
Running a program that expects input from a terminal, I can "close" stdin with Ctrl+D. Is there any way to reopen stdin after that?
On Linux and on POSIXy systems in general, the standard input descriptor is not closed when you press Ctrl+D in the terminal; it just causes the pseudoterminal layer to become readable, with read() returning 0. This is how POSIXy systems indicate end of input.
It does not mean the file descriptor (or even the stream handle provided on top of it by the C library) gets closed. As Steve Summit mentioned in a comment, you only need to clear the end-of-input status of the stream using clearerr(), to be able to read further data; this tells the C library that you noticed the status change, but want to try further reading anyway.
A similar situation can occur when a process is writing to a file, and another reads it. When the reader gets to the end of the file, a read() returns 0, which the C library understands as end-of-input; it sets an internal flag, so that unless you call clearerr(), feof() will return true for that stream. Now, if the writer writes more data, and the reader does a clearerr(), the reader can read the newly written additional data.
This is perfectly normal, and expected behaviour.
In summary:
End of input is indicated by a read() operation returning 0, but the file descriptor status does not change, and can be used as normal.
Ctrl+D on a terminal causes only that to happen; the file descriptors open to the terminal are not affected in any other way, and it is up to the foreground process reading the terminal input to decide what it does. It is allowed to simply go on reading more data.
Most programs do exit when that happens, but that is a convention, not a technical requirement at all.
The C library detects read() returning 0, and sets its internal "end of input seen" flag for that stream. That causes feof() to return true, fgets() to return NULL, fgetc() to return EOF, and so on, for that stream.
Calling clearerr() on the stream handle clears the flag, so that the next read attempt will actually try to read further data from the descriptor.
This is described in the very first sentence in the Description section of the man 3 clearerr man page.
For example, the stdio.h library has some functions that require a FILE * argument but accept stdin for user input from a terminal.
C stdio functions operate on streams, not files. As far as your code is concerned, a stream is simply a consumer (output stream) or producer (input stream) of bytes.
A stream may be associated with a file on disk. It may also be associated with a terminal. Or a printer. Or a network socket. Or anything else that you might want to communicate with. A stream is an abstraction of anything that can read or write a string of bytes.
stdin and stdout (along with stderr) are predefined FILE * objects which normally refer to your console, although you can override that either at the command line or within your code.
stdin and stdout are nothing more than FILE pointers for the standard input and standard output. Because you can redirect them within your code or from the command prompt, they cannot be tied to one fixed device; if they were, you would not be able to change where they point.
stdin and stdout simply read from or write to whatever file the standard input or standard output is currently attached to, which is what makes it easy to redirect them with commands or code.
Suppose I write this code:
#include<stdio.h>
#include<unistd.h> /* declares dup2 */
int main()
{
    int i;
    FILE *fp;
    fp=fopen("shiv.txt","w");
    printf("%d",fileno(fp));
    dup2(3,1);
    fprintf(fp,"hello");
}
As output, the program writes hello3 to the shiv.txt file.
As we can see, printf is called first, yet its output appears after the output of fprintf.
Moreover, dup2 is called after the printf statement, so shouldn't the output of printf go to the terminal?
The standard I/O streams are buffered — with the possible exception of the standard error stream, which is only required to not be fully buffered by POSIX. Without a call to fflush(stdout) to flush the output buffer for standard output (or the output of a newline sequence if it is line-buffered), the way things work with respect to the FILE interface is not defined once you call dup2.
Since dup2 works with file descriptors and not FILE pointers, you have a problem: POSIX doesn't specify what to do in this case. The buffer associated with stdout may be discarded, or it may be flushed as if with fclose. The buffer may even remain associated and not flushed/discarded since stdout from the perspective of the FILE interface is still open.
So the behavior isn't necessarily deterministic unless you sync the FILE interface with the underlying file description (for example, by calling fflush(stdout) before dup2). Additionally, what happens if, e.g., stderr as well as stdout is redirected with dup2 to the file description of the file you open? Is the buffered output written in the order of the dup2 calls, as with a queue, in reverse order, as with a stack, or in a seemingly random order, the last of which suggests that a segfault may be possible? And what is the order of output if you call dup2(STDERR_FILENO, STDOUT_FILENO), followed by dup2(fileno(fp), STDERR_FILENO)? Do the results of writing to the standard output/error buffers appear before the fprintf results, after them, or mixed (or sometimes one and sometimes another)? Which appears first: the data written to stderr or the data written to stdout? Can you be certain this will always happen in that order?
The answer probably won't surprise you: No. What happens on one configuration may differ from what happens on another configuration because the interaction between file descriptors, the buffers used by the standard streams, and the FILE interface of the standard streams is left undefined.
As @twalberg commented, there is no guarantee that the file you opened is file descriptor 3, so be careful when hard-coding numbers like that. Also, STDOUT_FILENO is available from <unistd.h>, which is where dup2 is actually declared, so you can use it for file descriptor 1 rather than calling fileno(stdout) or hard-coding the number.
There are rules to follow when manipulating "handles" to open file descriptions.
Per System Interfaces Chapter 2, file descriptors and streams are "handles" to file descriptions and a file description can have multiple handles.
This chapter defines rules when working with both a file descriptor and a stream for the same file description. If they are not followed, then the results are "undefined".
Calling dup2() to replace the stdout file descriptor after calling printf() (on the stdout stream) without a call to fflush(stdout) in between is a violation of those rules (simplified), hence the undefined behavior.
When does the read system call return when taking input from stdin?
There are quite a few parts to this.
First, let's clarify the distinction between OS-level IO and stdio-level IO. read(2) and write(2) (POSIX IO) are specified by POSIX, and operate using file descriptors (numbers starting from 0); fread(3) and fwrite(3) (stdio IO) are specified by ISO C and operate on streams (FILE * handles), such as stdin, which on POSIX systems encapsulate file descriptors and add some things (such as output buffering) on top of them.
So, read(2) and write(2) don't do any buffering on their own. The buffering you see on standard input (file descriptor 0, not stdin, which is one abstraction above that) is done by the terminal (or terminal emulation). Look up canonical mode to find out how to disable it.
At the stdio-level, fwrite(3) (and printf(3), fprintf(3), et al.) does output buffering depending on what the output is connected to.
See also:
How to check if a key was pressed in Linux?
Single characters are not printed on the terminal
Does printing to the screen cause a switch to kernel mode and running OS code in Unix?