Where are stdin / stdout created? - C

In C (ANSI), we say that input is taken by (s/v/f)scanf and stored in stdin, and we talk about
stdout the same way. I wonder where they reside in Linux (Unix): under which folder?
Or are they (stdin / stdout) abstract (that is, no such things exist)?

They are streams created for your process by the operating system. There is no named file object associated with them, and so they do not have a representation within the file system, although as unwind points out, they may be accessed via a pseudo file system if your UNIX variant supports such a thing.

stdin is a FILE * referring to the stdio (standard io) structure that is tied to the file descriptor 0. File descriptors are what Unix-like systems, such as Linux, use to talk with applications about particular file-like things. (Actually, I'm pretty sure that Windows does this as well).
File descriptor 0 may refer to any type of file, but to make sense it must be one that read can be called on (it must be a regular file, a stream socket, or a character device opened for reading or the read side of a pipe, as opposed to a directory file, datagram socket, or a block device).
Processes in Unix-like systems inherit their open file descriptors from their parent process. So to run a program with stdin set to something besides the parent's stdin you would do:
int new_stdin = open("new_stdin_file", O_RDONLY);
pid_t fk = fork();
if (!fk) { // in the child
    dup2(new_stdin, 0);
    close(new_stdin);
    execl("program_name", "program_name", (char *)NULL);
    _exit(127); // should not have gotten here, and calling exit (without _) can have
                // side effects because it runs atexit registered functions, and we
                // don't want that here
} else if (fk < 0) {
    // in parent with error from fork
} else {
    // in parent with no error so fk = pid of child
}
close(new_stdin); // we don't need this anymore
dup2 duplicates the first file descriptor argument as the second (closing the second before doing so if it was open in the current process).
fork creates a duplicate of the current process. execl is one of the exec family of functions, which use the execve system call to replace the current program with another program. The combination of fork and exec is how programs are generally run (even when hidden within other functions).
In the above example we could have run the new program with stdin set to the read end of a pipe, a tty (serial port / TeleTYpe), or several other things. Some of these have names present in the filesystem and others do not (like some pipes and sockets, though some do have names in the filesystem).
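A fuller sketch of the same fork / dup2 / exec sequence, with headers, error checks and a wait for the child. The function name run_with_stdin is made up for illustration, and it uses execlp (PATH lookup) rather than execl as an assumption about how you would want to name the program.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `program` (looked up in PATH) with its stdin read from `input_path`.
 * Returns the child's exit status, or -1 on error.
 * Hypothetical helper, not a library function. */
int run_with_stdin(const char *input_path, const char *program)
{
    int new_stdin = open(input_path, O_RDONLY);
    if (new_stdin < 0)
        return -1;

    pid_t fk = fork();
    if (fk == 0) {                              /* in the child */
        if (new_stdin != STDIN_FILENO) {        /* skip if it already is fd 0 */
            if (dup2(new_stdin, STDIN_FILENO) < 0)
                _exit(127);
            close(new_stdin);
        }
        execlp(program, program, (char *)NULL);
        _exit(127);                             /* only reached if exec failed */
    }

    close(new_stdin);                           /* the parent does not need it */
    if (fk < 0)
        return -1;                              /* fork failed */

    int status;
    if (waitpid(fk, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_stdin("/etc/passwd", "wc") should behave roughly like wc < /etc/passwd in a shell.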
Linux makes /proc/self/fd/0 a symbolic link to the file opened as 0 in the current process. For an arbitrary process, the equivalent link is /proc/%i/fd/0 (in printf syntax), with %i replaced by the pid (process ID). These symbolic links are often usable to find the real file in the filesystem (using the readlink system call), but if the file does not actually exist in the filesystem the link data (what would usually be a file name) instead is just a string that tells a little bit about the file.
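As a rough illustration of the readlink approach, the sketch below resolves the /proc/self/fd entry for a descriptor. It is Linux-specific and assumes /proc is mounted; fd_link_target is a made-up name.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the target of /proc/self/fd/<fd> into buf as a NUL-terminated string.
 * Returns the length of the target, or -1 on error.
 * Hypothetical helper; Linux-specific. */
ssize_t fd_link_target(int fd, char *buf, size_t size)
{
    char link[64];

    if (size == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);

    /* readlink does not append a terminating NUL, so leave room for one. */
    ssize_t n = readlink(link, buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

For a descriptor with no name in the filesystem, such as an anonymous pipe, the result is a descriptive string starting with pipe: rather than a path.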
I should point out here that a file that stdin (fd 0) refers to, even if it is in the filesystem, may not have just one name. It may have more than one hard link, so it would have more than one name -- and each of these would be just as much its name as any other hard link. Additionally it may have no name at all if all of its hard links have been unlinked since it was opened, though its data would still live on the disk until all open file descriptors for it are closed.
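The "no name at all" case can be reproduced in a few lines. This is only a sketch: the helper name and the /tmp location are assumptions, and the reported link count of 0 is what typical local Linux filesystems give.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file, unlink its only name, then keep using the
 * still-open descriptor. Returns the link count reported after the unlink
 * (0 on typical local filesystems), or -1 on error. Hypothetical helper. */
long nlink_after_unlink(void)
{
    char path[] = "/tmp/unlink_demo_XXXXXX";    /* assumed writable location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }

    /* The name is gone, but reads and writes through fd still work. */
    char c = 0;
    struct stat st;
    long result = -1;
    if (write(fd, "x", 1) == 1 &&
        pread(fd, &c, 1, 0) == 1 && c == 'x' &&
        fstat(fd, &st) == 0)
        result = (long)st.st_nlink;

    close(fd);                                  /* now the data is released */
    return result;
}
```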
If you don't actually need to know where it is in the filesystem, but just want some data about it you can use the fstat system call. This is like the stat system call and command line utility, except for already open files.
Everything I said here about stdin (fd 0) should be applicable to stdout (fd 1) and stderr (fd 2) except that they will both be writable rather than readable.
If you want to know more about any of the functions I mentioned be sure to look them up in the man pages by typing:
man fork
on the command line. Most functions I mentioned are system calls in section 2 of the man pages, but a few (such as execl) are library functions in section 3. Naming the section explicitly, as in man 2 fork, works too, and is useful when a command line tool has the same name as a function (for example, man 2 stat).

In Linux, you can generally find stdin through the /proc file system in /proc/self/fd/0, and stdout is /proc/self/fd/1.

stdin is standard input - for example, keyboard input.
stdout is standard output - for example, monitor.

If you run:
./myprog < /etc/passwd
then stdin exists in the filesystem as /etc/passwd. If you just run
./myprog
interactively on a terminal, then stdin exists in the filesystem as whatever your terminal device is (probably /dev/pts/5 or something).
If you run
cat /etc/passwd | ./myprog
then stdin is an anonymous pipe and has no instantiation in the filesystem, but Linux allows you to get at it via /proc/12345/fd/0 where 12345 is the pid of myprog.

Related

C redirect terminal descriptor

Is it possible to redirect everything that is written in the terminal to a process?
For example, after I started the process, if I write "command" in the terminal, this should be redirected to a pipe from my process or something like this.
Yes, it should be practical to redirect all terminal output from your program (and all of its child processes) after your program has started. Unix programs usually write to the terminal by writing to standard output (stdout). Standard output is always on the file descriptor number 1 (the C constant is STDOUT_FILENO), for all processes. You can use the dup2() system call to replace any file descriptor number with another file descriptor.
So you can e.g. create a pipe using int fds[2]; pipe(fds);. Then fds[1] will be a file descriptor number that you can use to write to the pipe. If you do dup2(fds[1], STDOUT_FILENO); then standard output will also write to the pipe. (You can close(fds[1]); afterwards since you probably don't need it, now that you can use stdout instead.)
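A minimal sketch of the pipe plus dup2 steps just described. To keep it self-contained it also saves and restores the original stdout and reads the data back in the same process; capture_stdout_write is a made-up name, and the message is assumed to be small enough to fit in the pipe buffer.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Temporarily point fd 1 at a pipe, write msg to stdout, restore fd 1,
 * then read the bytes back from the pipe into out.
 * Returns the number of bytes captured, or -1 on error.
 * Hypothetical helper; msg must fit in the pipe buffer. */
ssize_t capture_stdout_write(const char *msg, char *out, size_t outsize)
{
    int fds[2];
    int saved = dup(STDOUT_FILENO);         /* remember the real stdout */
    if (saved < 0)
        return -1;
    if (pipe(fds) < 0) {
        close(saved);
        return -1;
    }

    dup2(fds[1], STDOUT_FILENO);            /* stdout now feeds the pipe */
    close(fds[1]);                          /* fd 1 is the only write end left */

    ssize_t written = write(STDOUT_FILENO, msg, strlen(msg));

    dup2(saved, STDOUT_FILENO);             /* restore; closes the write end */
    close(saved);

    ssize_t n = (written < 0) ? -1 : read(fds[0], out, outsize);
    close(fds[0]);
    return n;
}
```

If the program writes through stdio (printf and friends) instead of write, it would also need fflush(stdout) before each dup2 so buffered text ends up on the intended side.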
You can also open a file for writing with fd = open("filename", O_WRONLY); and then dup2(fd, STDOUT_FILENO); so everything written to stdout goes into your file.
Note that you need to redirect stdout at the very beginning of your program before doing anything that might write to stdout.
The above trick will make standard output go to your pipe instead of the terminal. If you want the output to go to the terminal, and also get a copy of the output in a pipe or file, that's more difficult but can also be done. You need to create an internal pipe, then dup2(that_pipe, STDOUT_FILENO); so stdout writes to that pipe. Then you need to read from that pipe (probably using poll() then read()) and write everything you got to both 1) the terminal and 2) to another pipe or file that is going outside your program. So you need two pipes if you want to copy output.
The tee command does this (copy stdout to files) from the shell.
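The copying loop at the heart of such a setup might look like the sketch below. It uses a plain blocking read rather than poll() for brevity, and tee_fd and write_all are made-up names.

```c
#include <sys/types.h>
#include <unistd.h>

/* Write all n bytes, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0)
            return -1;
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Copy everything from `in` to both `out1` and `out2` until end of file.
 * Returns the total number of bytes copied, or -1 on error.
 * Hypothetical helper. */
ssize_t tee_fd(int in, int out1, int out2)
{
    char buf[4096];
    ssize_t total = 0, n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write_all(out1, buf, (size_t)n) < 0 ||
            write_all(out2, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Here `in` would be the read end of the internal pipe, `out1` a saved copy of the terminal descriptor, and `out2` the file or second pipe.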
This dup2() approach is not bulletproof because a Unix terminal (even when using a GUI terminal emulator instead of a hardware console) is a device in /dev. You can type tty in a shell or use ttyname(STDOUT_FILENO) in C to see which file in /dev corresponds to the terminal that stdout is writing to. In principle, any program (under the same user account) could open the terminal device using that filename and write to it without asking for permission from any other program. You can easily try this from the shell using the write program:
echo hello world | write $(whoami) /dev/ttys123
where /dev/ttys123 is whatever you got by typing tty in some other terminal window (the name looks a bit different on different operating systems, e.g. Linux and MacOS). You should see hello world appear in that other window.
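In C, the lookup mentioned above can be sketched with isatty and ttyname. The wrapper name terminal_name is made up, and the fallback string is just a placeholder.

```c
#include <stddef.h>
#include <unistd.h>

/* Return the terminal device path for fd (for example /dev/pts/5),
 * or a placeholder string if fd is not connected to a terminal.
 * Hypothetical helper around isatty/ttyname. */
const char *terminal_name(int fd)
{
    const char *name = isatty(fd) ? ttyname(fd) : NULL;
    return name ? name : "(not a terminal)";
}
```

Calling terminal_name(STDOUT_FILENO) after stdout has been redirected to a pipe or file returns the placeholder, which is one way to notice that output no longer goes to the terminal.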
From a child process, no. You must set this up in the parent process, and have it propagate downwards to children (barring some kind of crazy hack).
From the shell, you can redirect.
exec >file
This will redirect standard output to file, and it will apply to all future commands run in the shell. You can make this into a function, if you like.

Are stdin and stdout actually the same file?

I am completely confused: is it possible that stdin, stdout, and stderr point to the same file descriptor internally?
I ask because, in C, it makes no difference whether I use stdin or stdout as the input when I want to read a string from the console.
read(1, buf, 200) works as read(0, buf, 200) how is this possible?
(0 == STDIN_FILENO == fileno(stdin),
1 == STDOUT_FILENO == fileno(stdout))
When the input comes from the console, and the output goes to the console, then all three indeed happen to refer to the same file. (But the console device has quite different implementations for reading and writing.)
Anyway, you should use stdin/stdout/stderr only for their intended purpose; otherwise, redirections like the following would not work:
<inputfile myprogram >outputfile
(Here, stdin and stdout refer to two different files, and stderr refers to the console.)
One thing that some people seem to be overlooking: read is the low-level system call. Its first argument is a Unix file descriptor, not a FILE* like stdin, stdout and stderr. You should be getting a compiler warning about this:
warning: passing argument 1 of ‘read’ makes integer from pointer without a cast [-Wint-conversion]
   int r = read(stdout, buf, 200);
                ^~~~~~
On my system, it doesn't work with either stdin or stdout. read always returns -1, and errno is set to EBADF, which is "Bad file descriptor". It seems unlikely to me that those exact lines work on your system: the pointer would have to point to memory address 0, 1 or 2, which won't happen on a typical machine.
To use read, you need to pass it STDIN_FILENO, STDOUT_FILENO or STDERR_FILENO.
To use a FILE* like stdin, stdout or stderr, you need to use fread instead.
is it possible that stdin, stdout, and stderr point to the same filedescriptor internally?
A file descriptor is an index into the file descriptor table of your process (see also credentials(7)...). By definition STDIN_FILENO is 0, STDOUT_FILENO is 1, and STDERR_FILENO is 2. Read about proc(5) to query information about some process (for example, try ls -l /proc/$$/fd in your interactive shell).
The program (usually, but not always, some shell) which has execve(2)-d your executable might have called dup2(2) to share (i.e. duplicate) some file descriptors.
See also fork(2), intro(2) and read some Linux programming book, such as the old ALP.
Notice that read(2) from STDOUT_FILENO could fail (e.g. with errno(3) being EBADF) in the (common) case where stdout is not readable (e.g. after redirection by the shell). If reading from the console, it could be readable. Read also the Tty Demystified.
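Whether a given descriptor is readable can be checked up front from its access mode. A sketch using fcntl(F_GETFL); the function name is made up.

```c
#include <fcntl.h>

/* Returns 1 if fd was opened with read access, 0 if it was opened
 * write-only, and -1 if fd is not an open descriptor.
 * Hypothetical helper. */
int fd_is_readable(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;

    int mode = flags & O_ACCMODE;       /* O_RDONLY, O_WRONLY or O_RDWR */
    return mode == O_RDONLY || mode == O_RDWR;
}
```

On a terminal opened read-write, fd_is_readable(STDOUT_FILENO) would return 1; after a shell redirection such as >outputfile it would return 0, and read on that descriptor fails with EBADF.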
There is nothing prohibiting any number of file-handles referring to the same thing in the kernel.
And the default for a terminal-program is to have STDIN, STDOUT and STDERR refer to the same terminal.
So, it might look like it doesn't matter which you use, but it will all go wrong if the caller does any handle-redirection, which is quite common.
The most common is piping output from one program into the input of the next, but keeping stderr out of that.
An example for the shell:
source | filter | sink
Programs such as login and xterm typically open the tty device once when creating a new terminal session, and duplicate the file descriptor two or three times, arranging for file descriptors 0, 1 and 2 to be linked to the open file description of the opened tty device. They typically close all other file descriptors before exec-ing the shell. So if no further redirection is done by the shell or its child processes, the file descriptors, 0, 1 and 2, remain linked to the same file. Because the underlying tty device was opened in read-write mode, all three file descriptors have both read and write access.

Does advisory file locking work with default file descriptors?

For example, say I have the following shell command.
~]$ foobar 2>> foobar.log
The above command redirects the standard error output (stderr, or file descriptor 2) to the file foobar.log, appending the output (the >> rather than just >).
Now, assume that two users are both running the exact same command. In this case, the output to the file is interleaved, making it rather difficult to read.
Programs can utilise "advisory file locking" (via the fcntl() C function) as an operating-system level mutual-exclusion on files, essentially coordinating multiple processes so that only one process writes to the file at any given time. Hence the output of the two processes is no longer interleaved and becomes easier to read.
However, how do shells implement the invocation above? If they use the pipe() system call, advisory file locking will not work. If, on the other hand, they use dup() (or some other variant) before calling fork()/exec(), then advisory file locking should work.
Which is the case, and should advisory file locking work on shell-redirected standard output (stdout, file descriptor 1) and standard error (stderr, file descriptor 2)?
fcntl() file locks are bound to a process, not a thread. So locking a file descriptor which is attached to something only available to that process seems useless. Not only is there no actual file to lock (e.g. the mechanism would have to differ somehow), the process which locks its own stderr is competing with no one else for that lock.
As per the comment by @nos, this may be dependent on the operating system and the type of file that file descriptor 2 is connected to. E.g., the above call will not work if standard error is piped to another program, like so:
~]$ mkfifo pipe
~]$ cat pipe
~]$ foobar 2>> pipe
However, if standard error is redirected to a regular file (as in the above example), then advisory file locking does appear to work, at least on Arch Linux.
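For reference, a sketch of what such a locked write might look like with fcntl(). The helper names are made up; it locks the whole file and waits for the lock (F_SETLKW), which is one reasonable choice rather than the only one.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply (F_WRLCK) or release (F_UNLCK) an advisory lock on the whole file. */
static int set_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                       /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);    /* block until the lock is granted */
}

/* Write one message to fd while holding an exclusive advisory lock.
 * Returns 0 on success, -1 on error (for example if fd is not open or
 * does not support locking). Hypothetical helper. */
int locked_write(int fd, const char *msg)
{
    if (set_lock(fd, F_WRLCK) < 0)
        return -1;

    ssize_t n = write(fd, msg, strlen(msg));
    int unlock_rc = set_lock(fd, F_UNLCK);

    return (n < 0 || unlock_rc < 0) ? -1 : 0;
}
```

A program started as foobar 2>> foobar.log would call locked_write(STDERR_FILENO, ...). The lock only helps if every writer to the file cooperates in the same way.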

dup2 / dup - Why would I need to duplicate a file descriptor?

I'm trying to understand the use of dup2 and dup.
From the man page:
DESCRIPTION
dup and dup2 create a copy of the file descriptor oldfd. After successful return of dup or dup2, the old and new descriptors may be used interchangeably. They share locks, file position pointers and flags; for example, if the file position is modified by using lseek on one of the descriptors, the position is also changed for the other.
The two descriptors do not share the close-on-exec flag, however. dup uses the lowest-numbered unused descriptor for the new descriptor.
dup2 makes newfd be the copy of oldfd, closing newfd first if necessary.
RETURN VALUE
dup and dup2 return the new descriptor, or -1 if an error occurred (in which case, errno is set appropriately).
Why would I need that system call? What is the use of duplicating the file descriptor? If I have the file descriptor, why would I want to make a copy of it? I'd appreciate it if you could explain and give me an example where dup2 / dup is needed.
The dup system call duplicates an existing file descriptor, returning a new one that
refers to the same underlying I/O object.
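"The same underlying I/O object" includes the file offset, which a short experiment can show. This is a sketch with a made-up function name and an assumed writable /tmp.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write 5 bytes through one descriptor, then ask its duplicate for the
 * current offset. Returns that offset, or -1 on error.
 * Hypothetical helper. */
off_t offset_seen_by_duplicate(void)
{
    char path[] = "/tmp/dup_demo_XXXXXX";   /* assumed writable location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* no need to keep the name */

    int copy = dup(fd);
    off_t pos = -1;
    if (copy >= 0 && write(fd, "hello", 5) == 5)
        pos = lseek(copy, 0, SEEK_CUR);     /* offset as seen via the copy */

    if (copy >= 0)
        close(copy);
    close(fd);
    return pos;
}
```

The duplicate reports offset 5 even though nothing was written through it, because both descriptors share one open file description.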
Dup allows shells to implement commands like this:
ls existing-file non-existing-file > tmp1 2>&1
The 2>&1 tells the shell to give the command a file descriptor 2 that is a duplicate of descriptor 1 (i.e. stderr and stdout refer to the same open file).
Now both the error message from calling ls on the non-existing file and the normal output of ls on the existing file show up in the file tmp1.
The following example code runs the program wc with standard input connected
to the read end of a pipe.
int p[2];
char *argv[2];
argv[0] = "wc";
argv[1] = 0;
pipe(p);
if (fork() == 0) {
    close(0);      // child closes stdin, freeing descriptor 0
    dup(p[0]);     // dup returns the lowest free descriptor, so the read end becomes fd 0
    close(p[0]);
    close(p[1]);
    execv("/bin/wc", argv);
} else {
    write(p[1], "hello world\n", 12);
    close(p[0]);
    close(p[1]);
}
The child dups the read end onto file descriptor 0, closes the file descriptors in p, and execs wc. When wc reads from its standard input, it reads from the pipe.
This is how shells implement pipelines using dup. That is just one use of dup; pipes can in turn be used to build something else. That is the beauty of system calls: you build one thing after another using tools that are already there, and those tools were in turn built from something else, and so on.
In the end, system calls are the most basic tools the kernel gives you.
Cheers :)
Another reason for duplicating a file descriptor is using it with fdopen. fclose closes the file descriptor that was passed to fdopen, so if you don't want the original file descriptor to be closed, you have to duplicate it with dup first.
dup is used to be able to redirect the output from a process.
For example, if you want to save the output from a process, you first duplicate the current output descriptor (fd 1) to keep a copy of it, redirect fd 1 to a file, then fork and execute the process; when the process finishes, you restore fd 1 from the saved copy.
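Those steps could be sketched as follows. The function name and the open flags (create, truncate, mode 0644) are assumptions for illustration; a caller that uses stdio should also fflush(stdout) before calling it.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up in PATH) with its stdout sent to `path`, then
 * put the caller's stdout back. Returns the child's exit status, or -1.
 * Hypothetical helper. */
int run_with_stdout_to_file(const char *path, char *const argv[])
{
    int saved = dup(STDOUT_FILENO);             /* 1. save the real stdout */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }
    dup2(fd, STDOUT_FILENO);                    /* 2. point fd 1 at the file */
    close(fd);

    int result = -1;
    pid_t pid = fork();                         /* 3. child inherits fd 1 */
    if (pid == 0) {
        close(saved);
        execvp(argv[0], argv);
        _exit(127);                             /* exec failed */
    }
    if (pid > 0) {
        int status;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    dup2(saved, STDOUT_FILENO);                 /* 4. restore the real stdout */
    close(saved);
    return result;
}
```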
Some points about dup/dup2 are worth noting.
dup/dup2: technically, the purpose is to share one file table entry between different handles inside a single process. (When forking, the descriptors are duplicated in the child process by default, and the file table entry is shared as well.)
That means that, using dup/dup2, we can have more than one file descriptor, possibly with different attributes, for a single open file table entry.
(Although currently the FD_CLOEXEC flag seems to be the only attribute of a file descriptor.)
http://www.gnu.org/software/libc/manual/html_node/Descriptor-Flags.html
dup(fd) is equivalent to fcntl(fd, F_DUPFD, 0);
dup2(fildes, fildes2); is equivalent to
close(fildes2);
fcntl(fildes, F_DUPFD, fildes2);
The differences (for the latter), apart from some errno values that differ between dup2 and fcntl:
close followed by fcntl may introduce race conditions, since two function calls are involved.
Details can be checked from
http://pubs.opengroup.org/onlinepubs/009695399/functions/dup.html
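The fcntl form also accepts a lower bound for the new descriptor number, which dup cannot express. A tiny sketch, with a made-up wrapper name:

```c
#include <fcntl.h>

/* Duplicate fd onto the lowest free descriptor that is >= min_fd.
 * dup(fd) corresponds to min_fd == 0. Returns the new descriptor or -1.
 * Hypothetical wrapper around fcntl(F_DUPFD). */
int dup_at_least(int fd, int min_fd)
{
    return fcntl(fd, F_DUPFD, min_fd);
}
```

Unlike dup2, this never closes an already open descriptor: if min_fd is in use, the next free number above it is returned.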
An Example of use -
One interesting example arises when implementing job control in a shell, where dup/dup2 is used; see the link below.
http://www.gnu.org/software/libc/manual/html_node/Launching-Jobs.html#Launching-Jobs

can a process create extra shell-redirectable file descriptors?

Can a process 'foo' write to file descriptor 3, for example, in such a way that inside a bash shell one can do
foo 1>f1 2>f2 3>f3
and if so how would you write it (in C)?
You can start your command with:
./foo 2>/dev/null 3>file1 4>file2
Then if you run ls -l /proc/_pid_of_foo_/fd you will see that the file descriptors have been created, and you can write to them with, e.g.:
write(3,"abc\n",4);
It would be less hacky perhaps if you checked the file descriptor first (with fcntl?).
The shell opens the file descriptors for your program before executing it. Simply use them like you would any other file descriptor, e.g. write(3, buf, len); etc. You may want to do error checking to make sure they were actually opened (attempting to dup them then closing the duplicate would be one easy check).
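One way to do that check is fcntl(fd, F_GETFD), which fails with EBADF when the descriptor is not open. A sketch, with made-up function names:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if fd refers to an open file in this process, 0 otherwise. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Write msg to fd only if the caller actually opened it (e.g. via 3>f3).
 * Returns 0 on success, -1 if fd is closed or the write fails.
 * Hypothetical helper. */
int write_if_open(int fd, const char *msg)
{
    if (!fd_is_open(fd))
        return -1;
    return write(fd, msg, strlen(msg)) < 0 ? -1 : 0;
}
```

With this, foo can call write_if_open(3, "abc\n") and quietly skip the extra output when it was started without a 3> redirection.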
No.
The file descriptors are opened by the shell and the child process inherits them. It is not the child process which opens these command-line accessible file descriptors, it is the bash process.
There might be a way to convince bash to open additional file descriptors on behalf of the process. That wouldn't be portable to other shells, and I'm not sure if a mechanism even exists -- I am just speculating.
The point is that you can't do this by coding the child process in a special way. The shell would have to abide by your desires.
Can a process 'foo' write to file descriptor 3, for example, in such a way that inside a bash shell one can do [...] and if so how would you write it (in C)?
I'm not sure what precisely you are after, but whatever it is, the starting point is going to be man dup / man dup2: this is how shells turn an arbitrary file descriptor into a file descriptor with a given number.
But obviously, the process foo has to know somehow that it can write to file descriptor 3. POSIX only specifies 0, 1 and 2: the shell ensures that whatever it starts gets those file descriptors, and libc in the application's context also expects them to be stdin/stdout/stderr. Anything from 3 onward is up to the application developer.

Resources