Does advisory file locking work with default file descriptors? - c

For example, say I have the following shell command.
~]$ foobar 2>> foobar.log
The above command redirects the standard error output (stderr, or file descriptor 2) to the file foobar.log, appending the output (the >> rather than just >).
Now, assume that two users are both running the exact same command. In this case, the output to the file is interleaved, making it rather difficult to read.
Programs can utilise "advisory file locking" (via the fcntl() C function) as an operating-system-level mutual exclusion on files, essentially coordinating the multiple processes so that only one process writes to the file at any given time. Hence the output of the two processes is no longer interleaved and becomes easier to read.
However, how do shells implement the invocation above? If they use the pipe() system call, advisory file locking will not work. If, on the other hand, they use dup() (or some other variant) before calling fork()/exec(), then advisory file locking should work.
Which is the case, and should advisory file locking work on shell-redirected standard output (stdout, file descriptor 1) and standard error (stderr, file descriptor 2)?

fcntl() file locks are bound to a process, not a thread. So locking a file descriptor which is attached to something only available to that process seems useless. Not only is there no actual file to lock (e.g. the mechanism would have to differ somehow), the process which locks its own stderr is competing with no one else for that lock.

As per the comment by @nos, this may depend on the operating system and on the type of file that file descriptor 2 is connected to. E.g., the above call will not work if standard error is piped to another program, like so:
~]$ mkfifo pipe
~]$ cat pipe
~]$ foobar 2>> pipe
However, if standard error is redirected to a regular file (as in the above example), then advisory file locking does appear to work, at least on Arch Linux.
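A minimal sketch of what such locking could look like, assuming the descriptor refers to a regular file (as with 2>> foobar.log). The helper name locked_write is hypothetical and not part of any library; it takes a whole-file write lock with fcntl(), writes, and releases the lock. It only helps if every cooperating process does the same, since the lock is advisory.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to fd while holding a whole-file
 * advisory write lock. Returns the number of bytes written, or -1 if the
 * lock could not be taken or the write failed. */
ssize_t locked_write(int fd, const void *buf, size_t len)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "through end of file": the whole file */

    /* F_SETLKW blocks until the lock is available. */
    if (fcntl(fd, F_SETLKW, &fl) == -1)
        return -1;

    ssize_t n = write(fd, buf, len);

    fl.l_type = F_UNLCK;    /* release the lock */
    fcntl(fd, F_SETLK, &fl);
    return n;
}
```

A program started as foobar 2>> foobar.log could then call locked_write(STDERR_FILENO, msg, len) for each complete message. Output written through stdio would have to be flushed inside the locked region for this to have any effect.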

Related

popen vs. KornShell security

I am writing a C program using some external binaries to achieve a planned goal. I need to run one command which gives me an output, which in turn I need to process, then feed into another program as input. I am using popen, but wonder if that is the same as using a KornShell (ksh) temporary file instead.
For example:
touch myfile && chmod 700 myfile
cat myfile > /tmp/tempfile
process_file < /tmp/tempfile && rm /tmp/tempfile
Since that creates a temporary file which can be readable by root, would it be the same if one used popen in C, knowing that pipes are also files? Or is it safe to assume that the Operating System (OS) will not allow any other process to read your pipe?
You say "that creates a temporary file which can be readable by root", which implies that you are attempting to transfer the data in a way in which the root user cannot read it. That's impossible; in general, the root user has total control of the system, and can thus read any data that is on the system, whether it's in a temporary file or not. Even within a single process, the root user can read the memory of that process.
If you use popen(), there will not be an entry for the file on a filesystem; it creates a pipe, which acts like a file, but doesn't actually write that data to disk, instead it just passes it between two programs.
There will be a file descriptor for it; depending on the system, it may be easier or harder to intercept that data, but it will always be possible to do so. For instance, on Linux, you can just look in /proc/<pid>/fd/ to find all of the open file descriptors and manipulate them (read from or write to them).
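As a rough illustration of the pipe-based approach, the sketch below reads the first line of a command's output through popen() without creating any temporary file. The helper name first_line_of is an assumption made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: run cmd through the shell and copy the first line of
 * its standard output into buf. Returns 0 on success, -1 if the command
 * could not be started, produced no output, or exited with a failure. */
int first_line_of(const char *cmd, char *buf, size_t size)
{
    FILE *p = popen(cmd, "r");   /* creates a pipe, not a file on disk */
    if (p == NULL)
        return -1;

    char *got = fgets(buf, (int)size, p);
    int status = pclose(p);      /* waits for the command to finish */

    return (got != NULL && status == 0) ? 0 : -1;
}
```

The data only ever passes through the pipe, so nothing is left behind in /tmp, but as noted above this does not hide it from root.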

How to Obtain Pseudo Terminal Master file descriptor from inside ssh session?

In C or bash, I was wondering how, if it is possible at all, to obtain from inside an ssh session the file descriptor of the pseudo-terminal master responsible for feeding input to that session's slave (pts).
The shell process has no master file descriptor, only slave.
The shell's parent process (be it sshd or xterm or screen or whatever) creates a new master by calling getpt(3) or posix_openpt(3). The function returns the master file descriptor. The parent process then obtains the slave file descriptor by calling a combination of grantpt(3), unlockpt(3), ptsname(3) and open(2). This is for Linux and other POSIXized systems, other *nixes may use other functions, but the net result is the same. The parent process has the master/slave pair of file descriptors.
The slave descriptor, and the slave descriptor only, is then passed to the shell as its standard input, output and error.
From Solaris 5.8 PTS(7D) Man-page - STREAMS pseudo-tty slave driver
Only one open is allowed on a master device.
I guess that answers my question :)
EDIT: actually it does not, because if there is a way to obtain the file descriptor, I won't need to open the device again; it is already a file descriptor, so no open is needed.
On unix-based systems, you can open the controlling terminal of the current process by opening /dev/tty. In many cases your program will already have this open as stdin, stdout and stderr, but even if your program is being invoked with stdin, stdout or stderr redirected, /dev/tty will give you the controlling terminal of the process.
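A small sketch of the /dev/tty approach, assuming the process has a controlling terminal. The helper name tty_say is hypothetical. Note that this reaches the slave side of the pseudo-terminal only; it does not give access to the master.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of msg to the controlling terminal,
 * regardless of how stdin, stdout and stderr have been redirected.
 * Returns 0 on success, -1 if there is no controlling terminal or the
 * write was incomplete. */
int tty_say(const char *msg, size_t len)
{
    int fd = open("/dev/tty", O_WRONLY | O_NOCTTY);
    if (fd == -1)
        return -1;

    ssize_t n = write(fd, msg, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```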

Redirecting the output of a child process

There are several ways of redirecting the output of a child process:
using freopen(3)
using dup(2)
using popen(3)
...
What should one pick if all that is wanted is to execute a child process and have its output saved in a given file, pretty much like ls > files.txt works?
What is normally used by shells?
You can discover what your favorite shell uses by strace(1)ing your shell.
In one terminal:
echo $$
In another terminal:
strace -o /tmp/shell -f -p [PID from the first shell]
In the first terminal again:
ls > files.txt
In the second terminal, ^C your strace(1) command and then edit the /tmp/shell output file to see what system calls it made to do the redirection.
freopen(3) manipulates the C standard IO FILE* pointers. All this will be thrown away on the other side of the execve(2) call, because it is maintained in user memory. You could use this after the execve(2) call, but that would be awkward to use generically.
popen(3) opens a single unidirectional pipe(7). This is useful, but extremely limited -- you get either the standard output descriptor or the standard input descriptor. This would fail for something like ls | grep foo | sort where both input and output must be redirected. So this is a poor choice.
dup2(2) will manage file descriptors -- a kernel-implemented resource -- so it will persist across execve(2) calls and you can set up as many file descriptors as you need, which is nice for ls > /tmp/output 2> /tmp/error or handling both input and output: ls | sort | uniq.
There is another mechanism: pty(7) handling. The forkpty(3) and openpty(3) functions can manage a new pseudo-terminal device created specifically to handle another program. The Advanced Programming in the Unix Environment, 2nd edition book has a very nice pty example program in its source code, though if you're having trouble understanding why this would be useful, take a look at the script(1) program -- it creates a new pseudo-terminal and uses it to record all input and output to and from programs and stores the transcript to a file for later playback or documentation. You can also use it to script actions in interactive programs, similar to expect(1).
I would expect to find dup2() used mainly.
Neither popen() nor freopen() is designed to handle redirections such as 3>&7. Up to a point, dup() could be used, but the 3>&7 example shows where dup() starts to creak; you'd have to ensure that file descriptors 4, 5, and 6 are open (and 7 is not) before it would handle what dup2() would do without fuss.
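A minimal sketch of the fork/dup2/exec pattern discussed in these answers, roughly what ls > files.txt amounts to. The function name run_redirected is hypothetical, and a real shell does considerably more (signal handling, job control, other redirections).

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its standard output redirected to
 * path (created or truncated). Returns the child's exit status, or -1 if
 * the child could not be started or did not exit normally. */
int run_redirected(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: point fd 1 at the file, then replace the program. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1)
            _exit(127);
        if (dup2(fd, STDOUT_FILENO) == -1)
            _exit(127);
        if (fd != STDOUT_FILENO)
            close(fd);          /* the duplicate on fd 1 survives execvp */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if execvp failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the descriptor is a kernel resource, the redirection set up before execvp() is still in place when the new program starts writing to standard output.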

where stdin / stdout created

In C (ANSI), we say that input is taken by (s/v/f)scanf and stored in stdin, and we say the same about stdout. I wonder where they reside in Linux (Unix), and under which folder.
Or are they (stdin / stdout) abstract (that is, no such things exist)?
They are streams created for your process by the operating system. There is no named file object associated with them, and so they do not have a representation within the file system, although as unwind points out, they may be accessed via a pseudo file system if your UNIX variant supports such a thing.
stdin is a FILE * referring to the stdio (standard io) structure that is tied to the file descriptor 0. File descriptors are what Unix-like systems, such as Linux, use to talk with applications about particular file-like things. (Actually, I'm pretty sure that Windows does this as well).
File descriptor 0 may refer to any type of file, but to make sense it must be one that read can be called on (it must be a regular file, a stream socket, or a character device opened for reading or the read side of a pipe, as opposed to a directory file, datagram socket, or a block device).
Processes in Unix-like systems inherit their open file descriptors from their parent process. So to run a program with stdin set to something besides the parent's stdin you would do:
int new_stdin = open("new_stdin_file", O_RDONLY);
pid_t fk = fork();
if (!fk) { // in the child
    dup2(new_stdin, 0);
    close(new_stdin);
    execl("program_name", "program_name", (char *)NULL);
    _exit(127); // should not have gotten here, and calling exit (without _ ) can have
                // side effects because it runs atexit registered functions, and we
                // don't want that here
} else if (fk < 0) {
    // in parent with error from fork
} else {
    // in parent with no error so fk = pid of child
}
close(new_stdin); // we don't need this anymore
dup2 duplicates the first file descriptor argument as the second (closing the second before doing so if it was open in the current process).
fork creates a duplicate of the current process. execl is one of the exec family of functions, which use the execve system call to replace the current program with another program. The combination of fork and exec is how programs are generally run (even when hidden within other functions).
In the above example we could have run the new program with stdin set to the read end of a pipe, a tty (serial port / TeleTYpe), or several other things. Some of these have names present in the filesystem and others do not (like some pipes and sockets, though some do have names in the filesystem).
Linux makes /proc/self/fd/0 a symbolic link to the file opened as 0 in the current process. For an arbitrary process, the path built with the printf format "/proc/%i/fd/0" and the pid (process ID) as its argument is the symbolic link to the same thing. These symbolic links are often usable to find the real file in the filesystem (using the readlink system call), but if the file does not actually exist in the filesystem, the link data (what would usually be a file name) is instead just a string that tells a little bit about the file.
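A short, Linux-specific sketch of reading such a link with readlink. The helper name fd_target is an assumption for this example, and it relies on /proc being mounted.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): store in buf the target of the
 * /proc/self/fd/<fd> symbolic link. Returns 0 on success, -1 on failure. */
int fd_target(int fd, char *buf, size_t size)
{
    char link[64];

    if (size == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);

    ssize_t n = readlink(link, buf, size - 1);
    if (n == -1)
        return -1;
    buf[n] = '\0';   /* readlink does not NUL-terminate the result */
    return 0;
}
```

For a descriptor on a regular file, buf ends up holding a path name. For an anonymous pipe it holds a descriptive string beginning with pipe: rather than a real path.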
I should point out here that a file that stdin (fd 0) refers to, even if it is in the filesystem, may not have just one name. It may have more than one hard link, so it would have more than one name -- and each of these would be just as much its name as any other hard link. Additionally it may have no name at all if all of its hard links have been unlinked since it was opened, though its data would still live on the disk until all open file descriptors for it are closed.
If you don't actually need to know where it is in the filesystem, but just want some data about it you can use the fstat system call. This is like the stat system call and command line utility, except for already open files.
Everything I said here about stdin (fd 0) should be applicable to stdout (fd 1) and stderr (fd 2) except that they will both be writable rather than readable.
If you want to know more about any of the functions I mentioned be sure to look them up in the man pages by typing:
man fork
on the command line. Most functions I mentioned are in section 2 of the man pages, but a few (such as execl) are in section 3, so man 2 fork will work too, and naming the section may be useful when a command line tool has the same name as a function.
In Linux, you can generally find stdin through the /proc file system in /proc/self/fd/0, and stdout is /proc/self/fd/1.
stdin is standard input - for example, keyboard input.
stdout is standard output - for example, monitor.
If you run:
./myprog < /etc/passwd
then stdin exists in the filesystem as /etc/passwd. If you just run
./myprog
interactively on a terminal, then stdin exists in the filesystem as whatever your terminal device is (probably /dev/pts/5 or something).
If you run
cat /etc/passwd | ./myprog
then stdin is an anonymous pipe and has no instantiation in the filesystem, but Linux allows you to get at it via /proc/12345/fd/0 where 12345 is the pid of myprog.

can a process create extra shell-redirectable file descriptors?

Can a process 'foo' write to file descriptor 3, for example, in such a way that inside a bash shell one can do
foo 1>f1 2>f2 3>f3
and if so how would you write it (in C)?
You can start your command with:
./foo 2>/dev/null 3>file1 4>file2
Then if you run ls -l /proc/_pid_of_foo_/fd you will see that the file descriptors have been created, and you can write to them with, e.g.:
write(3,"abc\n",4);
It would be less hacky perhaps if you checked the file descriptor first (with fcntl?).
The shell opens the file descriptors for your program before executing it. Simply use them like you would any other file descriptor, e.g. write(3, buf, len); etc. You may want to do error checking to make sure they were actually opened (attempting to dup them then closing the duplicate would be one easy check).
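One possible way to do that check is sketched below, using fcntl with F_GETFD rather than the dup-and-close test mentioned above. The helper names fd_is_open and write_if_open are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd is an open descriptor in this
 * process, 0 otherwise. F_GETFD fails with EBADF on a closed descriptor. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: write to fd only if the shell (or whoever started
 * the program) actually opened it. Returns bytes written, or -1. */
ssize_t write_if_open(int fd, const void *buf, size_t len)
{
    if (!fd_is_open(fd))
        return -1;
    return write(fd, buf, len);
}
```

A program started as foo 1>f1 2>f2 3>f3 could then call write_if_open(3, "abc\n", 4) and fall back to some default behaviour when the call returns -1 because no 3> redirection was given.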
No.
The file descriptors are opened by the shell and the child process inherits them. It is not the child process which opens these command-line accessible file descriptors, it is the bash process.
There might be a way to convince bash to open additional file descriptors on behalf of the process. That wouldn't be portable to other shells, and I'm not sure if a mechanism even exists -- I am just speculating.
The point is that you can't do this by coding the child process in a special way. The shell would have to abide by your wishes.
Can a process 'foo' write to file descriptor 3, for example, in such a way that inside a bash shell one can do [...] and if so how would you write it (in C)?
I'm not sure what precisely you are after, but whatever it is, the starting point is going to be man dup / man dup2: this is how shells turn an arbitrary file descriptor into a file descriptor with a given number.
But obviously, the process foo has to know somehow that it can write to file descriptor 3. POSIX only specifies 0, 1 and 2: the shell ensures that whatever it starts gets those file descriptors, and libc in the application's context expects them to be stdin/stdout/stderr. From 3 onwards, it is up to the application developer.

Resources