What are the portable options if one needs to export open file descriptors to child processes created using exec family of library functions?
Thank you.
EDIT: I know that child processes inherit open descriptors. But how can they use those descriptors without knowing their values? Should I implement some sort of IPC in order to pass the descriptor values to the child process? For example, if the parent creates a pipe, how can an exec'ed child process know the read/write ends of the pipe?
Simply don't set the O_CLOEXEC open(2) flag or its corresponding (and standard) FD_CLOEXEC fcntl(2) flag on the descriptor -- it'll be passed across an exec*() by default.
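For instance, a minimal helper that clears the flag on an already-open descriptor (a sketch; the helper name is mine):

#include <fcntl.h>

/* Ensure fd survives exec*() by clearing its close-on-exec flag.
   Returns 0 on success, -1 on error. */
int make_inheritable(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}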
Update
Thanks for the clarification, that does change things a little bit.
There are several possibilities:
Use command line arguments: GnuPG's gpg(1) provides the command line switches --status-fd, --logger-fd, --attribute-fd, --passphrase-fd, and --command-fd, one for each file descriptor that it expects to receive. If there are several kinds of data to submit or retrieve, this lets each file descriptor carry one type of data and reduces the need for parsing more complicated output. (A minimal sketch of this approach follows this list.)
Just work with files and accept filenames as parameters; when you call the program, pass it file names such as /dev/fd/5, and arrange for the input to be on fd 5 before calling the program:
cat /dev/fd/5 5</etc/passwd
Follow conventions: supply 0 to the child as the read end of a pipe, 1 as the write end of a pipe, and let it work as a normal pipeline "filter" command. This is definitely the best approach when all the input can reasonably be sent through a single file descriptor, which is not always the case.
Use an environment variable to indicate the file / socket / fd:
SSH_AUTH_SOCK=/tmp/ssh-ZriaCoWL2248/agent.2248
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-CsUrnHGmKa,guid=e213e2...
This is convenient for passing the file information down through many layers of child programs.
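As mentioned under the first option, here is a minimal sketch of passing a descriptor number on the command line; the child program name and its --status-fd switch are hypothetical:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd[2];

    if (pipe(fd) < 0)
        return 1;

    if (fork() == 0) {
        char arg[16];

        close(fd[0]);                  /* child keeps only the write end */
        snprintf(arg, sizeof arg, "%d", fd[1]);
        execlp("child-prog", "child-prog", "--status-fd", arg, (char *)NULL);
        _exit(127);                    /* reached only if exec failed */
    }

    close(fd[1]);                      /* parent reads status lines from fd[0] */
    /* ... read from fd[0], then wait for the child ... */
    return 0;
}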
Is there any way in Linux, using C, to generate a diff/patch of two files stored in memory, using a common format (i.e. unified diff, as produced by the command-line diff utility)?
I'm working on a system where I generate two text files in memory, and no external storage is available, or desired. I need to create a line-by-line diff of the two files, and since they are mmap()ed, they don't have file names, preventing me from simply calling system("diff file1.txt file2.txt").
I have file descriptors (fds) available for use, and that's my only entry point to the data. Is there any way to generate a diff/patch by comparing the two open files? If the implementation is MIT/BSD licensed (i.e. non-GPL), so much the better.
Thank you.
On Linux you can use the /dev/fd/ pseudo filesystem (a symbolic link to /proc/self/fd). Use snprintf() to construct the path for each file descriptor, e.g. snprintf(path1, PATH_MAX, "/dev/fd/%d", fd1) (ditto for fd2), and run diff on the two paths.
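A sketch of that approach, assuming fd1 and fd2 are open, positioned at the start of their data, and not marked close-on-exec (the helper name is mine):

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Run diff -u on two already-open file descriptors via /dev/fd. */
void diff_fds(int fd1, int fd2)
{
    char path1[PATH_MAX], path2[PATH_MAX];
    char cmd[2 * PATH_MAX + 16];

    snprintf(path1, sizeof path1, "/dev/fd/%d", fd1);
    snprintf(path2, sizeof path2, "/dev/fd/%d", fd2);

    /* The shell spawned by system() inherits fd1 and fd2,
       so diff can open them through the pseudo filesystem. */
    snprintf(cmd, sizeof cmd, "diff -u %s %s", path1, path2);
    system(cmd);
}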
Considering the requirements, the best option would be to implement your own in-memory diff -au. You could perhaps adapt the relevant parts of OpenBSD's diff to your needs.
Here's an outline of how you can use the /usr/bin/diff command via pipes to obtain the unified diff between two strings stored in memory (a compressed sketch of the child-side setup follows the outline):
Create three pipes: I1, I2, and O.
Fork a child process.
In the child process:
Move the read ends of pipes I1 and I2 to descriptors 3 and 4, and the write end of pipe O to descriptor 1.
Close the other ends of those pipes in the child process. Open descriptor 0 for reading from /dev/null, and descriptor 2 for writing to /dev/null.
Execute execl("/usr/bin/diff", "diff", "-au", "/proc/self/fd/3", "/proc/self/fd/4", (char *)NULL); (the list terminator passed to execl() must be an explicit null pointer, hence the cast).
This executes the diff binary in the child process. It will read the inputs from the two pipes, I1 and I2, and output the differences to pipe O.
The parent process closes the read ends of the I1 and I2 pipes, and the write end of the O pipe.
The parent process writes the comparison data to the write ends of I1 and I2 pipes, and reads the differences from the read end of the O pipe.
Note that the parent process must use select() or poll() or a similar method (preferably with nonblocking descriptors) to avoid deadlock. (Deadlock occurs if both parent and child try to read at the same time, or write at the same time.) Typically, the parent process must avoid blocking at all costs, because that is likely to lead to a deadlock.
When the input data has been completely written, the parent process must close the respective write end of the pipe, so that the child process detects the end-of-input. (Unless an error occurs, the write ends must be closed before the child process closes its end of the O pipe.)
When the parent process notices that no more data is available in the O pipe (read() returning 0), either it has already closed the write ends of the I1 and I2 pipes, or there was an error. If there is no error, the data transfer is complete, and the child process can be reaped.
The parent process reaps the child using e.g. waitpid(). Note that if there were any differences, diff returns with exit status 1.
You can use a fourth pipe to receive the standard error stream from the child process; diff does not normally output anything to standard error.
You can use a fifth pipe, its write end marked close-on-exec (the FD_CLOEXEC flag, set with fcntl()) in the child, to detect execl() errors. The close-on-exec flag means the descriptor is closed when another binary is executed, so the parent process can detect the successful start of the diff command by seeing end-of-data on the read end (read() returning 0). If the execl() fails, the child can e.g. write the errno value (as a decimal number, or as an int) to this pipe, so that the parent process can read the exact cause of the failure.
In all, the complete method (that both records standard error, and detects exec errors) uses 10 descriptors. This should not be an issue in a normal application, but may be important -- for example, consider an internet-facing server with descriptors used by incoming connections.
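A compressed sketch of the child-side setup described above; the parent's select()/poll() loop and the optional stderr and exec-failure pipes are left out for brevity. Each int[2] argument is a pipe from pipe(), index 0 being the read end and index 1 the write end:

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Fork and exec diff -au reading from pipes i1 and i2, writing to pipe o.
   Returns the child PID (error handling elided). */
pid_t spawn_diff(int i1[2], int i2[2], int o[2])
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                     /* parent, or fork error */

    /* Park the descriptors the child needs at 10 or above first,
       so the dup2() targets (1, 3, 4) cannot collide with them. */
    int r1 = fcntl(i1[0], F_DUPFD, 10);
    int r2 = fcntl(i2[0], F_DUPFD, 10);
    int w  = fcntl(o[1],  F_DUPFD, 10);

    /* Close every original pipe descriptor inherited from the parent. */
    close(i1[0]); close(i1[1]);
    close(i2[0]); close(i2[1]);
    close(o[0]);  close(o[1]);

    /* Move the inputs to descriptors 3 and 4, the output to 1. */
    dup2(r1, 3); close(r1);
    dup2(r2, 4); close(r2);
    dup2(w, 1);  close(w);

    /* Point descriptor 0 at /dev/null for reading, 2 for writing;
       open() returns the lowest free descriptor, hence the close() first. */
    close(0); open("/dev/null", O_RDONLY);
    close(2); open("/dev/null", O_WRONLY);

    execl("/usr/bin/diff", "diff", "-au",
          "/proc/self/fd/3", "/proc/self/fd/4", (char *)NULL);
    _exit(127);                         /* reached only if execl() fails */
}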
So, I was given this one line script:
echo test | cat | grep test
Could you please explain to me how exactly that would work given the following system calls: pipe(), fork(), exec() and dup2()?
I am looking for a general overview here, mainly the sequence of operations.
What I know so far is that the shell forks using fork(), and the child's code is replaced with that of each command using exec(). But what about pipe() and dup2()? How do they fall into place?
Thanks in advance.
First consider a simpler example, such as:
echo test | cat
What we want is to execute echo in a separate process, arranging for its standard output to be diverted into the standard input of the process executing cat. Ideally this diversion, once setup, would require no further intervention by the shell — the shell would just calmly wait for both processes to exit.
The mechanism to achieve that is called the "pipe". It is an interprocess communication device implemented in the kernel and exported to user space. Once created by a Unix program, a pipe has the appearance of a pair of file descriptors with the peculiar property that, if you write into one of them, you can read the same data from the other. This is not very useful within the same process, but keep in mind that file descriptors, including but not limited to pipes, are inherited across fork() and even across exec(). This makes the pipe an easy-to-set-up and reasonably efficient IPC mechanism.
The shell creates the pipe, and now owns a set of file descriptors belonging to the pipe, one for reading and one for writing. These file descriptors are inherited by both forked subprocesses. Now, if only echo were writing to the pipe's write-end descriptor instead of to its actual standard output, and if cat were reading from the pipe's read-end descriptor instead of from its standard input, everything would work. But they don't, and this is where dup2 comes into play.
dup2 duplicates a file descriptor as another file descriptor, automatically closing the target descriptor beforehand. For example, dup2(15, 1) will close file descriptor 1 (by convention used for the standard output), and reopen it as a copy of file descriptor 15, meaning that writing to the standard output will in fact be equivalent to writing to file descriptor 15. The same applies to reading: dup2(8, 0) will make reading from file descriptor 0 (the standard input) equivalent to reading from file descriptor 8. If we proceed to close the original file descriptor, the open file (or pipe) will have been effectively moved from the original descriptor to the new one, much like sci-fi teleports that work by first duplicating a piece of matter at a remote location and then disintegrating the original.
If you're still following the theory, the order of operations performed by the shell should now be clear:
The shell creates a pipe and then forks two processes, both of which inherit the pipe's file descriptors, r (the read end) and w (the write end).
In the subprocess about to execute echo, the shell calls dup2(w, 1); close(w) before exec in order to redirect the standard output to the write end of the pipe.
In the subprocess about to execute cat, the shell calls dup2(r, 0); close(r) in order to redirect the standard input to the read end of the pipe.
After forking, the main shell process must itself close both ends of the pipe. One reason is to free up resources associated with the pipe once the subprocesses exit. The other is to allow cat to actually terminate: a pipe's reader will receive EOF only after all copies of the write end of the pipe are closed. In the steps above, we did close the child's redundant copy of the write end, the file descriptor w, right after its duplication to 1. But the file descriptor w also exists in the parent, because it was inherited under that number there, and can only be closed by the parent. Failing to do that leaves cat's standard input never reporting EOF, and the cat process hanging as a consequence.
This mechanism is easily generalized to three or more processes connected by pipes. In the case of three processes, the pipes need to arrange for echo's output to go to cat's input, and for cat's output to go to grep's input. This requires two calls to pipe(), three calls to fork(), four dup2()-and-close() pairs (one each for echo and grep, two for cat), three calls to exec(), and four additional calls to close() in the parent (two for each pipe).
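For concreteness, a minimal sketch of the two-process case (echo test | cat), with error handling mostly elided; the fixed numbers 15 and 8 from the illustration above are replaced by whatever pipe() returns:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];                  /* fd[0] = read end (r), fd[1] = write end (w) */

    if (pipe(fd) < 0) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {          /* child 1: echo test */
        dup2(fd[1], 1);         /* standard output becomes the write end */
        close(fd[0]);
        close(fd[1]);
        execlp("echo", "echo", "test", (char *)NULL);
        _exit(127);
    }
    if (fork() == 0) {          /* child 2: cat */
        dup2(fd[0], 0);         /* standard input becomes the read end */
        close(fd[0]);
        close(fd[1]);
        execlp("cat", "cat", (char *)NULL);
        _exit(127);
    }

    close(fd[0]);               /* the parent must close both ends, */
    close(fd[1]);               /* or cat never sees EOF */
    while (wait(NULL) > 0)
        continue;
    return 0;
}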
My program does the following in chronological order
The program is started with root permissions.
Among other tasks, a file readable only with root permissions is open()ed.
Root privileges are dropped.
Child processes are spawned with clone() and the CLONE_FILES | CLONE_FS | CLONE_IO flags set, which means that while they use separate regions of virtual memory, they share the same file descriptor table (and other IO stuff).
All child processes execve() their own programs (the FD_CLOEXEC flag is not used).
The original program terminates.
Now I want every spawned program to read the contents of the aforementioned file, but after they all have read the file, I want it to be closed (for security reasons).
One possible solution I'm considering now is having a step 3a where the fd of the file is dup()licated once for every child process, and each child gets its own fd (as an argv). Then every child program would simply close() their fd, so that after all fds pointing to the file are close()d the "actual file" is closed.
But does it work that way? And is it safe to do this (i.e. is the file really closed)? If not, is there another/better method?
While using dup() as I suggested above is probably just fine, I've now (a day after asking this SO question) realized that there is a nicer way to do this, at least from the point of view of thread safety.
All dup()licated file descriptors share the same file position indicator, which of course means you run into trouble when multiple threads/processes might simultaneously change the file position during read operations (even if your own code does so in a thread-safe way, the same doesn't necessarily go for libraries you depend on).
So wait, why not just call open() multiple times (once for every child) on the needed file before dropping root? From the manual of open():
A call to open() creates a new open file description, an entry in the system-wide table of open files. This entry records the file offset and the file status flags (modifiable via the fcntl(2) F_SETFL operation). A file descriptor is a reference to one of these entries; this reference is unaffected if pathname is subsequently removed or modified to refer to a different file. The new open file description is initially not shared with any other process, but sharing may arise via fork(2).
Could be used like this:
int fds[CHILD_C];

for (int i = 0; i < CHILD_C; i++) {
    fds[i] = open("/foo/bar", O_RDONLY);
    // check for errors here
}

drop_privileges();
// etc
Then every child gets a reference to one of those fds through argv and does something like:
FILE *stream = fdopen(atoi(argv[FD_STRING_I]), "r")
read whatever needed from the stream
fclose(stream) (this also closes the underlying file descriptor)
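Put together, a minimal sketch of the child side, assuming the descriptor number arrives as the first program argument (standing in for argv[FD_STRING_I]):

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1;

    /* Wrap the inherited descriptor in a stdio stream. */
    FILE *stream = fdopen(atoi(argv[1]), "r");
    if (stream == NULL) {
        perror("fdopen");
        return 1;
    }

    char buf[4096];
    while (fgets(buf, sizeof buf, stream)) {
        /* ... use the privileged data ... */
    }

    fclose(stream);   /* also closes the underlying file descriptor */
    return 0;
}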
Disclaimer: According to a bunch of tests I've run this is indeed safe and sound. I have however only tested open()ing with O_RDONLY. Using O_RDWR or O_WRONLY may or may not be safe.
I have two programs, let's say prog1 and prog2. I am opening a file with prog1 and doing some operations on it. Now, without closing the file in prog1, I am sending its file descriptor to prog2 using Unix sockets, which then does some operations on it.
Though I get the same descriptor value that I passed from prog1, doing an fstat() on the fd received in prog2 throws an error saying "Bad file descriptor". I have opened the file in prog1 with the correct permissions, that is, read and write for all, and still I get an error.
Why is this happening? If my way of passing a file descriptor is wrong, then please suggest a correct one.
I believe this site has what you're looking for:
http://www.lst.de/~okir/blackhats/node121.html
There's also information in Linux's man 7 unix on using SCM_RIGHTS and other features of Unix sockets.
Fix for broken link: http://web.archive.org/web/20131016032959/http://www.lst.de/~okir/blackhats/node121.html
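In outline, the sending side attaches the descriptor as SCM_RIGHTS ancillary data on a sendmsg() call, and the kernel installs a duplicate of it in the receiver's descriptor table. A minimal sketch (helper names are mine; note that at least one byte of normal data must accompany the ancillary data):

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send the open descriptor fd across the Unix-domain socket sock. */
int send_fd(int sock, int fd)
{
    char dummy = '\0';
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {                       /* ensures correct cmsg alignment */
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = u.buf, .msg_controllen = sizeof u.buf,
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor; returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = u.buf, .msg_controllen = sizeof u.buf,
    };
    int fd;

    if (recvmsg(sock, &msg, 0) < 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET
                     || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;                    /* the kernel placed a duplicate here */
}

The descriptor number the receiver gets back will generally differ from the sender's, but both refer to the same open file description; this is also why fstat() on the raw integer value in another process fails.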
This is normal. Each program has its own file descriptors.
EDIT: Well, it seems that you can pass a file descriptor using a local (Unix-domain) socket.
You can see descriptors in /proc/PID/fd; they are often symlinks to your files. What you can do with a Unix socket is pass an open descriptor from one program to another with sendmsg()/recvmsg(). See this question for more detail.
But there are many better ways to write concurrently to a file: you can use a FIFO, shared memory (shm), or even just pass your offset position between your two programs.
A file descriptor is a small int value that lets you access a file. It's an index into a file descriptor table, a data structure in the kernel that's associated with each individual process. A process cannot do anything meaningful with a file descriptor from another process, since it has no access to any other process's file descriptor table.
This is for basic security reasons. If one process were able to perform operations on an open file belonging to another process, chaos would ensue. Also, a file descriptor just doesn't contain enough information to do what you're trying to do; one process's file descriptor 0 (stdin) might refer to a completely different file than another process's file descriptor 0. And even if they happen to be the same file, each process needs to maintain its own information about the state of that open file (how much it's read/written, etc.).
If you'll describe what you're trying to accomplish, perhaps we can help.
EDIT:
You want to pass data from one program to another. The most straightforward way to do this is to create a pipe (man 2 pipe). Note that the second process will have to be a child of the first.
An alternative might be to create a file that the second process can open and read (not trying to share a file descriptor), or you might use sockets.
I'm working on a multi-process program which basically performs fuzzification on each layer of an RVB (RGB) file, one process per layer. Each child process delivers a temp file by using the function tmpfile(). After each child process finishes its job, the main process has to read each temp file created and assemble the data. The problem is that I don't know how to read each temp file inside the main process, since I can't access the child process's memory and therefore can't know the temporary FILE pointer to the temp file created!
Any idea?
Don't hesitate to ask for clarifications if needed.
The tmpfile() function returns you a FILE pointer to a file with no determinate name - indeed, even the child process cannot readily determine a name for the file, let alone the parent (and on many Unix systems, the file has no name; it has been unlinked before tmpfile() returns to the caller).
extern FILE *tmpfile(void);
So, you are using the wrong temporary file creation primitive if you must convey file names around.
You have a number of options:
Have the parent process create the file streams with tmpfile() so that both the parent and children share the files. There are some minor coordination issues to handle - the parent will need to seek back to the start before reading what the children wrote, and it should only do that after the child has exited. (A minimal sketch of this option follows the list.)
Use one of the filename generating primitives instead - mkstemp() is good, and if you need a FILE pointer instead of a file descriptor, you can use fdopen() to create one. You are still faced with the problem of getting file names from children to parent; again, the parent could open the files, or you can use a pipe for each child, or some shared memory, or ... take your pick of IPC mechanisms.
Have the parent open a pipe for each child before forking. The child process closes the read end of the pipe and writes to the write end; the parent closes the write end of the pipe and arranges to read from the read end. The issue here with multiple children is that the capacity of any given pipe is finite (and can be quite small; historically about 5 KiB, 64 KiB on modern Linux). Consequently, you need to ensure the parent reads all the pipes completely, bearing in mind that the children won't be able to exit until all the data has been read (strictly, all but the last buffer full has been read).
Consider using threads - but be aware of the coordination issues using threads.
Decide that you do not need to use multiple threads of control - whether processes or threads - but simply have the main program do the work. This eliminates coordination and IPC issues - it does mean you won't benefit from the multi-core processor on the machine.
Of these, assuming parallel execution is of paramount importance, I'd probably use pipes to get the file names from the children (option 2); it has the fewest coordination issues. But for simplicity, I'd go with 'main program does it all' (option 5).
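A minimal sketch of option 1, with a single child for brevity: the parent creates the stream before forking, the child writes and exits, and the parent rewinds and reads.

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    FILE *tmp = tmpfile();      /* nameless file, shared across fork() */
    if (tmp == NULL) {
        perror("tmpfile");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {             /* child: write one layer's results */
        fprintf(tmp, "layer data\n");
        fflush(tmp);            /* push the data to the shared file */
        _exit(0);
    }

    waitpid(pid, NULL, 0);      /* read only after the child has exited */
    rewind(tmp);                /* seek back to the start */

    char line[256];
    while (fgets(line, sizeof line, tmp))
        fputs(line, stdout);
    fclose(tmp);
    return 0;
}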
If you call tmpfile() in the parent process, the child will inherit all open descriptors and will be able to write to the file, and the open file will be accessible to the parent as well.
You could create a tempfile in the parent process and then fork, then have the child process use that.
The child process can send back the file descriptor to the parent process.
EDIT: there is example code on the APUE site (src.tar.gz: apue.2e/lib, recvfd.c and sendfd.c).
Use threads instead of subprocesses? Put the names of the temporary files in another file? Don't use random names for the temp files, but (for example) names based on the pid of the parent process (to allow several instances of your program to run simultaneously) plus a sequential number?