For my master's thesis project I am building an API in C that works with Unix sockets. In short, I have two sockets, identified by their two fds, on which I have called a non-blocking (O_NONBLOCK) connect(). At this point, I am calling select() to check which one connects first and is ready for writing.
The problems start now, as the application which is using this API is aware of only one of those sockets, let's say the one identified by fd1. If the socket identified by fd2 is the first to connect, the application has no way to know it can write to that socket.
I think my best options are using dup() and/or dup2(), but according to their man page, dup() creates a copy of the fd passed to the function, but which refers to the same open file description, meaning that the two can be used interchangeably, and dup2() closes the new fd which replaces the old fd.
So my assumptions on what would happen are (in pseudo code)
int fd1, fd2, fd3;
fd1 = socket(x); // what the app is aware of
fd2 = socket(y); // first to connect
fd3 = dup(fd1); // fd1 and fd3 identify the same description
dup2(fd2, fd1); // The description identified by fd2 is now identified by fd1, the description previously identified by fd1 (and fd3) is closed
dup2(fd3, fd2); // The description identified by fd3 (copy of fd1, closed in the line above) is identified by fd2 (which can be closed and reassigned to fd3) since now the description that was being identified by fd2 is being identified by fd1.
Which looks fine, except for the fact that the first dup2() closes fd1, which closes also fd3 since they are identifying the same file description. The second dup2() works fine but it's replacing the fd of a connection which has been closed by the first one, while I want it to keep trying to connect.
Can anyone with a better understanding of Unix file descriptors help me out?
EDIT: I want to elaborate a little bit more on what the API does and why the application only sees one fd.
The API provides the application with the means to call a very "fancy" version of connect(), select() and close().
When the application calls api_connect(), it passes to the function a pointer to an int (together with all the necessary addresses, protocols, etc.). api_connect() will call socket(), bind() and connect(); the important part is that it will write the return value of socket() into the memory passed through the pointer. This is what I mean by "The socket is only aware of one fd". The application will then call FD_SET(fd1, write_set), call api_select() and then check if the fd is writable by calling FD_ISSET(fd1, write_set). api_select() works more or less like select(), but has a timer which can trigger a timeout if the connection takes more than a set amount of time to connect (since it's O_NONBLOCK). If this happens, api_select() creates a new connection on a different interface (calling all the necessary socket(), bind() and connect()). This connection is identified by a new fd -fd2- the application doesn't know about, and which is tracked in the API.
Now, if the application calls api_select() with FD_SET(fd1, write_set) and the API realises that it is the second connection that has completed, thus making fd2 writable, I want the application to use fd2. The problem is that the application will only call FD_ISSET(fd1, write_set) and write(fd1) afterwards; that's why I need to replace fd2 with fd1.
At this point I'm really confused on whether I really need to dup or just do an integer swap (my understanding of Unix file descriptors is just a little bit more than basic).
I think my best options are using dup() and/or dup2(), but according
to their man page, dup() creates a copy of the fd passed to the
function, but which refers to the same open file description,
Yes.
meaning
that the two can be used interchangeably,
Maybe. It depends on what you mean by "interchangeably".
and dup2() closes the new fd
which replaces the old fd.
dup2() closes the target file descriptor, if it is open, before duping the source descriptor onto it. Perhaps that's what you meant, but I'm having trouble reading your description that way.
So my assumptions on what would happen are (excuse my crappy pseudo
code)
int fd1, fd2, fd3;
fd1 = socket(x); // what the app is aware of
fd2 = socket(y); // first to connect
fd3 = dup(fd1); // fd1 and fd3 identify the same description
Good so far.
dup2(fd2, fd1); // The description identified by fd2 is now identified by fd1, the description previously identified by fd1 (and fd3) is closed
No, the comment is incorrect. File descriptor fd1 is first closed, and then made to be a duplicate of fd2. The underlying open file description to which fd1 originally referred is not closed, because the process has another open file descriptor associated with it, fd3.
dup2(fd3, fd2); // The description identified by fd3 (copy of fd1, closed in the line above) is identified by fd2 (which can be closed and reassigned to fd3) since now the description that was being identified by fd2 is being identified by fd1.
Which looks fine, except for the fact that the first dup2() closes
fd1,
Yes it does.
which closes also fd3
No it doesn't.
since they are identifying the same file
description.
Irrelevant. Closing is a function on file descriptors, not, directly, on the underlying open file descriptions. In fact, it would be best not to use the word "identifying" here, for that suggests that file descriptors are some kind of identifier or alias for open file descriptions. They are not. File descriptors identify entries in a table of associations with open file descriptions, but are not themselves open file descriptions.
In short, your sequence of dup(), dup2(), and dup2() calls should effect exactly the kind of swap you want, provided that they all succeed. They do, however, leave an extra open file descriptor hanging around, which would yield a file descriptor leak under many circumstances. Therefore, don't forget to finish up with a
close(fd3);
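Putting the whole sequence together with error checking, a minimal sketch might look like the following. The helper name swap_fds() is made up for illustration; it is not part of your API or of any library.

```c
#include <unistd.h>

/* Hypothetical helper: swap the open file descriptions behind fd_a and fd_b.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 * Note: if the second dup2() fails, both descriptors are left referring to
 * fd_b's original description. */
int swap_fds(int fd_a, int fd_b)
{
    int tmp = dup(fd_a);            /* tmp keeps fd_a's description open */
    if (tmp == -1)
        return -1;

    if (dup2(fd_b, fd_a) == -1 ||   /* fd_a now refers to fd_b's description */
        dup2(tmp, fd_b) == -1) {    /* fd_b now refers to fd_a's old description */
        close(tmp);
        return -1;
    }
    return close(tmp);              /* drop the temporary so it does not leak */
}
```

After swap_fds(fd1, fd2) succeeds, a write(fd1, ...) goes to the connection that used to be reachable through fd2, and the other connection attempt stays open under fd2.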
Of course, all that assumes that it is the value of fd1 that is special to the application, not the variable containing it. File descriptors are just numbers. There is nothing inherently special about the objects that contain them, so if it is the variable fd1 that the application needs to use, regardless of its specific value, then all you need to do is perform an ordinary swap of integers:
fd3 = fd1;
fd1 = fd2;
fd2 = fd3;
With respect to the edit, you write,
When the application calls api_connect(), it passes to the function a
pointer to an int (together with all the necessary addresses and
protocols etc). api_connect() will call socket(), bind() and
connect(), the important part is that it will write the return value
of socket() in the memory passed through the pointer.
Whether api_connect() returns the file descriptor value by writing it through a pointer or by conveying it as or in the function's return value is irrelevant. The point remains that it is the value that matters, not the object, if any, containing it.
This is what I
mean by "The socket is only aware of one fd". The application will
then call FD_SET(fd1, write_set), call api_select() and then check
if the fd is writable by calling FD_ISSET(fd1, write_set).
Well that sounds problematic in light of the rest of your description.
[Under some conditions,]
api_select() creates a new connection on a different interface
(calling all the necessary socket(), bind() and connect()). This
connection is identified by a new fd -fd2- the application doesn't
know about, and which is tracked in the API.
Now, if the application calls api_select() with FD_SET(fd1, write_set)
and the API realises that it is the second connection that has completed,
thus making fd2 writable, I want the application to use fd2. The
problem is that the application will only call FD_ISSET(fd1,
write_set) and write(fd1) afterwards, that's why I need to replace fd2
with fd1.
Do note that even if you do swap file descriptors as described in the first part of this answer, that will have no effect on either FD's membership in any fd_set, for such membership is logical, not physical. You will have to manage fd_set membership manually if the caller relies on that.
It is unclear to me whether api_select() is intended to provide services for more than one (caller-specified) file descriptor at the same time, as select() can do, but I imagine that the bookkeeping required for it to do so would be monstrous. On the other hand, if in fact the function handles only one caller-provided FD at a time, then mimicking the interface of select() is ... odd.
In that case, I would strongly urge you to design a more suitable interface. Among other things, such an interface should moot the question of swapping FDs. Instead, it can directly tell the caller what FD, if any, is ready for use, either by returning it or by writing it through a pointer to a variable specified by the caller.
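As a rough illustration of such an interface, here is a sketch built on poll(). The name api_wait_connected(), the fixed limit of 8 candidates, and the return convention are all assumptions made for the example, not anything taken from your API.

```c
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical replacement for api_select(): wait until one of the candidate
 * sockets has finished connecting and return that descriptor.
 * Returns -1 on timeout, on error, or if the only sockets that became ready
 * had failed to connect (a fuller version would keep waiting on the rest). */
int api_wait_connected(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[8];

    if (nfds < 1 || nfds > 8)
        return -1;
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLOUT;   /* writable means connect() has finished */
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)nfds, timeout_ms) <= 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        int err = 0;
        socklen_t len = sizeof err;

        if (!(pfds[i].revents & POLLOUT))
            continue;
        /* A non-blocking connect() reports its outcome through SO_ERROR. */
        if (getsockopt(fds[i], SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0)
            return fds[i];
    }
    return -1;
}
```

With something like this, the caller simply writes to whichever descriptor is returned, and no descriptor swapping is needed at all.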
Also, in the event that you do switch, one way or another, to an alternative FD, do not overlook managing the old one lest you leak a file descriptor. Each process has a pretty limited quantity of those available, so a file descriptor leak can be much more troublesome than a memory leak. In the event that you do switch, then, are you sure you really need to swap, as opposed to just dup2()ing the new FD onto the old, then closing the new?
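If a swap turns out to be unnecessary, the simpler variant is short. This is only a sketch, and replace_fd() is a hypothetical name:

```c
#include <unistd.h>

/* Hypothetical helper: make app_fd refer to the description behind new_fd,
 * then release new_fd. Whatever app_fd referred to before is closed by
 * dup2() (the description itself goes away only if no other descriptor
 * still refers to it). Returns 0 on success, -1 on failure. */
int replace_fd(int app_fd, int new_fd)
{
    if (app_fd == new_fd)
        return 0;                   /* nothing to do, and closing would be wrong */
    if (dup2(new_fd, app_fd) == -1)
        return -1;
    return close(new_fd);           /* the application only needs app_fd */
}
```

Here replace_fd(fd1, fd2) abandons the slower connection attempt and leaves the application holding the winning connection under the number it already knows.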
I want to vfork() a child process, but have its stdout be different than the parent's stdout.
The obvious way to achieve this with fork() would be to dup2() (and close() the original file descriptor) in the child after forking.
Let's say I have the file descriptor ready before calling vfork() and just need to call these two system calls before calling an exec*() function. Am I allowed to do that?
The answer is probably yes.
According to the Linux manual, a call to vfork() is equivalent to calling clone(2) with flags specified as:
CLONE_VM | CLONE_VFORK | SIGCHLD
Note from clone(2):
CLONE_FILES (since Linux 2.0)
If CLONE_FILES is set, the calling process and the child process share the same file descriptor table. Any file descriptor created by the calling process or by the child process is also valid in the other process. Similarly, if one of the processes closes a file descriptor, or changes its associated flags (using the fcntl(2) F_SETFD operation), the other process is also affected.
If CLONE_FILES is not set, the child process inherits a copy of all file descriptors opened in the calling process at the time of clone(). (The duplicated file descriptors in the child refer to the same open file descriptions (see open(2)) as the corresponding file descriptors in the calling process.) Subsequent operations that open or close file descriptors, or change file descriptor flags, performed by either the calling process or the child process do not affect the other process.
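Since vfork() does not set CLONE_FILES, the child gets its own copy of the file descriptor table, so a dup2() or close() performed in the child does not affect the parent's descriptors.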
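A minimal sketch of the pattern on Linux follows. The function name spawn_with_stdout() is invented for the example. Keep in mind that POSIX historically allowed a vfork() child to do little more than call _exit() or an exec function, so this relies on the Linux behaviour described above; posix_spawn() with file actions is the portable alternative.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: run the program at argv[0] with its stdout redirected
 * to out_fd. Returns the child's pid, or -1 if vfork() failed. */
pid_t spawn_with_stdout(int out_fd, char *const argv[])
{
    pid_t pid = vfork();

    if (pid == 0) {
        /* Child: system calls only, no changes to the parent's variables. */
        if (dup2(out_fd, STDOUT_FILENO) == -1)
            _exit(127);
        if (out_fd != STDOUT_FILENO)
            close(out_fd);          /* closes the child's copy only */
        execv(argv[0], argv);
        _exit(127);                 /* reached only if execv() failed */
    }
    return pid;
}
```

The parent still has to close its own copy of out_fd if it no longer needs it, and should reap the child with waitpid().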
I want to use a named FIFO and implement a timeout when I write to it.
fd = open(pipe, O_WRONLY);
write(fd, msg, len);
The program blocks in open(), so using select() will not work.
Thanks.
Use select() and its timeout argument.
Read pipe(7), fifo(7), poll(2)
You might set up a timer or an alarm with a signal handler (see time(7) & signal(7)) before your call to open(2), though I wouldn't do that. Alternatively, you could use the O_NONBLOCK flag, since fifo(7) says:
A process can open a FIFO in nonblocking mode. In this case, opening
for read-only will succeed even if no-one has opened on the write
side yet, opening for write-only will fail with ENXIO (no such device
or address) unless the other end has already been opened.
However, you need something (some other process reading) on the other side of the FIFO or pipe.
Perhaps you should consider using unix(7) sockets, i.e. the AF_UNIX address family. It looks more relevant to your case: change your code above (trying to open a FIFO for writing) to an AF_UNIX socket on the client side (with a connect), and change the other process to become an AF_UNIX socket server.
As 5gon12eder commented, you might also look into inotify(7). Or perhaps even D-Bus!
I'm guessing that FIFOs or pipes are not the right solution in your situation. You should explain more and give a broader picture of your concerns and goals.
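If you do stay with a FIFO, the O_NONBLOCK route quoted from fifo(7) can be turned into a timeout by retrying while open() fails with ENXIO. This is a sketch only: the function name, the 10 ms polling interval, and the use of ETIMEDOUT are choices made for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/* Hypothetical helper: try to open a FIFO for writing, giving up after
 * roughly timeout_ms. Returns the descriptor, or -1 with errno set
 * (ETIMEDOUT if no reader opened the FIFO in time).
 * The returned descriptor is still in non-blocking mode. */
int open_fifo_writer(const char *path, int timeout_ms)
{
    const struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    for (int waited = 0; ; waited += 10) {
        int fd = open(path, O_WRONLY | O_NONBLOCK);
        if (fd != -1)
            return fd;              /* a reader is present */
        if (errno != ENXIO)
            return -1;              /* a real error, not "no reader yet" */
        if (waited >= timeout_ms) {
            errno = ETIMEDOUT;
            return -1;
        }
        nanosleep(&pause, NULL);    /* wait a little before retrying */
    }
}
```

Once the descriptor is open, select() or poll() with a timeout can be used before each write(), since the blocking open() is no longer in the way.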
Like so:
if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
...
Though I've read man fcntl, I can't figure out what it does.
It sets the close-on-exec flag for the file descriptor, which causes the file descriptor to be automatically (and atomically) closed when any of the exec-family functions succeed.
It also tests the return value to see if the operation failed, which is rather useless if the file descriptor is valid, since there is no condition under which this operation should fail on a valid file descriptor.
It marks the file descriptor so that it will be close()d automatically when the process or any children it fork()s calls one of the exec*() family of functions. This is useful to keep from leaking your file descriptors to random programs run by e.g. system().
Note that the use of the O_CLOEXEC flag with open(2) is essential in some multithreaded programs, because using a separate fcntl(2) F_SETFD operation to set the FD_CLOEXEC flag does not suffice to avoid race conditions where one thread opens a file descriptor and attempts to set its close-on-exec flag using fcntl(2) at the same time as another thread does a fork(2) plus execve(2). Depending on the order of execution, the race may lead to the file descriptor returned by open() being unintentionally leaked to the program executed by the child process created by fork(2).
(This kind of race is, in principle, possible for any system call that creates a file descriptor whose close-on-exec flag should be set, and various other Linux system calls provide an equivalent of the O_CLOEXEC flag to deal with this problem.)
On Linux, if we call a blocking recv() from one thread and close() on the same socket from another thread, recv() doesn't return.
Why?
The "why" is simply that that's how it works, by design.
Within the kernel, the recv() call has called fget() on the struct file corresponding to the file descriptor, and this will prevent it from being deallocated until the corresponding fput().
You will simply have to change your design. It is inherently racy anyway: for this to happen, you must have no locking protecting the file descriptor in userspace, which means that the close() could have happened just before the recv() call, and the file descriptor could even have been reused for something else.
If you want to wake up another thread that's blocking on a file descriptor, you should have it block on select() instead, with a pipe included in the file descriptor set that can be written to by the main thread.
Check that all file descriptors for the socket have been closed. If any remain open at the "remote end" (assuming this is the one you attempt to close), the "peer has not performed an orderly shutdown".
If this still doesn't work, call shutdown(sock, SHUT_RDWR) on the remote end; this will shut the socket down regardless of reference counts.