Epoll: does it silently remove fds? - c

I've been reading through libev's source code and stumbled upon this comment:
a) epoll silently removes fds from the fd set. as nothing tells us that an fd has been removed otherwise, we have to continually "rearm" fds that we suspect might have changed (same problem with kqueue, but much less costly there).
I've been doing some tests with epoll (directly using syscalls) on some modern linux kernel and I couldn't reproduce it. I didn't see any problem with "silently disappearing fds". Could someone elaborate on this and tell me if it's still an issue?

The comment is rather vague, but I guess it just means that if the descriptor is closed elsewhere, it is silently removed from the set. From the Linux man pages, epoll(7):
Q6 Will closing a file descriptor cause it to be removed from all
epoll sets automatically?
A6 Yes, but be aware of the following point. A file descriptor is
a reference to an open file description (see open(2)). Whenever
a descriptor is duplicated via dup(2), dup2(2), fcntl(2) F_DUPFD,
or fork(2), a new file descriptor referring to the same open file
description is created. An open file description continues to
exist until all file descriptors referring to it have been
closed. A file descriptor is removed from an epoll set only
after all the file descriptors referring to the underlying open
file description have been closed (or before if the descriptor
is explicitly removed using epoll_ctl(2) EPOLL_CTL_DEL). This
means that even after a file descriptor that is part of an epoll
set has been closed, events may be reported for that file
descriptor if other file descriptors referring to the same
underlying file description remain open.
So you have a socket with fd 42. It gets closed, and is subsequently removed from the epoll object. But the kernel doesn't notify libev about this through epoll_wait. Now epoll_modify is called again with fd = 42. epoll_modify doesn't know whether this file descriptor 42 is the same one that was already in the epoll object, or a different open file description that happens to reuse descriptor number 42.
One could also argue that the comments are just ranting and the design of the libev API is at fault here.
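The behaviour described in the epoll(7) quote can be observed with a short test. The following is only a minimal sketch, assuming Linux; the function name events_after_close is made up for this illustration. It registers the read end of a pipe, closes that descriptor, and checks whether epoll_wait() still reports an event depending on whether a dup() of the descriptor remains open.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read end of a pipe with epoll,
 * then closes that descriptor. If keep_dup is nonzero, a dup() of it
 * stays open. Returns how many events epoll_wait() reports, or -1 on
 * a setup error. */
int events_after_close(int keep_dup)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };
    if (epfd == -1 ||
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1 ||
        write(p[1], "x", 1) != 1) {     /* make the read end readable */
        if (epfd != -1)
            close(epfd);
        close(p[0]);
        close(p[1]);
        return -1;
    }

    int copy = dup(p[0]);   /* second descriptor, same open file description */
    close(p[0]);            /* the registered descriptor number is now closed */
    if (copy != -1 && !keep_dup) {
        close(copy);        /* last reference gone: kernel drops the entry */
        copy = -1;
    }

    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, 100);   /* 100 ms timeout */

    if (copy != -1)
        close(copy);
    close(p[1]);
    close(epfd);
    return n;
}
```

With keep_dup set, the call should report one event even though the registered descriptor is closed; with keep_dup clear, the entry has been removed and epoll_wait() should time out with 0, without any notification that the removal happened.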

Related

open(2) a file from existing descriptor

Background
I have multiple threads in the same process that are all installing fcntl(2) locks on a given file. These locks must block, thus to achieve intra-process blocking I must use Open file description locks (or OFD locks, see fcntl(2)). And it is documented that:
Open file description locks placed via the same open file
description (i.e., via the same file descriptor, or via a
duplicate of the file descriptor created by fork(2), dup(2),
fcntl() F_DUPFD, and so on) are always compatible: if a new lock
is placed on an already locked region, then the existing lock is
converted to the new lock type. (Such conversions may result in
splitting, shrinking, or coalescing with an existing lock as
discussed above.)
On the other hand, open file description locks may conflict with
each other when they are acquired via different open file
descriptions. Thus, the threads in a multithreaded program can
use open file description locks to synchronize access to a file
region by having each thread perform its own open(2) on the file
and applying locks via the resulting file descriptor.
Thus, when a thread is booting up, it must open its own descriptor via open. It should be noted that the "main thread" already has the file open, and threads come and go throughout the process's lifetime.
Question
So I was thinking, is there any way I can re-use an existing file descriptor to open a separate descriptor to the same file without dup(2)?
In other words, if I had file descriptor A, but did not know the filename, could I open descriptor B pointing to the same file that A refers to?
My first instinct is as follows, where fd is the original file descriptor and fd2 is the "deep cloned" descriptor.
char buff[500];
snprintf(buff, sizeof buff, "/proc/%d/fd/%d", getpid(), fd);
fd2 = open(buff, O_RDWR);
However, it feels dirty. I was hoping there is a system call to do this.
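For reference, the /proc approach from the question can be wrapped and checked as below. This is only a sketch, assuming Linux with /proc mounted; reopen_fd and offsets_are_independent are names invented for this illustration, and the temporary path is arbitrary. The check relies on the fact that a separate open file description has its own file offset, which a dup() would share.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: reopen the file behind fd through /proc/self/fd
 * (Linux-specific). Returns a new descriptor, or -1 on error. */
int reopen_fd(int fd, int flags)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/self/fd/%d", fd);
    return open(path, flags);
}

/* Returns 1 if the reopened descriptor has its own file offset,
 * 0 if the offset is shared, -1 on a setup error. */
int offsets_are_independent(void)
{
    char name[64];
    snprintf(name, sizeof name, "/tmp/reopen-demo-%ld", (long)getpid());
    unlink(name);                       /* ignore failure: file may not exist */

    int fd = open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return -1;

    int result = -1;
    int fd2 = -1;
    if (write(fd, "hello", 5) == 5 && (fd2 = reopen_fd(fd, O_RDWR)) != -1) {
        /* fd has advanced by 5 bytes; a fresh description starts at 0 */
        result = lseek(fd, 0, SEEK_CUR) == 5 && lseek(fd2, 0, SEEK_CUR) == 0;
        close(fd2);
    }
    close(fd);
    unlink(name);
    return result;
}
```

Because fd2 refers to a different open file description, OFD locks taken through it should conflict with locks taken through fd, per the fcntl(2) text quoted above.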

Is it necessary to remove all file descriptors in the interest list before closing the epoll instance itself?

Provided that:
I have created an epoll instance epfd by epoll_create, and registered many regular file descriptors by EPOLL_CTL_ADD.
I want to close the epoll instance by close(epfd)
The manual page doesn't say whether I must EPOLL_CTL_DEL all file descriptors before close(epfd).
So, my question is:
Is it necessary to remove all file descriptors in the interest list before closing the epoll instance itself?
No, it's not necessary to manually remove them.
From the epoll_create(2) manpage (emphasis added)
When all file descriptors referring to an epoll instance have been closed, the kernel destroys the instance and releases the associated resources for reuse.
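A minimal sketch of that, assuming Linux (the function name close_epoll_without_del is invented for this illustration): the epoll instance is closed without any EPOLL_CTL_DEL, and the registered descriptor is still usable afterwards.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the registered pipe still works after
 * the epoll instance is closed without EPOLL_CTL_DEL, 0 if it does not,
 * and -1 on a setup error. */
int close_epoll_without_del(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };
    if (epfd == -1 || epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1) {
        if (epfd != -1)
            close(epfd);
        close(p[0]);
        close(p[1]);
        return -1;
    }

    int rc = close(epfd);   /* no EPOLL_CTL_DEL beforehand */

    /* The pipe itself is untouched by closing the epoll instance. */
    char c = 0;
    int ok = rc == 0 &&
             write(p[1], "x", 1) == 1 &&
             read(p[0], &c, 1) == 1 &&
             c == 'x';

    close(p[0]);
    close(p[1]);
    return ok;
}
```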

Will a descriptor that has been closed before being put in a fd_set be selected?

If while I put a connection descriptor in the rdset used in the select syscall the client closes that descriptor so it is already closed when the select is called, will it still be selected and a read/write on it return 0?
Or will it remain in the set and never be selected?
Possibly neither. The most likely result is that the select() call fails, returning -1 and setting errno to EBADF. Per POSIX, this indicates that
One or more of the file descriptor sets specified a file descriptor that is not a valid open file descriptor.
The Linux manual page for select(2) gives a file descriptor that was already closed as a specific example of a bad file descriptor. However, the Linux manual also documents a bug that Linux select() ignores FDs in the provided fdsets that are greater than any that the process currently has open. On Linux, then, you cannot rely on select() failing in your scenario, but if select() does not fail then it will never select the file descriptor in question.
POSIX select() requires a closed file descriptor to generate an error:
ERRORS
Under the following conditions, pselect() and select() shall fail and set errno to:
[EBADF]
One or more of the file descriptor sets specified a file descriptor that is not a valid open file descriptor.
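The EBADF case can be reproduced with a small test. This is a sketch only; the function name select_errno_for_closed_fd is invented here. It keeps a higher-numbered descriptor open on purpose, so the closed descriptor does not fall into the range that the Linux implementation is documented to ignore.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical demo: puts an already-closed descriptor into the read set
 * and returns the errno that select() reports (0 if select() did not
 * fail, -1 on a setup error). */
int select_errno_for_closed_fd(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int target = p[0];
    /* Keep a descriptor numbered above target open. */
    int guard = fcntl(p[1], F_DUPFD, target + 1);
    if (guard == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(target);          /* target is now an invalid descriptor */

    fd_set rd;
    FD_ZERO(&rd);
    FD_SET(target, &rd);
    struct timeval tv = { 0, 0 };       /* poll, do not wait */

    int rc = select(target + 1, &rd, NULL, NULL, &tv);
    int err = (rc == -1) ? errno : 0;

    close(p[1]);
    close(guard);
    return err;
}
```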

What are the unwanted side effects of open in ioctl?

According to man ioctl, opening file descriptors with open may cause unwanted side effects. The manual also states that opening with O_NONBLOCK avoids those side effects, but I can't find the reason for that, nor what the actual side effects are. Can someone shed light on that? With ioctl, is it always possible and equivalent* to open file descriptors with O_NONBLOCK?
NOTES (from man ioctl)
In order to use this call, one needs an open file descriptor. Often the open(2) call has unwanted side effects, that can be avoided
under Linux by giving it the O_NONBLOCK flag.
(* I am aware of what O_NONBLOCK implies, but I don't know if that affects ioctl calls the same way it affects other syscalls. My program, which uses ioctl to write and read from an SPI bus, works perfectly with that flag enabled.)
The obvious place to look for the answer would be the open(2) man page, which, under the heading for O_NONBLOCK, says:
When possible, the file is opened in nonblocking mode.
Neither the open() nor any subsequent operations on the file
descriptor which is returned will cause the calling process to
wait.
[...]
For the handling of FIFOs (named pipes), see also fifo(7).
For a discussion of the effect of O_NONBLOCK in conjunction
with mandatory file locks and with file leases, see fcntl(2).
OK, that wasn't very informative, but let's follow the links and see what the manual pages for fifo(7) and fcntl(2) say:
Normally, opening the FIFO blocks until the other end is opened also.
A process can open a FIFO in nonblocking mode. In this case, opening
for read-only will succeed even if no-one has opened on the write
side yet and opening for write-only will fail with ENXIO (no such
device or address) unless the other end has already been opened.
Under Linux, opening a FIFO for read and write will succeed both in
blocking and nonblocking mode. POSIX leaves this behavior undefined.
So here's at least one "unwanted side effect": even just trying to open a FIFO may block, unless you pass O_NONBLOCK to the open() call (and open it for reading).
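The nonblocking FIFO behaviour quoted from fifo(7) can be checked directly. The following is a minimal sketch; fifo_nonblock_open is a name made up for this example and the /tmp path is arbitrary. It creates a fresh FIFO with no process on the other end and tries a nonblocking open in the requested access mode.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: creates a FIFO that nobody else has open and opens
 * it with access_mode | O_NONBLOCK. Returns 0 if the open succeeded,
 * the errno value if it failed, or -1 if the FIFO could not be created. */
int fifo_nonblock_open(int access_mode)
{
    char path[64];
    snprintf(path, sizeof path, "/tmp/fifo-demo-%ld", (long)getpid());
    unlink(path);                       /* ignore failure: may not exist */

    if (mkfifo(path, 0600) == -1)
        return -1;

    int result = 0;
    int fd = open(path, access_mode | O_NONBLOCK);
    if (fd == -1)
        result = errno;
    else
        close(fd);

    unlink(path);
    return result;
}
```

Calling fifo_nonblock_open(O_RDONLY) should return 0 immediately, while fifo_nonblock_open(O_WRONLY) should return ENXIO because no reader has the FIFO open. Without O_NONBLOCK, either open would block until the other end is opened.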
What about fcntl, then? As the open(2) man page notes, the sections to look under are those titled "Mandatory locking" and "File leases". It seems to me that mandatory locks, in this case, are a red herring, though: they only cause blocking when one tries to actually read from or write to the file:
If a process tries to perform an incompatible
access (e.g., read(2) or write(2)) on a file region that has an
incompatible mandatory lock, then the result depends upon whether the
O_NONBLOCK flag is enabled for its open file description. If the
O_NONBLOCK flag is not enabled, then the system call is blocked until
the lock is removed or converted to a mode that is compatible with
the access.
What about leases, then?
When a process (the "lease breaker") performs an open(2) or
truncate(2) that conflicts with a lease established via F_SETLEASE,
the system call is blocked by the kernel and the kernel notifies the
lease holder by sending it a signal (SIGIO by default). [...]
Once the lease has been voluntarily or forcibly removed or
downgraded, and assuming the lease breaker has not unblocked its
system call, the kernel permits the lease breaker's system call to
proceed.
[...] If
the lease breaker specifies the O_NONBLOCK flag when calling open(2),
then the call immediately fails with the error EWOULDBLOCK, but the
other steps still occur as described above.
OK, so that's another unwanted side effect: if the file you're trying to open has a lease on it, the open() call may block until the lease holder has released the lease.
In both cases, the "unwanted side effect" avoided by passing O_NONBLOCK to open() is, unsurprisingly, the open() call itself blocking until some other process has done something. If there are any other kinds of side effects that the man page you cite refers to, I'm not aware of them.
From Linux System Programming, 2nd Edition, by R. Love (emphasis mine):
O_NONBLOCK
If possible, the file will be opened in nonblocking mode. Neither the open() call, nor any other operation will cause the process to block (sleep) on the I/O. This behaviour may be defined only for FIFOs.
Sometimes, programmers do not want a call to read() to block when there is no data available. Instead, they prefer that the call return immediately, indicating that no data is available. This is called nonblocking I/O; it allows applications to perform I/O, potentially on multiple files, without ever blocking, and thus missing data available in another file.
Consequently, an additional errno value is worth checking: EAGAIN.
See this thread on the kernel mailing list.
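The EAGAIN case mentioned in the quote can be sketched as follows. This is an illustration only; the function name nonblocking_read_errno is invented, and a pipe stands in for whatever descriptor is being read. The descriptor is switched to nonblocking mode with fcntl() and then read while no data is available.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: reads from an empty pipe whose read end is in
 * nonblocking mode. Returns the errno from read() (0 if read() did not
 * fail, -1 on a setup error). */
int nonblocking_read_errno(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int flags = fcntl(p[0], F_GETFL);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    char c;
    ssize_t n = read(p[0], &c, 1);      /* returns at once: nothing to read */
    int err = (n == -1) ? errno : 0;

    close(p[0]);
    close(p[1]);
    return err;
}
```

The write end is kept open during the read on purpose: if it were closed, read() would return 0 (end of file) rather than failing with EAGAIN.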

How to create blocking file descriptor in unix?

I would like to create blocking and non-blocking file in Unix's C. First, blocking:
fd = open("file.txt", O_CREAT | O_WRONLY | O_EXCL);
Is that right? Shouldn't I add some mode options, like 0666 for example?
How about non-blocking file? I have no idea for this.
I would like to achieve something like:
when I open it to write in it, and it's opened for writing, it's ok; if not it blocks.
when I open it to read from it, and it's opened for reading, it's ok; if not it blocks.
File descriptors are blocking or non-blocking; files are not. Add O_NONBLOCK to the options in the open() call if you want a non-blocking file descriptor.
Note that opening a FIFO for reading or writing will block unless there's a process with the FIFO open for the other operation, or you specify O_NONBLOCK. If you open it for read and write, the open() is non-blocking (will return promptly); I/O operations are still controlled by whether you set O_NONBLOCK or not.
The updated question is not clear. However, if you're looking for 'exclusive access to the file' (so that no-one else has it open), then neither O_EXCL nor O_NONBLOCK is the answer. O_EXCL affects what happens when you create the file; the create will fail if the file already exists. O_NONBLOCK affects whether a read() operation will block when there's no data available to read. If you read the POSIX open() description, there is nothing there that allows you to request 'exclusive access' to a file.
To answer the question about file mode: if you include O_CREAT, you need the third argument to open(). If you omit O_CREAT, you don't need the third argument to open(). It is a varargs function:
int open(const char *filename, int options, ...);
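As a rough sketch of both points (the mode argument and the O_EXCL behaviour), with function names and the /tmp path invented for this example: the first exclusive create succeeds, and a second one on the same path fails with EEXIST.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: creates path exclusively. O_CREAT is given, so
 * the third (mode) argument is supplied. Returns 0 on success, or the
 * errno value on failure. */
int create_exclusive(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0666);
    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}

/* Returns 1 if the first create succeeds and the second one fails with
 * EEXIST, otherwise 0. */
int create_twice_demo(void)
{
    char path[64];
    snprintf(path, sizeof path, "/tmp/excl-demo-%ld", (long)getpid());
    unlink(path);                       /* ignore failure: may not exist */

    int first = create_exclusive(path);
    int second = create_exclusive(path);

    unlink(path);
    return first == 0 && second == EEXIST;
}
```

Note that O_EXCL only guards creation; once the file exists, any other process can still open it without O_CREAT | O_EXCL.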
I don't know what you are calling a blocking file (blocking IO in Unix means that the IO operations wait for the data to be available or for a sure failure, they are opposed to non-blocking IO which returns immediately if there is no available data).
You always need to specify a mode when opening with O_CREAT.
The open you show will fail if the file already exists (once fixed for the above point).
Unix has no standard way to lock a file for exclusive access other than that. There are advisory locks (but all programs must respect the protocol). Some systems have a mandatory locking extension. The received wisdom is not to rely on either kind of locking when accessing a network file system.
Shouldn't I add some mode options?
You should. Whenever O_CREAT is specified, open() expects a third argument (the mode) as well, so omitting it results in undefined behavior.
Edit:
The updated question is even more confusing...
when I open it to write in it, and it's opened for writing, it's ok; if not it blocks.
Why would you need that? See, if you try to write to a file/file descriptor not opened for writing, write() will return -1 and you can check the error code stored in errno. Tell us what you're trying to achieve by this bizarre thing you want instead of overcomplicating and messing up your code.
(Remarks in parentheses:
I would like to create blocking and non-blocking file
What's that?
in unix's C
Again, there's no such thing. There is the C language, which is platform-independent.)
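To illustrate the point about write() failing rather than blocking, here is a minimal sketch (the function name is invented, and /dev/null is used only as a convenient file that always exists): writing to a descriptor that was opened read-only returns -1 immediately and sets errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: writes to a descriptor opened with O_RDONLY.
 * Returns the errno from write() (0 if write() did not fail, -1 if the
 * file could not be opened). */
int write_errno_on_readonly_fd(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd == -1)
        return -1;

    ssize_t n = write(fd, "x", 1);      /* fd is not open for writing */
    int err = (n == -1) ? errno : 0;

    close(fd);
    return err;
}
```

The returned value should be EBADF, which is the error write() reports for a descriptor that is not open for writing.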

Resources