How to create blocking file descriptor in unix? - c

I would like to create blocking and non-blocking file in Unix's C. First, blocking:
fd = open("file.txt", O_CREAT | O_WRONLY | O_EXCL);
is that right? Shouldnt I add some mode options, like 0666 for example?
How about non-blocking file? I have no idea for this.
I would like to achieve something like:
when I open it to write in it, and it's opened for writing, it's ok; if not it blocks.
when I open it to read from it, and it's opened for reading, it's ok; if not it blocks.

File descriptors are blocking or non-blocking; files are not. Add O_NBLOCK to the options in the open() call if you want a non-blocking file descriptor.
Note that opening a FIFO for reading or writing will block unless there's a process with the FIFO open for the other operation, or you specify O_NBLOCK. If you open it for read and write, the open() is non-blocking (will return promptly); I/O operations are still controlled by whether you set O_NBLOCK or not.
The updated question is not clear. However, if you're looking for 'exclusive access to the file' (so that no-one else has it open), then neither O_EXCL nor O_NBLOCK is the answer. O_EXCL affects what happens when you create the file; the create will fail if the file already exists. O_NBLOCK affects whether a read() operation will block when there's no data available to read. If you read the POSIX open() description, there is nothing there that allows you to request 'exclusive access' to a file.
To answer the question about file mode: if you include O_CREAT, you need the third argument to open(). If you omit O_CREAT, you don't need the third argument to open(). It is a varargs function:
int open(const char *filename, int options, ...);

I don't know what you are calling a blocking file (blocking IO in Unix means that the IO operations wait for the data to be available or for a sure failure, they are opposed to non-blocking IO which returns immediately if there is no available data).
You always need to specify a mode when opening with O_CREAT.
The open you show will fails if the file already exists (when fixed for the above point).
Unix has no standard way to lock file for exclusive access excepted that. There are advisory locks (but all programs must respect the protocol). Some have mandatory lock extension. The received wisdom is not to rely on either kind of locking when accessing network file system.

Shouldn't I add some mode options?
You should, if the file is write-only and to be created if nonexistent. In this case, open() expects a third argument as well, so omitting it results in undefined behavior.
Edit:
The updated question is even more confusing...
when I open it to write in it, and it's opened for writing, it's ok; if not it blocks.
Why would you need that? See, if you try to write to a file/file descriptor not opened for writing, write() will return -1 and you can check the error code stored in errno. Tell us what you're trying to achieve by this bizarre thing you want instead of overcomplicating and messing up your code.
(Remarks in parentheses:
I would like to create blocking and non-blocking file
What's that?
in unix's C
Again, there's no such thing. There is the C language, which is platform-independent.)

Related

Epoll: does it silently remove fds?

I've been reading through libev's source code and stumbled upon this comment:
a) epoll silently removes fds from the fd set. as nothing tells us that an fd has been removed otherwise, we have to continually "rearm" fds that we suspect might have changed (same problem with kqueue, but much less costly there).
I've been doing some tests with epoll (directly using syscalls) on some modern linux kernel and I couldn't reproduce it. I didn't see any problem with "silently disappearing fds". Could someone elaborate on this and tell me if it's still an issue?
This is rather vague text there, but I guess it is just that if the descriptor is closed elsewhere, it is silently removed from the set. From Linux manpages, epoll(7):
Q6 Will closing a file descriptor cause it to be
removed from all epoll sets automatically?
A6 Yes, but be aware of the following point. A file
descriptor is a reference to an open file
description (see open(2)). Whenever a descriptor
is duplicated via dup(2), dup2(2), fcntl(2)
F_DUPFD, or fork(2), a new file descriptor refer‐
ring to the same open file description is cre‐
ated. An open file description continues to
exist until all file descriptors referring to it
have been closed. A file descriptor is removed
from an epoll set only after all the file
descriptors referring to the underlying open file
description have been closed (or before if the
descriptor is explicitly removed using
epoll_ctl(2) EPOLL_CTL_DEL). This means that
even after a file descriptor that is part of an
epoll set has been closed, events may be reported
for that file descriptor if other file descrip‐
tors referring to the same underlying file
description remain open.
So you have a socket with fd 42. It gets closed, and subsequently removed from the epoll object. But the kernel doesn't notify the libev about this through epoll_wait. Now the epoll_modify is called again with fd = 42. epoll_modify doesn't know whether this file descriptor 42 the same that already was in the epoll object or some other file description with the file descriptor number 42 reused.
One could also argue that the comments are just ranting and the design of the libev API is at fault here.

Use a FILE * in epoll

I am opening another process in my application using popen and parsing its output. I want to get notified as soon as the program has made any output. Currently all stuff in my program uses epoll for such actions. Now popen does return me a FILE * instead of a fd. Is it save to use the fileno function and put the resulting fd into epoll? If not, is there another way? I do not want the process to block, thats why I want the notifications.
Yes it is safe, as long as you don't do any extra thing with the file descriptor without being careful it seems to be safe. One more thing, is that you probably want to mark the file descriptor as non-blocking too. So that you can read() without blocking and handle EBUSY or EAGAIN checking errno.

What are the unwanted side effects of open in ioctl?

According to man ioctl, opening file descriptors with open may cause unwanted side-effects. The manual also states that opening with O_NONBLOCK solves those unwanted issues but I can't seem to find what's the reason for that, nor what are the actual side-effects. Can someone shed light into that? With ioctl is it always possible and equivalent* to open file descriptors with O_NONBLOCK?
NOTES (from man ioctl)
In order to use this call, one needs an open file descriptor. Often the open(2) call has unwanted side effects, that can be avoided
under Linux by giving it the O_NONBLOCK flag.
(* I am aware of what O_NONBLOCK implies, but I don't know if that affects ioctl calls the same way it affects other syscalls. My program, which uses ioctl to write and read from an SPI bus, works perfectly with that flag enabled.)
The obvious place to look for the answer would be the open(2) man page, which, under the heading for O_NONBLOCK, says:
When possible, the file is opened in nonblocking mode.
Neither the open() nor any subsequent operations on the file
descriptor which is returned will cause the calling process to
wait.
[...]
For the handling of FIFOs (named pipes), see also fifo(7).
For a discussion of the effect of O_NONBLOCK in conjunction
with mandatory file locks and with file leases, see fcntl(2).
OK, that wasn't very informative, but let's follow the links and see what the manual pages for fifo(7) and fcntl(2) say:
Normally, opening the FIFO blocks until the other end is opened also.
A process can open a FIFO in nonblocking mode. In this case, opening
for read-only will succeed even if no-one has opened on the write
side yet and opening for write-only will fail with ENXIO (no such
device or address) unless the other end has already been opened.
Under Linux, opening a FIFO for read and write will succeed both in
blocking and nonblocking mode. POSIX leaves this behavior undefined.
So here's at least one "unwanted side effect": even just trying to open a FIFO may block, unless you pass O_NONBLOCK to the open() call (and open it for reading).
What about fcntl, then? As the open(2) man page notes, the sections to look under are those title "Mandatory locking" and "File leases". It seems to me that mandatory locks, in this case, are a red herring, though — they only cause blocking when one tries to actually read from or write to the file:
If a process tries to perform an incompatible
access (e.g., read(2) or write(2)) on a file region that has an
incompatible mandatory lock, then the result depends upon whether the
O_NONBLOCK flag is enabled for its open file description. If the
O_NONBLOCK flag is not enabled, then the system call is blocked until
the lock is removed or converted to a mode that is compatible with
the access.
What about leases, then?
When a process (the "lease breaker") performs an open(2) or
truncate(2) that conflicts with a lease established via F_SETLEASE,
the system call is blocked by the kernel and the kernel notifies the
lease holder by sending it a signal (SIGIO by default). [...]
Once the lease has been voluntarily or forcibly removed or
downgraded, and assuming the lease breaker has not unblocked its
system call, the kernel permits the lease breaker's system call to
proceed.
[...] If
the lease breaker specifies the O_NONBLOCK flag when calling open(2),
then the call immediately fails with the error EWOULDBLOCK, but the
other steps still occur as described above.
OK, so that's another unwanted side effect: if the file you're trying to open has a lease on it, the open() call may block until the lease holder has released the lease.
In both cases, the "unwanted side effect" avoided by passing O_NONBLOCK to open() is, unsurprisingly, the open() call itself blocking until some other process has done something. If there are any other kinds of side effects that the man page you cite refers to, I'm not aware of them.
From Linux System Programming, 2nd Edition, by R. Love (emphasis mine):
O_NONBLOCK
If possible, the file will be opened in nonblocking mode. Neither the
open() call,
nor any other operation will cause the process to block (sleep) on the I/O.
This
behaviour may be defined only for FIFOs.
Sometimes, programmers do not want a call to read() to block when there is
no data
available. Instead, they prefer that the call return immediately,
indicating that no data
is available. This is called nonblocking I/O; it allows applications to
perform I/O, potentially
on multiple files, without ever blocking, and thus missing data available in
another file.
Consequently, an additional errno value is worth checking: EAGAIN.
See this thread on the kernel mailing list.

Preventing reuse of file descriptors

Is there anyway in Linux (or more generally in a POSIX OS) to guarantee that during the execution of a program, no file descriptors will be reused, even if a file is closed and another opened? My understanding is that this situation would usually lead to the file descriptor for the closed file being reassigned to the newly opened file.
I'm working on an I/O tracing project and it would make life simpler if I could assume that after an open()/fopen() call, all subsequent I/O to that file descriptor is to the same file.
I'll take either a compile-time or run-time solution.
If it is not possible, I could do my own accounting when I process the trace file (noting the location of all open and close calls), but I'd prefer to squash the problem during execution of the traced program.
Note that POSIX requires:
The open() function shall return a file descriptor for the named file
that is the lowest file descriptor not currently open for that
process.
So in the strictest sense, your request will change the program's environment to be no longer POSIX compliant.
That said, I think your best bet is to use the LD_PRELOAD trick to intercept calls to close and ignore them.
You'd have to write a SO that contains a close(2) that opens /dev/null on old FDs, and then use $LD_PRELOAD to load it into process space before starting the application.
You must already be ptraceing the application to intercept its file opening and closing operations.
It would appear trivial to prevent FD re-use by "injecting" dup2(X, Y); close(X); calls into the application, and adjusting Y to be anything you want.
However, the application itself could be using dup2 to force a re-use of previously closed FD, and may not work if you prevent that, so I think you'll just have to deal with this in post-processing step.
Also, it's quite easy to write an app that will run out of FDs if you disallow re-use.

File Descriptor Assignment in C

When sockets are created or files are opened/created in C, is the file descriptor that's assigned to the socket/file guaranteed to be the lowest-valued descriptor available? What does the C spec say about file descriptor assignment in this regard, if anything?
It's not guaranteed to be the lowest, and is implementation dependent (1). In general, however, the routine that assigns open file descriptors uses a method that gives you the first open on. It could be that immediately after several lower ones free, leaving you with a higher descriptor than you might expect though.
The only reason I can think of to know this, though, is for the select function, which is sped up if you pass it the highest file descriptor you need to check for.
(1) Note that those implementations that follow the IEEE standard do guarantee the lowest unused descriptor for files, but this may not apply to sockets. Not every implementation follows the IEEE standard for open(), so if you're writing portable software it is best not to depend on it.
I don't think you'll find it in the C spec, more likely the spec for your OS. My experience in Linux has been that it's always the lowest.
I'll counter this with another question - why does this matter? You shouldn't be comparing the file descriptor with anything (unless checking for stdin/stdout/stderr) or doing math with it. As long as it fits in an int (and its guaranteed to) that's all you really need to know.
Steve M is right; C has no notion of sockets, and its file I/O functions use a [pointer to a] FILE object, not a descriptor.
#aib the open(), close(), lseek(), read(), write() all make use of file descriptors. I hardly ever use streams for I/O.
#Kyle it matters because of statements like select(). Knowing the highest descriptor can improve performance.
The C spec says that it's implementation dependent. If you're looking at a Unix implementation, the man page for open(2) says "The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process."
This helps if you're trying to attach a specific file to a specific descriptor. Say you want to redirect stderr to /dev/null. Something like
close(2); open("/dev/null", O_WRONLY);
ought to do it. You should, of course, capture the fd returned by open and ensure that it's 2.

Resources