What would happen if you call read (or write, or both) in two different threads on the same file descriptor (let's say we are interested in both a local file and a socket file descriptor), without explicitly using a synchronization mechanism?
read and write are syscalls, so on a single-core CPU it's probably unlikely that two reads would be executed "at the same time". But with multiple cores...
What will the Linux kernel do?
And let's be a bit more general: is the behavior the same on other kernels (like the BSDs)?
Edit: According to the close documentation, we should make sure that the file descriptor isn't being used by a syscall in another thread. So it seems that explicit synchronization would be required before closing a file descriptor (and therefore also around read/write, if threads that may call them are still running).
Any system-level (syscall) file descriptor access is thread-safe in all mainstream UNIX-like OSes.
Though, depending on their age, they are not necessarily signal-safe.
If you call read, write, accept or similar on a file descriptor from two different tasks then the kernel's internal locking mechanism will resolve contention.
For reads, each byte may only be read once, though, and writes will land in some undefined order.
The stdio library functions fread, fwrite and co. also have internal locking on their control structures by default, though it is possible to disable that with flags.
The comment about close is because it doesn't make a lot of sense to close a file descriptor in any situation in which some other thread might be trying to use it. So while it is 'safe' as far as the kernel is concerned, it can lead to odd, hard to diagnose corner cases.
If a thread closes a file descriptor while a second thread is trying to read from it, the second thread may get an unexpected EBADF error. Worse, if a third thread is simultaneously opening a new file, that might reallocate the same fd, and the second thread might accidentally read from the new file rather than the one it was expecting...
Have a care for those who follow in your footsteps
It's perfectly normal to protect the file descriptor with a mutex (or semaphore). It removes any dependence on kernel behaviour, so your message boundaries are now certain. You then don't have to cite the last paragraph at the bottom of a 15,489-line manpage which explains why the mutex isn't necessary (I exaggerate, but you get my meaning).
It also makes it clear to anyone reading your code that the file descriptor is being used by more than one thread.
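As a minimal sketch of that approach (C with POSIX threads; the descriptor, the mutex, and write_message are placeholder names for illustration, not anything from the discussion above), every thread takes the same mutex around its write so whole messages stay together:

    #include <pthread.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical shared state: one descriptor, one mutex guarding it. */
    static int shared_fd;                                /* opened elsewhere */
    static pthread_mutex_t fd_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Write one complete message under the lock, looping over short writes. */
    static int write_message(const char *msg)
    {
        size_t len = strlen(msg), off = 0;
        int rc = 0;

        pthread_mutex_lock(&fd_lock);
        while (off < len) {
            ssize_t n = write(shared_fd, msg + off, len - off);
            if (n < 0) { rc = -1; break; }               /* real code: check EINTR */
            off += (size_t)n;
        }
        pthread_mutex_unlock(&fd_lock);
        return rc;
    }

If every thread that touches shared_fd (including the one that eventually closes it) goes through the same mutex, this also addresses the close() hazard mentioned earlier.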
Fringe Benefit
There is a fringe benefit to using a mutex that way. Suppose you've got different messages coming from the different threads and some of those messages are more important than others. All you need to do is set the thread priorities to reflect their messages' importance. That way the OS will ensure that your messages will be sent in order of importance for minimal effort on your part.
The result would depend on how the threads are scheduled to run at that particular instant in time.
One way to avoid undefined behavior with multi-threading is to treat the shared file descriptor the same way you would treat shared memory, e.g. a linked list being updated or a variable being changed.
If you use a mutex, semaphore, lock, or some other synchronization mechanism, it should work as intended.
Related
Suppose two different processes open the same file independently, and so have different entries in the system-wide open file table, but those entries refer to the same i-node entry.
As the file descriptors refer to different entries in the system-wide open file table, they may have different file offsets. Is there any chance of a race condition during write, given that the file offsets are different? And how does the kernel avoid it?
Book: The Linux Programming Interface; page 95; Chapter 5 (File I/O: Further Details); Section 5.4
(I'm assuming, because you used write(), that the question refers to POSIX systems.)
Each write() operation on such a system is supposed to be fully atomic.
Per POSIX 7's 2.9.7 Thread Interactions with Regular File Operations:
All of the following functions shall be atomic with respect to each other in the effects specified in POSIX.1-2017 when they operate on regular files or symbolic links:
chmod()
chown()
close()
creat()
dup2()
fchmod()
fchmodat()
fchown()
fchownat()
fcntl()
fstat()
fstatat()
ftruncate()
lchown()
link()
linkat()
lseek()
lstat()
open()
openat()
pread()
read()
readlink()
readlinkat()
readv()
pwrite()
rename()
renameat()
stat()
symlink()
symlinkat()
truncate()
unlink()
unlinkat()
utime()
utimensat()
utimes()
write()
writev()
If two threads each call one of these functions, each call shall either see all of the specified effects of the other call, or none of them. The requirement on the close() function shall also apply whenever a file descriptor is successfully closed, however caused (for example, as a consequence of calling close(), calling dup2(), or of process termination).
But pay particular attention to the specification for write() (emphasis mine, on the word "attempt"):
The write() function shall attempt to write nbyte bytes ...
POSIX says that write() calls to a file shall be atomic. POSIX does not say that the write() calls will be complete. Here's a Linux bug report where a signal was interrupting a write() that was partially complete. Note the explanation:
Now this is perfectly valid behavior as far as spec (POSIX, SUS,...) is concerned (please correct me if I'm missing something). So I'd say the program is incorrect. But OTOH I agree that this was not possible before a50527b1 and we don't want to break userspace. I'd hate to revert that commit since it allows us to interrupt processes doing large writes (especially when something goes wrong) but if you explain to us why this behavior is a problem for you then I guess I'll have to revert it.
That's all but admitting that there's a POSIX requirement for write() calls to be atomic, if not complete, with an offer to revert to the earlier behavior where the write() calls apparently were all also complete in this same circumstance.
Note, though, there are lots of file systems out there that don't conform to POSIX standards.
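Because a write() may legitimately return a short count, callers that need the whole buffer out usually wrap it in a loop. A minimal sketch (plain POSIX C; the name full_write is made up for illustration):

    #include <errno.h>
    #include <unistd.h>

    /* Keep calling write() until the whole buffer has been written,
     * retrying on EINTR and propagating any other error. */
    ssize_t full_write(int fd, const void *buf, size_t count)
    {
        const char *p = buf;
        size_t left = count;

        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n < 0) {
                if (errno == EINTR)
                    continue;              /* interrupted, nothing written this pass */
                return -1;                 /* real error */
            }
            p += n;
            left -= (size_t)n;
        }
        return (ssize_t)count;
    }

Note that once you loop like this, the overall message is no longer a single atomic write(), so concurrent writers that must not interleave still need external synchronization or locking.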
As the file descriptors refer to the different entries in the Open file table (system-wide), then they may have different file offset. Will be there any chance for race condition during write as the file offset is different?
Any write() in Linux can return a short count, for example due to a signal being delivered to a userspace handler. For simplicity, let's ignore that, and only consider what happens to the successfully written data.
There are two scenarios:
The regions written to do not overlap.
(For example, one process writes 100 bytes starting at offset 23, and another writes 50 bytes starting at offset 200.)
There is no race condition in this case.
The regions written to do overlap.
(For example, one process writes 100 bytes starting at offset 50, and another writes 10 bytes starting at offset 70.)
There is a race condition. It is impossible to predict (without advisory locks etc.) the order in which the data gets updated.
Depending on the target filesystem, and if the writes are large enough (so that paging effects can be observed), the two writes may even be "mixed" (in page-sized chunks) in Linux on some filesystems on machines with more than one hardware thread, even though POSIX says this shouldn't happen.
Normally, writes go through the Linux page cache. It is possible for one of the processes to have opened the file with O_DIRECT | O_SYNC, bypassing the page cache. In that case, there are many additional corner cases that can occur. Specifically, even if you use a shared clock source, and can show that the normal/page-cached write completed before the direct write call was made, it may still be possible for the page-cached write to overwrite the direct write contents.
And how does the kernel avoid it?
It doesn't. Why should it? POSIX says each write is atomic, but there is no practical way to avoid a race condition relying on that alone (and get consistent and expected results).
Userspace programs have at least four different methods to avoid such races:
Advisory file locks on the entire open file using the flock() interface.
Advisory file locks on the entire open file using the lockf() interface. In Linux, these are just shorthand for placing/removing fcntl() advisory locks on the entire file.
Advisory record locks on the file using the fcntl() interface. This works even across shared volumes, as long as the file server is configured to support file locking.
Obtaining an exclusive lease on the open file using the fcntl() interface.
Advisory file locks are like street lights: they are intended for co-operating processes to easily determine who gets to go when. However, they do not stop any other process from actually ignoring the "lock" and accessing the file.
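As a minimal sketch of the first method (plain POSIX/Linux C; the path and the name append_record are made up), co-operating processes can serialize their writes with flock():

    #include <fcntl.h>
    #include <sys/file.h>    /* flock() */
    #include <unistd.h>

    /* Append one record to a shared file, serialized by an advisory lock.
     * Only processes that also call flock() on this file are excluded. */
    int append_record(const char *path, const void *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
        if (fd < 0)
            return -1;

        if (flock(fd, LOCK_EX) == 0) {      /* exclusive advisory lock */
            write(fd, buf, len);            /* real code: handle short writes */
            flock(fd, LOCK_UN);
        }
        close(fd);                          /* closing the fd also drops the lock */
        return 0;
    }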
File leases are a mechanism where one or more processes can get a read lease at the same time on the same file, but only one process can get a write lease, and only when that process is the only one having the file open. When granted, the write lease (or exclusive lease) means that if any other process tries to open the same file, the lease owner process is notified by a signal (which you can control using the fcntl() interface), and has a configured time (typically 45 seconds; see man 5 proc and /proc/sys/fs/lease-break-time, in seconds) to relinquish the lease. The opener is blocked in the kernel until the lease is downgraded or the lease break time passes, in which case the kernel breaks the lease.
This allows the lease holder to postpone the opening for a short while.
However, the lease holder cannot block the opening, and cannot e.g. replace the file with a decoy one; the opener already has a hold on the inode, and the lease break time is just a grace period for cleanup work.
Technically, a fifth method would be mandatory file locking, but aside from the kernel's use with respect to executed binaries, it is not used, and is actually buggy in Linux anyway. In Linux, an inode is only locked against modification when it is being executed as a binary by the kernel. (You can still rename or delete the original file, and create a new one, so that any subsequent execs will execute the modified/new data. Attempts to modify a file that is being executed as a binary will fail with the error EBUSY.)
My question is quite simple. Is reading and writing from and to a serial port under Linux thread-safe? Can I read and write at the same time from different threads? Is it even possible to do 2 writes simultaneously? I'm not planning on doing so but this might be interesting for others. I just have one thread that reads and another one that writes.
There is little to find about this topic.
More detail: I am using write() and read() on a file descriptor that I obtained from open(), and I am doing so simultaneously.
Thanks all!
Roel
There are two aspects to this:
What the C implementation does.
What the kernel does.
Concerning the kernel, I'm pretty sure that it will either support this or return an appropriate error; otherwise it would be too easy to exploit. The C implementation of read() is just a syscall wrapper (see what happens after read is called for a Linux socket), so this doesn't change anything. However, I still don't see any guarantees documented there, so this is not reliable.
If you really want two threads, I'd suggest that you stay with the stdio functions (fopen/fread/fwrite/fclose), because there you can leverage the fact that glibc synchronizes these calls with a mutex internally.
However, if you are doing a blocking read in one thread, the other thread could be blocked waiting to write something, which could deadlock. A solution for that is to use select() to detect when there is data ready to be read or buffer space available to be written. This is done in a single thread, but while the initial code is a bit larger, in the end this approach is easier and cleaner, even more so if multiple streams are involved.
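A minimal single-threaded sketch of that idea (plain POSIX C; serial_fd, the buffers, and serial_loop are placeholders):

    #include <sys/select.h>
    #include <unistd.h>

    /* One loop that reads incoming bytes and, while there is data queued to
     * send, also waits for the descriptor to become writable. */
    void serial_loop(int serial_fd, const char *out, size_t out_len)
    {
        size_t sent = 0;
        char in[256];

        for (;;) {
            fd_set rfds, wfds;
            FD_ZERO(&rfds);
            FD_ZERO(&wfds);
            FD_SET(serial_fd, &rfds);
            if (sent < out_len)
                FD_SET(serial_fd, &wfds);

            if (select(serial_fd + 1, &rfds, &wfds, NULL, NULL) < 0)
                break;                           /* real code: retry on EINTR */

            if (FD_ISSET(serial_fd, &rfds)) {
                ssize_t n = read(serial_fd, in, sizeof in);
                if (n <= 0)
                    break;                       /* EOF or error */
                /* ... process n bytes of input here ... */
            }
            if (FD_ISSET(serial_fd, &wfds)) {
                ssize_t n = write(serial_fd, out + sent, out_len - sent);
                if (n > 0)
                    sent += (size_t)n;
            }
        }
    }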
Is there a problem with using pread on the same file descriptor from 2 or more different threads at the same time?
pread itself is thread-safe, since it is not on the list of unsafe functions. So it is safe to call it.
The real question is: what happens if you read from the same file concurrently (not necessarily from two threads, but also from two processes).
Regarding this, the specification says:
The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified.
Note that it doesn't mention ordinary files. This bit relates only to read anyway, because pread cannot be used on unseekable files.
I/O is intended to be atomic to ordinary files and pipes and FIFOs.
But this is from the non-normative section, so your OS might do it differently. E.g., if you read from two threads and there is a concurrent write, you might get different pieces of the write in your two read buffers. But this kind of problem is not specific to multithreading.
Also nice to know that in some cases
read() shall block the calling thread
Not the process, just the thread. And
A thread that has blocked shall not prevent any unblocked thread [...] from eventually making forward progress
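As a small sketch supporting this (C with POSIX threads; the file name, offsets, and struct are made up), two threads can pread disjoint regions of the same descriptor without locking, because pread takes an explicit offset and never moves the shared file position:

    #include <fcntl.h>
    #include <pthread.h>
    #include <unistd.h>

    struct region { int fd; off_t off; size_t len; };

    /* Each thread reads its own region; pread never moves the shared offset. */
    static void *reader(void *arg)
    {
        struct region *r = arg;
        char buf[4096];
        size_t want = r->len < sizeof buf ? r->len : sizeof buf;
        pread(r->fd, buf, want, r->off);   /* real code: check the return value */
        return NULL;
    }

    int main(void)
    {
        int fd = open("data.bin", O_RDONLY);   /* hypothetical file */
        struct region a = { fd, 0,    4096 };
        struct region b = { fd, 8192, 4096 };
        pthread_t t1, t2;

        pthread_create(&t1, NULL, reader, &a);
        pthread_create(&t2, NULL, reader, &b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        close(fd);
        return 0;
    }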
As we are using the same fd, we have to take a lock; otherwise there will be a mix of data from the two pread calls on the file descriptor.
Hence, yes, there is a problem in doing this.
http://linux.die.net/man/2/pread
I'm not 100% sure but I think that the file descriptor structure itself isn't thread safe, so two concurrent changes to it would corrupt it. You need some kind of locking.
What is the behavior of the select(2) function when a file descriptor it is watching for reading is closed by another thread?
From some cursory testing, it does return right away. I suspect the outcome is either that (a) it still continues to wait for data, but if you actually tried to read from it you'd get EBADF (possibly -- there's a potential race) or (b) that it pretends as though the file descriptor were never passed in. If the latter case is true, passing in a single fd with no timeout would cause a deadlock if it were closed.
From some additional investigation, it appears that both dwc and bothie are right.
bothie's answer to the question boils down to: it's undefined behavior. That doesn't mean that it's unpredictable necessarily, but that different OSes do it differently. It would appear that systems like Solaris and HP-UX return from select(2) in this case, but Linux does not based on this post to the linux-kernel mailing list from 2001.
The argument on the linux-kernel mailing list is essentially that it is undefined (and broken) behavior to rely upon. In Linux's case, calling close(2) on the file descriptor effectively decrements a reference count on it. Since there is a select(2) call also holding a reference to it, the fd will remain open and waiting for input until the select(2) returns. This is basically dwc's answer. You will get an event on the file descriptor and then it'll be closed. Trying to read from it will result in an EBADF, assuming the fd hasn't been recycled. (A concern that MarkR raised in his answer, although I think it's probably avoidable in most cases with proper synchronization.)
So thank you all for the help.
I would expect that it would behave as if the end-of-file had been reached, that's to say, it would return with the file descriptor shown as ready but any attempt to read it subsequently would return "bad file descriptor".
Having said that, doing this is very bad practice anyway, as you'd always have potential race conditions: another file descriptor with the same number could be opened by yet another thread immediately after the second thread closed it, and the selecting thread would end up waiting on the wrong one.
As soon as you close a file, its number becomes available for reuse, and may get reused by the next call to open(), socket() etc, even if by another thread. Therefore you really, really need to avoid this kind of thing.
The select system call is a way to wait for file descriptors to change state while the program doesn't have anything else to do. The main use is for server applications, which open a bunch of file descriptors and then wait for anything to do on them (accept new connections, read requests or send the responses). Those file descriptors will be opened in non-blocking I/O mode such that the server process won't hang in a syscall at any time.
This additionally means there is no need for separate threads, because all the work that could be done in the thread can be done prior to the select call as well. And if the work takes long, it can be interrupted: select is called with timeout={0,0}, the file descriptors get handled, and afterwards the work is resumed.
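A rough sketch of that pattern (plain POSIX C; poll_fds_once and the descriptor set are placeholders), using select with a zero timeout so descriptor handling can be interleaved with other work in the same thread:

    #include <sys/select.h>

    /* Check a set of descriptors once, without blocking, so the caller can
     * interleave this with other work in the same thread. */
    int poll_fds_once(int maxfd, const fd_set *watch)
    {
        fd_set ready = *watch;              /* select() modifies the set it is given */
        struct timeval tv = { 0, 0 };       /* timeout = {0,0}: return immediately */

        int n = select(maxfd + 1, &ready, NULL, NULL, &tv);
        if (n > 0) {
            /* ... handle every fd for which FD_ISSET(fd, &ready) is true ... */
        }
        return n;                           /* 0 = nothing ready, -1 = error */
    }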
Now, you close a file descriptor in another thread. Why do you have that extra thread at all, and why should it close the file descriptor?
The POSIX standard doesn't provide any hints about what happens in this case, so what you're doing is UNDEFINED BEHAVIOR. Expect the result to differ widely between operating systems and even between versions of the same OS.
Regards, Bodo
It's a little confusing what you're asking...
Select() should return upon an "interesting" change. If the close() merely decremented the reference count and the file was still open for writing somewhere then there's no reason for select() to wake up.
If the other thread did close() on the only open descriptor then it gets more interesting, but I'd need to see a simple version of the code to see if something's really wrong.
Is fprintf thread-safe? The glibc manual seems to say it is, but my application, which writes to a file using a single call to fprintf(), seems to be intermingling partial writes from different processes.
edit: To clarify, the program in question is a lighttpd plugin, and the server is running with multiple worker threads.
Looking at the file, some of the writes are intermingled.
edit 2: It seems the problem I'm seeing might be due to lighttpd's "worker threads" actually being separate processes: http://redmine.lighttpd.net/wiki/lighttpd/Docs:MultiProcessor
Problems
By running 2 or more processes on the same socket you will have a better concurrency, but will have a few drawbacks that you have to be aware of:
mod_accesslog might create broken access logs, as the same file is opened twice and is NOT synchronized.
mod_status will have n separate counters, one set for each process.
mod_rrdtool will fail as it receives the same timestamp twice.
mod_uploadprogress will not show correct status.
You're confusing two concepts - writing from multiple threads and writing from multiple processes.
Inside a process it's possible to ensure that one invocation of fprintf is completed before the next is allowed access to the output buffer, but once your app pumps that output to a file you're at the mercy of the OS. Without some kind of OS-based locking mechanism you can't ensure that an entirely different application doesn't write to your log file.
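For the within-process half, a minimal sketch (plain C; logfp and log_request are illustrative names) brackets a group of stdio calls with flockfile()/funlockfile() so another thread's output can't land in the middle of them:

    #include <stdio.h>

    /* Emit one multi-part log entry atomically with respect to other threads
     * of the same process writing to the same FILE stream. */
    void log_request(FILE *logfp, const char *method, const char *path, int status)
    {
        flockfile(logfp);                    /* stdio stream lock (recursive) */
        fprintf(logfp, "%s %s -> %d\n", method, path, status);
        fflush(logfp);
        funlockfile(logfp);
    }

This only helps within one process; separate processes writing to the same file still need file locking, as the next answer describes.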
Sounds to me like you need to read up on file locking. The problem you have is that multiple processes (i.e. not threads) are writing to the same file simultaneously and there is no reliable way to ensure the writes will be atomic. This can result in processes overwriting each other's writes, mixed output, and altogether non-deterministic behaviour.
This has nothing to do with thread safety, which is relevant only within a single multithreaded process.
The current C++ standard says nothing useful about concurrency, nor does the 1990 C standard. (I haven't read the 1999 C standard, so can't comment on it; the upcoming C++0x standard does say things, but I don't know exactly what offhand.)
This means that fprintf() itself is likely neither guaranteed to be thread-safe nor guaranteed not to be; it depends on the implementation. I'd read exactly what the glibc documentation says about it, and compare that to exactly what you're doing.