How to issue multithreaded/non-blocking readdir in FUSE - filesystems

Right now, readdir() in FUSE is a blocking method, which means that only one readdir() operation can be in progress at any time. My file system may need to support heavy simultaneous directory operations. Any suggestions?
Thanks

You have to enable multithreaded mode when you mount your FUSE filesystem. It's enabled by default now.
You have to be absolutely sure that your FUSE fs implementation is thread-safe before enabling multithreading.

Related

Can you open a directory without blocking on I/O?

I'm working on a Linux/C application with strict timing requirements. I want to open a directory for reading without blocking on I/O (i.e. succeed only if the information is immediately available in cache). If this request would block on I/O I would like to know so that I can abort and ignore this directory for now. I know that open() has a non-blocking option O_NONBLOCK. However, it has this caveat:
Note that this flag has no effect for regular files and
block devices; that is, I/O operations will (briefly)
block when device activity is required, regardless of
whether O_NONBLOCK is set.
I assume that a directory entry is treated like a regular file. I don't know of a good way to prove/disprove this. Is there a way to open a directory without any I/O blocking?
You could try using the coproc keyword in bash to run a process in the background. Maybe it could work for you.

POSIX named inter-process lock that works with a multi-threaded application?

I need to create a named lock that works correctly with a multi-threaded application on Linux. Each instance of the application could use more than one named lock, with different names.
I know about fcntl/flock, but it doesn't work if I try to lock twice from different threads of one application, or from one thread.
I know about open(..., O_CREAT | O_EXCL), but this lock file will not be removed if the application is killed by SIGKILL or crashes with a segmentation fault, so the lock files have to be removed manually after restarting the application.
Are there any other ways?
If you just need to run under modern Linux, you could use file-private locks (now called open file description locks, F_OFD_SETLK and friends). If that's not an option, you'll have to build your own thread-safe locking abstraction on top of fcntl locks. SQLite is public domain and has implemented that, so you could look at that for inspiration. If GPLed code is okay: OpenJDK has another, incompatible implementation of the same thing.
O_EXCL does not perform locking (beyond the file creation step), so that's usually not helpful.
Other options are System V and POSIX semaphores, but these usually do not work as well as fcntl locks when processes die. A robust, process-shared mutex in a file mapping could be an option as well, but you need to be careful to stay within the POSIX semantics as far as serialization to disk is concerned (basically, you need to reinitialize the mutex every time the application starts from scratch, after a reboot or libc update).
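A minimal sketch of the file-private (open file description) lock approach, assuming Linux 3.15 or later. The helper names named_lock_try and named_lock_release are made up for illustration, and the caller chooses the lock file path. Because the lock belongs to the open file description rather than to the process, a second open() of the same file conflicts with the first even inside one process, and the kernel drops the lock when the descriptor is closed or the process dies.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes F_OFD_SETLK in <fcntl.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37         /* Linux value, in case the headers hide it */
#endif

/* Try to take an exclusive whole-file lock on lock_path without waiting.
 * Returns an fd that holds the lock, or -1 if it is held elsewhere
 * (errno is then EAGAIN or EACCES) or on any other error. */
int named_lock_try(const char *lock_path) {
    int fd = open(lock_path, O_RDWR | O_CREAT | O_CLOEXEC, 0600);
    if (fd < 0)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof fl);  /* l_pid must be 0 for OFD locks */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */

    if (fcntl(fd, F_OFD_SETLK, &fl) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Closing the descriptor releases the lock. */
void named_lock_release(int fd) {
    close(fd);
}
```

Each thread that wants the lock calls named_lock_try() with the same path; only one of them gets a descriptor back until it is released. The lock file itself can stay on disk, since its mere existence means nothing.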

How can I serialize access to a directory in Linux?

Let's say 4 simultaneous processes are running on a processor, and data needs to be copied from an HDFS (used with Spark) file system to a local directory. Now I want only one process to copy that data, while the other processes just wait for that data to be copied by the first process.
So, basically, I want some kind of a semaphore mechanism, where every process tries to obtain the semaphore to try copying the data, but only one process gets the semaphore. All processes that failed to acquire the semaphore would then just wait for the semaphore to be cleared (the process that was able to acquire the semaphore would clear it after it's done with copying), and when it's cleared they know the data has already been copied. How can I do that in Linux?
There are a lot of different ways to implement semaphores. The classical System V way is described in man semop; POSIX semaphores are covered more broadly in man sem_overview.
You might still want to do something more easily scalable and modern. Many IPC frameworks (Apache has one or two of those, too!) have atomic IPC operations. These can be used to implement semaphores, but I'd be very very careful.
Generally, I regularly encourage people who write multi-process or multi-threaded applications to use C++ instead of C. It's often simpler to see where a shared state must be protected if your state is nicely encapsulated in an object which might do its own locking. Hence, I urge you to have a look at Boost's IPC synchronization mechanisms.
In addition to Marcus Müller's answer, you could use some file locking mechanism to synchronize.
File locking might not work very well on networked or remote file systems. You should use it on a locally mounted file system (e.g. Ext4, BTRFS, ...), not on a remote one (e.g. NFS).
For example, you might adopt the convention that your directory contains (or else you'll create it) some .lock file and use an advisory lock flock(2) (or a POSIX lockf(3)) on that .lock file before accessing the directory.
If using flock, you could even lock the directory directly....
The advantage of using such a file lock approach is that you could code shell scripts using flock(1)
And on Linux, you might also use inotify(7) (e.g. to be notified when some file is created in that directory)
Notice that most solutions are advisory, so they presuppose that every process accessing that directory follows some convention (in other words, without more precautions like using flock(1), a careless user could access that directory, e.g. with a plain cp command, or files under it, while your locking process is accessing the directory). If you don't accept that, you might look for mandatory file locking (which is a feature of some Linux kernels & filesystems, AFAIK it is sort-of deprecated).
BTW, you might read more about ACID properties and consider using some database, etc...
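The .lock file convention described above could look roughly like this. It is only a sketch: populate_once, the .lock and .done file names, and the fake_copy placeholder are all invented for the example, and it assumes the directory lives on a local file system. The first process to get the lock does the copy and leaves a .done marker; everyone else blocks in flock() and then sees the marker.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Run work(dir) at most once across all cooperating processes.
 * Returns 1 if this caller did the work, 0 if it was already done,
 * and -1 on error (including a failed work() call). */
int populate_once(const char *dir, int (*work)(const char *dir)) {
    char lock_path[4096], done_path[4096];
    snprintf(lock_path, sizeof lock_path, "%s/.lock", dir);
    snprintf(done_path, sizeof done_path, "%s/.done", dir);

    int fd = open(lock_path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {   /* blocks while another process holds it */
        close(fd);
        return -1;
    }

    int result = 0;
    if (access(done_path, F_OK) != 0) {      /* nobody finished the copy yet */
        result = -1;
        if (work(dir) == 0) {
            int marker = open(done_path, O_WRONLY | O_CREAT, 0644);
            if (marker >= 0) {
                close(marker);
                result = 1;
            }
        }
    }

    flock(fd, LOCK_UN);
    close(fd);
    return result;
}

/* Hypothetical placeholder for the real HDFS-to-local copy. */
int fake_copy(const char *dir) {
    (void)dir;
    return 0;
}
```

If the process holding the lock is killed mid-copy, the kernel releases the flock and no .done marker exists, so the next waiter simply redoes the work.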

libevent2 and file io

I've been toying around with libevent2, and I've got reading files working, but it blocks. Is there any way to make file reading not block just within libevent? Or do I need to use another I/O library for files and make it pump the events that I need?
fd = open("/tmp/hello_world",O_RDONLY);
evbuffer_read(buf,fd,4096);
The O_NONBLOCK flag doesn't work either.
In POSIX, disks are considered "fast devices", meaning that they always block (which is why O_NONBLOCK didn't work for you). Only things like sockets, pipes, and terminals can be non-blocking.
There is POSIX AIO, but e.g. on Linux that comes with a bunch of restrictions making it unsuitable for general-purpose usage (only for O_DIRECT, I/O must be sector-aligned).
If you want to integrate normal POSIX I/O into an asynchronous event loop, it seems people resort to thread pools, where the blocking syscalls are executed in the background by one of the worker threads. One example of such a library is libeio.
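To show the shape of that approach, here is a small sketch with a single worker thread instead of a full pool. The blocking read() happens in the worker, which then writes one byte to a pipe; the read end of that pipe is an ordinary pollable descriptor. struct read_req, read_async_start and read_async_finish are names invented for this example and are not part of libevent or libeio.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* One outstanding file read (hypothetical request type). */
struct read_req {
    const char *path;
    char buf[4096];
    ssize_t nread;      /* bytes read, or -1 on failure */
    int notify[2];      /* pipe: worker writes [1], event loop watches [0] */
};

static void *read_worker(void *arg) {
    struct read_req *r = arg;
    int fd = open(r->path, O_RDONLY);
    if (fd >= 0) {
        r->nread = read(fd, r->buf, sizeof r->buf);  /* may block; fine here */
        close(fd);
    }
    char done = 1;
    if (write(r->notify[1], &done, 1) != 1) {
        /* nothing sensible to do in a sketch; the loop would time out */
    }
    return NULL;
}

/* Start the read in the background. Returns the fd to watch for
 * readability, or -1 on error. path and r must outlive the request. */
int read_async_start(struct read_req *r, const char *path) {
    r->path = path;
    r->nread = -1;
    if (pipe(r->notify) != 0)
        return -1;
    pthread_t t;
    if (pthread_create(&t, NULL, read_worker, r) != 0) {
        close(r->notify[0]);
        close(r->notify[1]);
        return -1;
    }
    pthread_detach(t);
    return r->notify[0];
}

/* Call once the notify fd is readable. Returns the byte count or -1. */
ssize_t read_async_finish(struct read_req *r) {
    char done;
    ssize_t n = read(r->notify[0], &done, 1);
    close(r->notify[0]);
    close(r->notify[1]);
    return n == 1 ? r->nread : -1;
}
```

With libevent2 the descriptor returned by read_async_start() would be registered like any socket, for example via event_new(base, fd, EV_READ, callback, arg), and the callback would call read_async_finish().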
No.
I've yet to see a *nix where you can do non-blocking I/O on regular files without resorting to the more specialized AIO library (though on some, e.g. Solaris, O_NONBLOCK has an effect if e.g. someone else holds a lock on the file).
Please take a look at libuv, which is used by node.js / io.js: https://github.com/libuv/libuv
It's a good alternative to libeio, because it does perform well on all major operating systems, from Windows to the BSDs, Mac OS X and of course Linux.
It supports I/O completion ports, which makes it a better choice than libeio if you are targeting Windows.
The C code is also very readable and I highly recommend this tutorial: https://nikhilm.github.io/uvbook/

file lock leases via NFS v4 in C

Does anybody know how to use the fancy file locking features of NFS v4 (described in e.g. About the NFS protocol (scroll down))? Supposedly NFS v4 supports file lock leasing with a 45 second lifetime. I would like to believe that the Linux kernel (I'm using Gentoo 2.6.30) happily takes care of these details, and I can use fcntl() and it all comes out in the wash. I am guessing, however, that I have to do something special somehow to get, maintain, and release the lock lease. All help appreciated.
You are right, fcntl takes care of all this business for you. The lease management is done by the NFS client (a kernel module in Linux).
