I'm using autobench to test performance of two different php scripts. Only one of them has some filesystem I/O.
The problem is that I can't run autobench on a different host from the webserver, so I fear my benchmarks could be wrong.
Does autobench (configured to open thousands of connection) interfere with filesystem I/O?
Yes. Sockets and open files are both represented by file descriptors, and the number of file descriptors is limited.
See the file-max entry in man proc for details.
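For what it's worth, a minimal sketch of checking the per-process descriptor limit, which open sockets and open files share (the system-wide limit is the file-max value mentioned above); this is only illustrative:

/* Sketch: print the per-process file-descriptor limit with getrlimit().
 * Open sockets and open files both count against this limit; the
 * system-wide limit lives in /proc/sys/fs/file-max. */
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
        printf("fd limit: soft=%llu hard=%llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
    return 0;
}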
It interferes, of course. The place where these two programs collide is the Linux kernel: the kernel can only service a limited number of system calls.
Direct I/O is the most performant way to copy larger files, so I wanted to add that ability to a program.
Windows offers FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING in the Win32's CreateFileA(). Linux, since 2.4.10, has the O_DIRECT flag for open().
Is there a way to achieve the same result portably within POSIX? Just as the Win32 API above works unchanged from Windows XP to Windows 11, it would be nice to do direct IO across all UNIX-like systems in one reliably portable way.
No, there is no POSIX standard for direct IO.
There are at least two different APIs and behaviors that exist as of January 2023. Linux, FreeBSD, and apparently IBM's AIX use an O_DIRECT flag to open(), while Oracle's Solaris uses a directio() function on an already-opened file descriptor.
The Linux use of the O_DIRECT flag to the POSIX open() function is documented on the Linux open() man page (https://man7.org/linux/man-pages/man2/open.2.html):
O_DIRECT (since Linux 2.4.10)
Try to minimize cache effects of the I/O to and from this
file. In general this will degrade performance, but it is
useful in special situations, such as when applications do
their own caching. File I/O is done directly to/from
user-space buffers. The O_DIRECT flag on its own makes an
effort to transfer data synchronously, but does not give
the guarantees of the O_SYNC flag that data and necessary
metadata are transferred. To guarantee synchronous I/O,
O_SYNC must be used in addition to O_DIRECT. See NOTES
below for further discussion.
Linux does not clearly specify how direct IO interacts with other descriptors open on the same file, or what happens when the file is mapped using mmap(); nor does it specify any alignment or size restrictions on direct IO read or write operations. In my experience, these are all filesystem-specific and have been improving/becoming less restrictive over time, but most Linux filesystems require page-aligned IO buffers, and many (most? all?) require (or used to require) page-sized reads or writes.
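As a rough illustration (mine, not from the man page), here is a minimal Linux/FreeBSD-style sketch that opens a file with O_DIRECT and uses a page-aligned, page-sized buffer to satisfy the usual restrictions; the path is just a placeholder:

/* Sketch: O_DIRECT read with a page-aligned, page-sized buffer.
 * _GNU_SOURCE is needed for O_DIRECT with glibc. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    long align = sysconf(_SC_PAGESIZE);
    void *buf;
    if (posix_memalign(&buf, align, align) != 0)   /* aligned buffer */
        return 1;

    int fd = open("/tmp/testfile", O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open(O_DIRECT)");   /* EINVAL often means "not supported here" */
        free(buf);
        return 1;
    }

    ssize_t n = read(fd, buf, align);   /* offset, buffer and size all aligned */
    printf("read %zd bytes\n", n);

    close(fd);
    free(buf);
    return 0;
}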
FreeBSD follows the Linux model: passing an O_DIRECT flag to open():
O_DIRECT may be used to minimize or eliminate the cache effects of
reading and writing. The system will attempt to avoid caching the data
you read or write. If it cannot avoid caching the data, it will
minimize the impact the data has on the cache. Use of this flag can
drastically reduce performance if not used with care.
OpenBSD does not support direct IO. There's no mention of direct IO in either the OpenBSD open() or the OpenBSD fcntl() man pages.
IBM's AIX appears to support a Linux-type O_DIRECT flag to open(), but actual published IBM AIX man pages don't seem to be generally available.
SGI's Irix also supported the Linux-style O_DIRECT flag to open():
O_DIRECT
If set, all reads and writes on the resulting file descriptor will
be performed directly to or from the user program buffer, provided
appropriate size and alignment restrictions are met. Refer to the
F_SETFL and F_DIOINFO commands in the fcntl(2) manual entry for
information about how to determine the alignment constraints.
O_DIRECT is a Silicon Graphics extension and is only supported on
local EFS and XFS file systems, and remote BDS file systems.
Of interest, the XFS file system on Linux originated with SGI's Irix.
Solaris uses a completely different interface: a dedicated directio() function that sets direct IO on a per-file basis for an already-opened file descriptor:
Description
The directio() function provides advice to the system about the
expected behavior of the application when accessing the data in the
file associated with the open file descriptor fildes. The system
uses this information to help optimize accesses to the file's data.
The directio() function has no effect on the semantics of the other
operations on the data, though it may affect the performance of other
operations.
The advice argument is kept per file; the last caller of directio()
sets the advice for all applications using the file associated with
fildes.
Values for advice are defined in <sys/fcntl.h>.
DIRECTIO_OFF
Applications get the default system behavior when accessing file data.
When an application reads data from a file, the data is first cached
in system memory and then copied into the application's buffer (see
read(2)). If the system detects that the application is reading
sequentially from a file, the system will asynchronously "read ahead"
from the file into system memory so the data is immediately available
for the next read(2) operation.
When an application writes data into a file, the data is first cached
in system memory and is written to the device at a later time (see
write(2)). When possible, the system increases the performance of
write(2) operations by cacheing the data in memory pages. The data
is copied into system memory and the write(2) operation returns
immediately to the application. The data is later written
asynchronously to the device. When possible, the cached data is
"clustered" into large chunks and written to the device in a single
write operation.
The system behavior for DIRECTIO_OFF can change without notice.
DIRECTIO_ON
The system behaves as though the application is not going to reuse the
file data in the near future. In other words, the file data is not
cached in the system's memory pages.
When possible, data is read or written directly between the
application's memory and the device when the data is accessed with
read(2) and write(2) operations. When such transfers are not
possible, the system switches back to the default behavior, but just
for that operation. In general, the transfer is possible when the
application's buffer is aligned on a two-byte (short) boundary, the
offset into the file is on a device sector boundary, and the size of
the operation is a multiple of device sectors.
This advisory is ignored while the file associated with fildes is
mapped (see mmap(2)).
The system behavior for DIRECTIO_ON can change without notice.
Notice also the behavior on Solaris is different: if direct IO is enabled on a file by any process, all processes accessing that file will do so via direct IO (Solaris 10+ has no alignment or size restrictions on direct IO, so switching between direct IO and "normal" IO won't break anything.*). And if a file is mapped via mmap(), direct IO on that file is disabled entirely.
* That's not quite true. If you're using a SAMFS or QFS filesystem in shared mode and you access data from the filesystem's active metadata controller (which by design must mount the filesystem with the Solaris forcedirectio mount option, so that all access on that one system in the cluster is done via direct IO), then disabling direct IO for a file with directio(fd, DIRECTIO_OFF) will corrupt the filesystem. Oracle's own top-end RAC database would do exactly that if you ran a database restore on the QFS metadata controller, and you'd wind up with a corrupt filesystem.
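For comparison, a minimal sketch (mine, not from the man page) of the Solaris side: directio() is called on an already-open descriptor, and the DIRECTIO_ON/DIRECTIO_OFF constants come from <sys/fcntl.h>; the helper name is invented:

/* Sketch, Solaris only: enable direct IO on an already-open descriptor. */
#include <sys/types.h>
#include <sys/fcntl.h>
#include <fcntl.h>
#include <stdio.h>

int open_direct_solaris(const char *path)   /* illustrative helper name */
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    /* Advisory and per file: the last caller of directio() wins for
       every application using this file. */
    if (directio(fd, DIRECTIO_ON) != 0)
        perror("directio");   /* e.g. the filesystem does not support it */
    return fd;
}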
The short answer is no.
IEEE 1003.1-2017 (the current POSIX standard, AFAIK) doesn't mention any directives for direct I/O like O_DIRECT. That being said, a cursory glance tells me that GNU/Linux and FreeBSD support the O_DIRECT flag, while OpenBSD doesn't.
Beyond that, it appears that not all filesystems support O_DIRECT so even on a GNU/Linux system where you know your implementation of open() will recognize that directive, there's still no guarantee that you can use it.
At the end of the day, the only way I can see to get portable direct I/O is a runtime check for whether the platform your program is running on supports it; you could do compile-time checks, but I don't recommend it, since filesystems can change and your destination may not be on the OS drive. You might get super lucky and find a project out there that has already started doing this, but I kind of doubt it exists.
My recommendation for you is to start by writing your program to check for direct I/O support for your platform and act accordingly, adding checks and support for kernels and file systems you know your program will run on.
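Something along these lines is what I mean by a runtime check; it's only a sketch, the helper name is made up, and relying on EINVAL is how Linux typically reports an unsupported O_DIRECT:

/* Sketch: try O_DIRECT first (where the macro exists), fall back to a
 * normal buffered open() if the kernel or filesystem rejects it. */
#define _GNU_SOURCE   /* exposes O_DIRECT with glibc */
#include <fcntl.h>
#include <errno.h>

int open_maybe_direct(const char *path, int flags, int *used_direct)
{
    *used_direct = 0;
#ifdef O_DIRECT
    int fd = open(path, flags | O_DIRECT);
    if (fd >= 0) {
        *used_direct = 1;
        return fd;
    }
    if (errno != EINVAL)           /* EINVAL: this filesystem refused direct IO */
        return -1;
#endif
    return open(path, flags);      /* buffered fallback */
}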
Wish I could be more help,
--K
I'm coding a server process that can be accessed by multiple client processes at the same time. Server process might need to create some files into a directory depending on the client.
As there can be many clients connected at the same time, I obviously have a dedicated server (which is a thread) for each one, so my question is: do I need to add mutex handling (e.g. pthread_mutex_lock / pthread_mutex_unlock) when accessing the directory? (I can guarantee the same file won't be modified or created more than once, so my question is just about accessing the directory.)
It is the operating system's responsibility to control access to shared resources. It would be a pretty poor OS that could not handle multiple files open simultaneously.
The only time you would perhaps need to consider that is when you are implementing the filesystem itself on a bare-metal system lacking an OS, or on an OS lacking an intrinsic filesystem of its own, which is pretty much restricted to embedded RTOS kernels, or when you are writing the OS itself.
Accessing the same file concurrently may be a different matter. It is usually necessary to explicitly request/permit shared access of a file, but not a directory.
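To make that concrete, a hedged sketch of what each client thread can do without any directory mutex; the directory path and naming scheme are invented for illustration, and the kernel serializes the directory updates itself:

/* Sketch: each client thread creates its own file in the shared directory. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

static void *client_thread(void *arg)
{
    int client_id = *(int *)arg;
    char path[256];
    snprintf(path, sizeof(path), "/srv/client_dir/client-%d.dat", client_id);

    /* O_EXCL makes accidental double-creation an error instead of a race. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd >= 0) {
        dprintf(fd, "hello from client %d\n", client_id);
        close(fd);
    }
    return NULL;
}

int main(void)
{
    pthread_t tids[4];
    int ids[4];
    for (int i = 0; i < 4; i++) {
        ids[i] = i;
        pthread_create(&tids[i], NULL, client_thread, &ids[i]);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(tids[i], NULL);
    return 0;
}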
I would like to know whether the open() system call in the latest Linux kernel can block if the filesystem is mounted as a remote device, for example a CEPH filesystem or NFS, and there is a network failure of some sort?
Yes. How long depends on the speed (and state) of the uplink, but your process or thread will block until the remote operation finishes. NFS is a bit notorious for this, and some FUSE file systems handle the blocking for whatever has the file handle, but you will block on open(), read() and write(), often at the mercy of the network and the other system.
Don't use O_NONBLOCK to get around it, or you're potentially reading from or writing to a black hole (which would just block anyway).
Yes, an open() call can block when trying to open a file on a remote file system if there is a network failure of some sort.
Depending on how the remote file system is mounted, it may just take a long time (multiple seconds) to determine that the remote file system is unavailable and return unsuccessfully after what seems like an inordinate amount of time, or it may simply lock up indefinitely until the remote resource becomes available once more (or until the mapping is removed from the system).
I need to update a log file according to the messages produced by two different modules, which may be running simultaneously.
So is it possible to open and write a file simultaneously in two programs?
Sys Spec: SLES 11 x86_64.
You can do one of the following:
Use flock() (or a similar mechanism) to synchronize the writes on the open file descriptors (as already answered).
Use open() and close() (or similar) repeatedly on systems that support (or even enforce) exclusive open().
Depend on buffered output to send out log lines uninterrupted. This is often used with stderr logging, as a possible race condition isn't usually a problem here.
Use a logging service and only open() the file there. Other processes communicate with the logging service via IPC. You can use a custom logging service or a tool like syslog or journald. Both of them AFAIK support logging from non-root processes as well.
I would personally prefer the last option because its design is the cleanest one and it doesn't depend so much on OS-specific behavior. If your application consists of multiple processes started by the main process, then the main process may perform as the logging service as well and create pipes before spawning the child processes. If the processes are started separately, you can have a separate service that listens on a TCP/IP socket or (if your system supports it) a local domain socket.
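A minimal sketch of that last option using syslog(3); the identifier string and facility are just examples, and each module simply calls syslog() while the logging daemon serializes the writes to the file:

/* Sketch: option 4, let syslogd/journald do the file handling. */
#include <syslog.h>

int main(void)
{
    openlog("my-module", LOG_PID, LOG_DAEMON);   /* "my-module" is illustrative */
    syslog(LOG_INFO, "module started");
    syslog(LOG_WARNING, "something looks odd: %d", 42);
    closelog();
    return 0;
}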
Yes. A file can be opened by several processes/programs simultaneously. Multiple processes/programs can read and write a file at the same time, but the end result of writing to the same file at the same time may be undefined, so it is better to use locks.
On Linux you can use flock().
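A minimal sketch of the flock() approach; the log path and message are illustrative, and each writer opens the file with O_APPEND and takes an exclusive lock around the write:

/* Sketch: flock()-based serialization of log writes from two processes. */
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

static int log_line(const char *path, const char *line)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) == 0) {           /* blocks until the lock is free */
        write(fd, line, strlen(line));
        flock(fd, LOCK_UN);
    }
    close(fd);
    return 0;
}

int main(void)
{
    return log_line("/tmp/mymodule.log", "module A: event happened\n");
}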
I've been toying around with libevent2, and I've got reading files working, but it blocks. Is there any way to make file reading non-blocking within libevent alone? Or do I need to use another IO library for files and have it pump the events I need?
fd = open("/tmp/hello_world",O_RDONLY);
evbuffer_read(buf,fd,4096);
The O_NONBLOCK flag doesn't work either.
In POSIX, disks are considered "fast devices", meaning that they always block (which is why O_NONBLOCK didn't work for you). Only network sockets and other "slow" devices can be non-blocking.
There is POSIX AIO, but e.g. on Linux that comes with a bunch of restrictions making it unsuitable for general-purpose usage (only for O_DIRECT, I/O must be sector-aligned).
If you want to integrate normal POSIX IO into an asynchronous event loop, it seems people resort to thread pools, where the blocking syscalls are executed in the background by one of the worker threads. One example of such a library is libeio.
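If you want to stay with libevent itself, one common pattern (sketched here with invented names, not a drop-in solution) is to do the blocking open()/read() on a worker thread and wake the event loop through a pipe:

/* Sketch: run the blocking read on a worker thread, notify libevent via a pipe. */
#include <event2/event.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

static int notify_pipe[2];          /* worker writes, event loop reads */

struct file_result {
    char    data[4096];
    ssize_t len;
};

/* Worker thread: blocking open()/read(), then hand the result to the loop. */
static void *reader_thread(void *arg)
{
    const char *path = arg;
    struct file_result *res = calloc(1, sizeof(*res));
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        res->len = read(fd, res->data, sizeof(res->data));
        close(fd);
    } else {
        res->len = -1;
    }
    write(notify_pipe[1], &res, sizeof(res));   /* wakes the event loop */
    return NULL;
}

/* Runs on the event loop once the worker is done. */
static void on_file_ready(evutil_socket_t fd, short what, void *ctx)
{
    struct event_base *base = ctx;
    struct file_result *res;
    if (read(fd, &res, sizeof(res)) == sizeof(res)) {
        printf("read %zd bytes\n", res->len);
        free(res);
    }
    event_base_loopbreak(base);
}

int main(void)
{
    struct event_base *base = event_base_new();
    pthread_t tid;

    pipe(notify_pipe);
    struct event *ev = event_new(base, notify_pipe[0], EV_READ,
                                 on_file_ready, base);
    event_add(ev, NULL);

    pthread_create(&tid, NULL, reader_thread, "/tmp/hello_world");
    event_base_dispatch(base);        /* never blocks on the file itself */

    pthread_join(tid, NULL);
    event_free(ev);
    event_base_free(base);
    return 0;
}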
No.
I've yet to see a *nix where you can do non-blocking I/O on regular files without resorting to the more specialized AIO libraries (though on some systems, e.g. Solaris, O_NONBLOCK has an effect if someone else holds a lock on the file).
Please take a look at libuv, which is used by node.js / io.js: https://github.com/libuv/libuv
It's a good alternative to libeio, because it does perform well on all major operating systems, from Windows to the BSDs, Mac OS X and of course Linux.
It supports I/O completion ports, which makes it a better choice than libeio if you are targeting Windows.
The C code is also very readable and I highly recommend this tutorial: https://nikhilm.github.io/uvbook/
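For reference, a short sketch in the style of the uvbook example, reading the same /tmp/hello_world through libuv's filesystem API (which runs the blocking calls on its thread pool); error handling is kept minimal:

/* Sketch: asynchronous file read with libuv instead of evbuffer_read(). */
#include <uv.h>
#include <fcntl.h>
#include <stdio.h>

static uv_fs_t open_req, read_req, close_req;
static char buffer[4096];
static uv_buf_t iov;

static void on_read(uv_fs_t *req)
{
    if (req->result > 0)
        printf("read %zd bytes\n", (ssize_t)req->result);
    uv_fs_close(uv_default_loop(), &close_req, (uv_file)open_req.result, NULL);
}

static void on_open(uv_fs_t *req)
{
    if (req->result >= 0) {                       /* result is the descriptor */
        iov = uv_buf_init(buffer, sizeof(buffer));
        uv_fs_read(uv_default_loop(), &read_req, (uv_file)req->result,
                   &iov, 1, -1, on_read);
    } else {
        fprintf(stderr, "open error: %s\n", uv_strerror((int)req->result));
    }
}

int main(void)
{
    uv_fs_open(uv_default_loop(), &open_req, "/tmp/hello_world",
               O_RDONLY, 0, on_open);
    uv_run(uv_default_loop(), UV_RUN_DEFAULT);

    uv_fs_req_cleanup(&open_req);
    uv_fs_req_cleanup(&read_req);
    uv_fs_req_cleanup(&close_req);
    return 0;
}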