Mandatory file locking in Mac OS X - c

According to the man pages, the following approaches support only advisory locking: flock, lockf and fcntl. Is there any way for a single process to place a mandatory lock on a file, for example a write lock, so that other processes will not be able to open the file with write permissions?

No. Operating systems in the Unix family do not generally support mandatory file locking[1]. This includes Linux, BSD, and OS X.
On some Unixes, you are prevented from opening files for writing if they are executable images that are currently running; open() will fail with ETXTBSY. However, you can always just unlink (delete) the file and create a new one instead, and nothing will prevent that.
Footnotes
1: This is not entirely true, but mandatory file locks require a bit of work, mandatory locks are platform-specific, and OS X has no support for them.
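To make "advisory" concrete, here is a minimal sketch (the function names are made up for illustration). It takes an exclusive flock() through one descriptor, checks that a second descriptor is refused the lock, and then writes through that second descriptor anyway. It assumes a local filesystem on which flock() works.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive advisory lock without blocking.
 * Returns 0 on success, -1 if the lock is held elsewhere
 * (errno == EWOULDBLOCK) or on any other error. */
int try_exclusive_lock(int fd)
{
    return flock(fd, LOCK_EX | LOCK_NB);
}

/* Shows that the lock is only advisory: a descriptor that cannot get
 * the lock can still write to the file.
 * Returns 1 if the write went through despite the lock, 0 if not,
 * and -1 if the demonstration could not be set up. */
int advisory_lock_demo(const char *path)
{
    int holder = open(path, O_RDWR | O_CREAT, 0644);
    int other = open(path, O_RDWR);
    int result = -1;

    if (holder >= 0 && other >= 0 && try_exclusive_lock(holder) == 0) {
        int denied = (try_exclusive_lock(other) == -1 && errno == EWOULDBLOCK);
        ssize_t n = write(other, "x", 1);   /* nothing checks the lock here */
        result = (denied && n == 1) ? 1 : 0;
    }
    if (other >= 0)
        close(other);
    if (holder >= 0)
        close(holder);
    return result;
}
```

The lock only protects against writers that also call flock() before touching the file.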

Related

POSIX way to do O_DIRECT?

Direct I/O is the most performant way to copy larger files, so I wanted to add that ability to a program.
Windows offers FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING in Win32's CreateFileA(). Linux, since 2.4.10, has the O_DIRECT flag for open().
Is there a way to achieve the same result portably within POSIX? Like how the Win32 API here works from Windows XP to Windows 11, it would be nice to do direct IO across all UNIX-like systems in one reliably portable way.
No, there is no POSIX standard for direct IO.
There are at least two different APIs and behaviors that exist as of January 2023. Linux, FreeBSD, and apparently IBM's AIX use an O_DIRECT flag to open(), while Oracle's Solaris uses a directio() function on an already-opened file descriptor.
The Linux use of the O_DIRECT flag to the POSIX open() function is documented on the Linux open() man page (https://man7.org/linux/man-pages/man2/open.2.html):
O_DIRECT (since Linux 2.4.10)
Try to minimize cache effects of the I/O to and from this
file. In general this will degrade performance, but it is
useful in special situations, such as when applications do
their own caching. File I/O is done directly to/from
user-space buffers. The O_DIRECT flag on its own makes an
effort to transfer data synchronously, but does not give
the guarantees of the O_SYNC flag that data and necessary
metadata are transferred. To guarantee synchronous I/O,
O_SYNC must be used in addition to O_DIRECT. See NOTES
below for further discussion.
Linux does not clearly specify how direct IO interacts with other descriptors open on the same file, or what happens when the file is mapped using mmap(); nor any alignment or size restrictions on direct IO read or write operations. In my experience, these are all file-system specific and have been improving/becoming less restrictive over time, but most Linux filesystems require page-aligned IO buffers, and many (most? all?) (did? still do?) require page-sized reads or writes.
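Given the alignment caveats above, one conservative approach is to allocate page-aligned buffers whose size is a multiple of the page size. The helper below is only a sketch under that assumption (the name alloc_direct_io_buffer is hypothetical); a particular filesystem may accept less, or demand something else.

```c
#include <stdlib.h>
#include <unistd.h>

/* Allocate a buffer for direct I/O under the conservative assumption
 * that both the address and the length must be multiples of the page
 * size. *size is rounded up in place. Returns NULL on failure;
 * release the buffer with free(). */
void *alloc_direct_io_buffer(size_t *size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    if (page <= 0 || size == NULL || *size == 0)
        return NULL;

    size_t align = (size_t)page;
    size_t rounded = (*size + align - 1) / align * align;

    if (posix_memalign(&buf, align, rounded) != 0)
        return NULL;

    *size = rounded;
    return buf;
}
```

Reads and writes on the O_DIRECT descriptor would then use this buffer with lengths and file offsets that are also multiples of the page size.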
FreeBSD follows the Linux model: passing an O_DIRECT flag to open():
O_DIRECT may be used to minimize or eliminate the cache effects
of reading and writing. The system will attempt to avoid caching the
data you read or write. If it cannot avoid caching the data, it will
minimize the impact the data has on the cache. Use of this flag can
drastically reduce performance if not used with care.
OpenBSD does not support direct IO. There's no mention of direct IO in either the OpenBSD open() or the OpenBSD fcntl() man pages.
IBM's AIX appears to support a Linux-type O_DIRECT flag to open(), but actual published IBM AIX man pages don't seem to be generally available.
SGI's Irix also supported the Linux-style O_DIRECT flag to open():
O_DIRECT
If set, all reads and writes on the resulting file descriptor will
be performed directly to or from the user program buffer, provided
appropriate size and alignment restrictions are met. Refer to the
F_SETFL and F_DIOINFO commands in the fcntl(2) manual entry for
information about how to determine the alignment constraints.
O_DIRECT is a Silicon Graphics extension and is only supported on
local EFS and XFS file systems, and remote BDS file systems.
Of interest, the XFS file system on Linux originated with SGI's Irix.
Solaris uses a completely different interface. Solaris uses a specific directio() function to set direct IO on a per-file basis:
Description
The directio() function provides advice to the system about the
expected behavior of the application when accessing the data in the
file associated with the open file descriptor fildes. The system
uses this information to help optimize accesses to the file's data.
The directio() function has no effect on the semantics of the other
operations on the data, though it may affect the performance of other
operations.
The advice argument is kept per file; the last caller of directio()
sets the advice for all applications using the file associated with
fildes.
Values for advice are defined in <sys/fcntl.h>.
DIRECTIO_OFF
Applications get the default system behavior when accessing file data.
When an application reads data from a file, the data is first cached
in system memory and then copied into the application's buffer (see
read(2)). If the system detects that the application is reading
sequentially from a file, the system will asynchronously "read ahead"
from the file into system memory so the data is immediately available
for the next read(2) operation.
When an application writes data into a file, the data is first cached
in system memory and is written to the device at a later time (see
write(2)). When possible, the system increases the performance of
write(2) operations by cacheing the data in memory pages. The data
is copied into system memory and the write(2) operation returns
immediately to the application. The data is later written
asynchronously to the device. When possible, the cached data is
"clustered" into large chunks and written to the device in a single
write operation.
The system behavior for DIRECTIO_OFF can change without notice.
DIRECTIO_ON
The system behaves as though the application is not going to reuse the
file data in the near future. In other words, the file data is not
cached in the system's memory pages.
When possible, data is read or written directly between the
application's memory and the device when the data is accessed with
read(2) and write(2) operations. When such transfers are not
possible, the system switches back to the default behavior, but just
for that operation. In general, the transfer is possible when the
application's buffer is aligned on a two-byte (short) boundary, the
offset into the file is on a device sector boundary, and the size of
the operation is a multiple of device sectors.
This advisory is ignored while the file associated with fildes is
mapped (see mmap(2)).
The system behavior for DIRECTIO_ON can change without notice.
Notice also the behavior on Solaris is different: if direct IO is enabled on a file by any process, all processes accessing that file will do so via direct IO (Solaris 10+ has no alignment or size restrictions on direct IO, so switching between direct IO and "normal" IO won't break anything.*). And if a file is mapped via mmap(), direct IO on that file is disabled entirely.
* - That's not quite true. If you're using a SAMFS or QFS (https://en.wikipedia.org/wiki/QFS) filesystem in shared mode and access data from the filesystem's active metadata controller (where, by design, the filesystem must be mounted with the Solaris forcedirectio mount option so that all access on that one system in the cluster is done via direct IO), then disabling direct IO for a file using directio( fd, DIRECTIO_OFF ) will corrupt the filesystem. Oracle's own top-end RAC database would do that if you did a database restore on the QFS metadata controller, and you'd wind up with a corrupt filesystem.
The short answer is no.
IEEE 1003.1-2017 (the current POSIX standard afaik) doesn't mention any directives for direct I/O like O_DIRECT. That being said, a cursory glance tells me that GNU/Linux and FreeBSD support the O_DIRECT flag, while OpenBSD doesn't.
Beyond that, it appears that not all filesystems support O_DIRECT so even on a GNU/Linux system where you know your implementation of open() will recognize that directive, there's still no guarantee that you can use it.
At the end of the day, the only way I can see portable, direct I/O is runtime checks for whether or not the platform your program is running on supports it; you could do compile time checks, but I don't recommend it since filesystems can change, or your destination may not be on the OS drive. You might get super lucky and find a project out there that's already started to do this, but I kind of doubt it exists.
My recommendation for you is to start by writing your program to check for direct I/O support for your platform and act accordingly, adding checks and support for kernels and file systems you know your program will run on.
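A runtime check along those lines could look roughly like the sketch below. The function name open_prefer_direct is invented for illustration, and the Solaris branch relies on the directio() interface quoted earlier and is untested here. On Linux, open() is documented to fail with EINVAL when the filesystem does not support O_DIRECT, so that error is treated as "fall back to buffered I/O".

```c
#define _GNU_SOURCE   /* glibc exposes O_DIRECT only with this defined */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#if defined(__sun)
#include <sys/types.h>
#include <sys/fcntl.h>   /* directio(), DIRECTIO_ON */
#endif

/* Open `path` and request direct I/O if the platform and filesystem
 * allow it. `flags` must not contain O_CREAT (no mode is passed).
 * *is_direct is set to 1 when the request was accepted and to 0 when
 * the function fell back to ordinary buffered I/O.
 * Returns the descriptor, or -1 with errno set. */
int open_prefer_direct(const char *path, int flags, int *is_direct)
{
    int fd;

    *is_direct = 0;
#if defined(O_DIRECT)
    fd = open(path, flags | O_DIRECT);
    if (fd >= 0) {
        *is_direct = 1;
        return fd;
    }
    if (errno != EINVAL)   /* EINVAL: the filesystem rejects O_DIRECT */
        return -1;
#endif
    fd = open(path, flags);
#if defined(__sun)
    if (fd >= 0 && directio(fd, DIRECTIO_ON) == 0)
        *is_direct = 1;
#endif
    return fd;
}
```

The caller still has to respect whatever alignment rules apply when *is_direct comes back as 1.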
Wish I could be more help,
--K

How can I serialize access to a directory in Linux?

Let's say 4 simultaneous processes are running on a processor, and data needs to be copied from an HDFS (used with Spark) file system to a local directory. Now I want only one process to copy that data, while the other processes just wait for that data to be copied by the first process.
So, basically, I want some kind of a semaphore mechanism, where every process tries to obtain the semaphore in order to copy the data, but only one process gets it. All processes that failed to acquire the semaphore would then just wait for the semaphore to be cleared (the process that acquired the semaphore would clear it once it's done copying), and when it's cleared they know the data has already been copied. How can I do that in Linux?
There are many different ways to implement semaphores. The classical System V way is described in man semop; POSIX semaphores are covered more broadly in man sem_overview.
You might still want to do something more easily scalable and modern. Many IPC frameworks (Apache has one or two of those, too!) have atomic IPC operations. These can be used to implement semaphores, but I'd be very very careful.
Generally, I regularly encourage people who write multi-process or multi-threaded applications to use C++ instead of C. It's often simpler to see where a shared state must be protected if your state is nicely encapsulated in an object which might do its own locking. Hence, I urge you to have a look at Boost's IPC synchronization mechanisms.
In addition to Marcus Müller's answer, you could use some file locking mechanism to synchronize.
File locking might not work very well on networked or remote file systems. You should use it on a locally mounted file system (e.g. Ext4, BTRFS, ...) not on a remote one (e.g. NFS)
For example, you might adopt the convention that your directory contains (or else you'll create it) some .lock file and use an advisory lock flock(2) (or a POSIX lockf(3)) on that .lock file before accessing the directory.
If using flock, you could even lock the directory directly....
The advantage of using such a file lock approach is that you could code shell scripts using flock(1)
And on Linux, you might also use inotify(7) (e.g. to be notified when some file is created in that directory)
Notice that most solutions are (advisory, so) presupposing that every process accessing that directory is following some convention (in other words, without more precautions like using flock(1), a careless user could access that directory - e.g. with a plain cp command -, or files under it, while your locking process is accessing the directory). If you don't accept that, you might look for mandatory file locking (which is a feature of some Linux kernels & filesystems, AFAIK it is sort-of deprecated).
BTW, you might read more about ACID properties and consider using some database, etc...
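The .lock file convention could be sketched as follows. The function names and the .done marker file are assumptions made up for this example. Every process calls begin_populate(); the first one gets 1 and does the copy, while the others block in flock() until it has finished and then get 0.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Build "<dir>/<name>" into out; returns -1 if it does not fit. */
static int marker_path(char *out, size_t n, const char *dir, const char *name)
{
    int len = snprintf(out, n, "%s/%s", dir, name);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Returns 1 if the caller must copy the data (lock held, *lock_fd set),
 * 0 if the data is already in place, -1 on error. */
int begin_populate(const char *dir, int *lock_fd)
{
    char path[4096];
    struct stat st;

    if (marker_path(path, sizeof path, dir, ".lock") != 0)
        return -1;
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0 ||            /* blocks while another process copies */
        marker_path(path, sizeof path, dir, ".done") != 0) {
        close(fd);
        return -1;
    }
    if (stat(path, &st) == 0) {               /* copy already finished */
        close(fd);                            /* closing drops the lock */
        return 0;
    }
    *lock_fd = fd;
    return 1;
}

/* Call after copying. On success the .done marker is created before
 * the lock is released, so waiters see a complete directory. */
int end_populate(const char *dir, int lock_fd, int success)
{
    char path[4096];
    int rc = 0;

    if (success) {
        rc = -1;
        if (marker_path(path, sizeof path, dir, ".done") == 0) {
            int fd = open(path, O_WRONLY | O_CREAT, 0644);
            if (fd >= 0) {
                close(fd);
                rc = 0;
            }
        }
    }
    close(lock_fd);                           /* releases the lock for the waiters */
    return rc;
}
```

If the copying process dies, the kernel drops its lock and the next waiter finds no .done marker, so it takes over the copy.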

Exclusively open a device file in Linux

What ways are there available, for exclusively opening a device file (say, the display frame buffer)?
[Info: I already know about flock() & friends, which have an effect only when the other applications are also using it (in other words: open() will succeed but flock() will fail if already locked) --> but still the device handle retrieved from open() can be used to write to the display..]
What about cases when I want to enforce such an exclusive access on a device files? How would such an enforcement be possible?
From fcntl(2):
To make use of mandatory locks, mandatory locking must be enabled
both on the filesystem that contains the file to be locked, and on
the file itself.
...also, you need to enable CONFIG_MANDATORY_FILE_LOCKING in the kernel.
Mandatory locking is enabled on a filesystem using
the "-o mand" option to mount(8), or the MS_MANDLOCK flag for
mount(2). Mandatory locking is enabled on a file by disabling group
execute permission on the file and enabling the set-group-ID
permission bit (see chmod(1) and chmod(2)).
Mandatory locking is not specified by POSIX. Some other systems also
support mandatory locking, although the details of how to enable it
vary across systems.
So, as you request a POSIX-compliant solution, the answer is: no, there is no such feature in the POSIX standard.
Try lockf(): apply, test or remove a POSIX lock on an open file.
To open a device, use the open() system call on one of the available device nodes, for example /dev/ttyUSB0 or /dev/ttyS0. It returns a descriptor that you can use to read from and write to the device.
For further details, follow the link:
http://www.firmcodes.com/lower-level-file-handling-in-linux/
If you want to get exclusive access to a device, create a lock file in /var/lock. The process that can create the lock file with open("my_device.lock", O_CREAT|O_EXCL, 0777) gets access to the device; the other processes have to wait. After the process is done using the device, it removes the lock file.
Such a lock is only advisory and doesn't guarantee that no other process (that you are not aware of) accesses the device.
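A minimal sketch of that lock-file convention, with hypothetical helper names and a 0644 mode instead of the 0777 above:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if this process created the lock file and therefore owns
 * the device by convention, -1 otherwise (errno == EEXIST means that
 * another process currently holds it). */
int acquire_lock_file(const char *lock_path)
{
    int fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);   /* the file's existence is the lock, not the descriptor */
    return 0;
}

/* Give the device up again by deleting the lock file. */
int release_lock_file(const char *lock_path)
{
    return unlink(lock_path);
}
```

A process that crashes leaves a stale lock file behind, which is the usual weakness of this scheme.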

file locking C programming

Hello everyone, I am writing a program that works with files. I know how to read and write a file, but can anyone help me with file read/write locks in C? In particular, how do I take a lock and how do I release it, especially when forking? Please give a small example or a tutorial, as I didn't find anything about file locks in C.
Thanks
File locking is not part of C, but is dependent on the operating system. Since you talk about forking I assume you are using UNIX or a UNIX-like system (e.g. Linux or BSD).
In that case you can use the flock or lockf functions. flock locks are preserved on forking, which means that multiple processes can hold an exclusive lock on the same file if the lock was acquired in the parent process before the fork. Locks taken with lockf (POSIX record locks) are not inherited by the child.
On Windows it can be specified in the CreateFile call, or later with the LockFile or LockFileEx functions.
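As a small example of taking and releasing a lock around fork(), the sketch below uses a POSIX record lock set with fcntl(). The function name is made up. Because POSIX record locks belong to the process that set them, the child is expected to find the parent's lock in its way even though it uses the very same descriptor.

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a forked child is refused the lock that the parent
 * holds (the lock was not inherited), 0 if the child got it, and -1
 * on setup errors. */
int child_blocked_by_parent_lock(const char *path)
{
    struct flock fl = {0};
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    fl.l_type = F_WRLCK;      /* exclusive (write) lock           */
    fl.l_whence = SEEK_SET;   /* starting at offset 0             */
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "to the end of the file" */
    if (fcntl(fd, F_SETLK, &fl) == -1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: same descriptor, but the lock belongs to the parent. */
        int denied = (fcntl(fd, F_SETLK, &fl) == -1);
        _exit(denied ? 1 : 0);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    fl.l_type = F_UNLCK;      /* release the lock explicitly */
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With flock() instead, parent and child would share one lock through the inherited descriptor.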

fcntl, lockf, which is better to use for file locking?

Looking for information regarding the advantages and disadvantages of both fcntl and lockf for file locking. For example which is better to use for portability? I am currently coding a linux daemon and wondering which is better suited to use for enforcing mutual exclusion.
What is the difference between lockf and fcntl:
On many systems, the lockf() library routine is just a wrapper around fcntl(). That is to say lockf offers a subset of the functionality that fcntl does.
Source
But on some systems, fcntl and lockf locks are completely independent.
Source
Since it is implementation dependent, make sure to always use the same convention. So either always use lockf from both your processes or always use fcntl. There is a good chance that they will be interchangeable, but it's safer to use the same one.
Which one you choose doesn't matter.
Some notes on mandatory vs advisory locks:
Locking in unix/linux is by default advisory, meaning other processes don't need to follow the locking rules that are set. So it doesn't matter which way you lock, as long as your co-operating processes also use the same convention.
Linux does support mandatory locking, but only if your file system is mounted with the option on and the file special attributes set. You can use mount -o mand to mount the file system and set the file attributes g-x,g+s to enable mandatory locks, then use fcntl or lockf. For more information on how mandatory locks work see here.
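The per-file half of that setup (g-x,g+s followed by an ordinary fcntl lock) might look like the sketch below. The function name is hypothetical. Whether the kernel then enforces the lock depends on the mount option and on kernel support; as far as I know, mandatory locking was removed from Linux in 5.15, in which case the lock taken here stays advisory.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Mark an existing file for mandatory locking (set-group-ID on, group
 * execute off) and take a whole-file write lock on it.
 * Returns the locked descriptor, or -1 on error. */
int lock_with_mandatory_marker(const char *path)
{
    struct stat st;
    struct flock fl = {0};
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Same effect as: chmod g+s,g-x path */
    if (fstat(fd, &st) == -1 ||
        fchmod(fd, (st.st_mode & 07777 & ~(mode_t)S_IXGRP) | S_ISGID) == -1) {
        close(fd);
        return -1;
    }

    fl.l_type = F_WRLCK;      /* the same call as for an advisory lock */
    fl.l_whence = SEEK_SET;   /* l_start = 0, l_len = 0: whole file    */
    if (fcntl(fd, F_SETLK, &fl) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```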
Note that locks are applied not to the individual file, but to the inode. This means that 2 filenames that point to the same file data will share the same lock status.
In Windows, on the other hand, you can open a file exclusively, and that blocks other processes from opening it at all, even if they want to. That is, the locks are mandatory. The same goes for Windows file locks: any process with an open file handle with appropriate access can lock a portion of the file, and no other process will be able to access that portion.
How mandatory locks work in Linux:
Concerning mandatory locks, if a process locks a region of a file with a read lock, then other processes are permitted to read but not write to that region. If a process locks a region of a file with a write lock, then other processes are permitted neither to read nor to write to that region. What happens when a process is not permitted to access that part of the file depends on whether O_NONBLOCK was specified. Without O_NONBLOCK, the operation blocks until the lock is released. With O_NONBLOCK, it fails with the error code EAGAIN.
NFS warning:
Be careful if you are using locking commands on an NFS mount. The behavior is undefined and the implementation widely varies whether to use a local lock only or to support remote locking.
Both interfaces are part of the POSIX standard, and nowadays both interfaces are available on most systems (I just checked Linux, FreeBSD, Mac OS X, and Solaris). Therefore, choose the one that better fits your requirements and use it.
One word of caution: it is unspecified what happens when one process locks a file using fcntl and another using lockf. In most systems these are equivalent operations (in fact under Linux lockf is implemented on top of fcntl), but POSIX says their interaction is unspecified. So, if you are interoperating with another process that uses one of the two interfaces, choose the same one.
Others have written that the locks are only advisory: you are responsible for checking whether a region is locked. Also, don't use stdio functions if you want to use the locking functionality.
Your main concerns, in this case (i.e. when "coding a Linux daemon and wondering which is better suited to use for enforcing mutual exclusion"), should be:
will the locked file be local or can it be on NFS?
e.g. can the user trick you into creating and locking your daemon's pid file on NFS?
how will the lock behave when forking, or when the daemon process is terminated with extreme prejudice e.g. kill -9?
The flock and fcntl commands behave differently in both cases.
My recommendation would be to use fcntl. You may refer to the File locking article on Wikipedia for an in-depth discussion of the problems involved with both solutions:
Both flock and fcntl have quirks which
occasionally puzzle programmers from
other operating systems. Whether flock
locks work on network filesystems,
such as NFS, is implementation
dependent. On BSD systems flock calls
are successful no-ops. On Linux prior
to 2.6.12 flock calls on NFS files
would only act locally. Kernel 2.6.12
and above implement flock calls on NFS
files using POSIX byte range locks.
These locks will be visible to other
NFS clients that implement
fcntl()/POSIX locks.[1] Lock upgrades
and downgrades release the old lock
before applying the new lock. If an
application downgrades an exclusive
lock to a shared lock while another
application is blocked waiting for an
exclusive lock, the latter application
will get the exclusive lock and the
first application will be locked out.
All fcntl locks associated with a file
for a given process are removed when
any file descriptor for that file is
closed by that process, even if a lock
was never requested for that file
descriptor. Also, fcntl locks are not
inherited by a child process. The
fcntl close semantics are particularly
troublesome for applications which
call subroutine libraries that may
access files.
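The close semantics described in that quote can be checked with a small experiment. This is a sketch with made-up function names: it takes a lock through one descriptor, opens and closes a second descriptor for the same file (as a library routine might), and then lets a child process probe whether the lock is still there.

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set or test-and-set a lock covering the whole file. */
static int set_whole_file_lock(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;   /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 1 if the lock taken through fd1 is gone after an unrelated
 * descriptor for the same file was closed, 0 if it survived, and -1
 * on setup errors. */
int lock_lost_on_unrelated_close(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT, 0644);
    if (fd1 < 0 || set_whole_file_lock(fd1, F_WRLCK) == -1) {
        if (fd1 >= 0)
            close(fd1);
        return -1;
    }

    /* What a library routine might do behind the application's back. */
    int fd2 = open(path, O_RDONLY);
    if (fd2 >= 0)
        close(fd2);           /* drops this process's locks on the file */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd1);
        return -1;
    }
    if (pid == 0) {
        /* Child: a different process, so it can only lock a free file. */
        int cfd = open(path, O_RDWR);
        int got = (cfd >= 0 && set_whole_file_lock(cfd, F_WRLCK) == 0);
        _exit(got ? 0 : 1);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    close(fd1);
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

On a system with the semantics quoted above, the function is expected to return 1.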
I came across an issue while using fcntl and flock recently that I felt I should report here as searching for either term shows this page near the top on both.
Be advised that BSD locks, as mentioned above, are advisory. For those who do not know, OS X (Darwin) is BSD. This must be remembered when opening a file to write into.
To use fcntl/flock you must first open the file and get its descriptor. However, if you have opened the file with "w", the file will instantly be zeroed out. If your process then fails to get the lock because the file is in use elsewhere, it will most likely return, leaving the file at 0 KB. The process that held the lock will now find that the file contents have vanished from underneath it; catastrophic results normally follow.
To remedy this situation, when using file locking, never open the file with "w"; instead open it with "a", to append. Then, if the lock is successfully acquired, you can safely clear the file as "w" would have, i.e.:
fseek(fileHandle, 0, SEEK_SET);//move to the start
ftruncate(fileno((FILE *) fileHandle), 0);//clear it out
This was an unpleasant lesson for me.
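The same idea can be written with plain descriptors instead of stdio. This is a sketch only: it swaps fopen(..., "a") for open() without O_TRUNC, uses flock() for the lock, and the function name is hypothetical.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Open `path` for writing and empty it only once the exclusive lock
 * is held. Returns the locked descriptor, or -1 if the file is locked
 * elsewhere or on error. The file is never truncated when the lock
 * cannot be obtained. */
int open_locked_for_rewrite(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);   /* note: no O_TRUNC */
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == -1 || ftruncate(fd, 0) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Closing the returned descriptor releases the lock.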
As you're only coding a daemon that uses it for mutual exclusion, they are equivalent; after all, your application only needs to be compatible with itself.
The trick with the file locking mechanisms is to be consistent - use one and stick to it. Varying them is a bad idea.
I am assuming here that the filesystem will be a local one - if it isn't, then all bets are off, NFS / other network filesystems handle locking with varying degrees of effectiveness (in some cases none)

Resources