Deleting a possibly locked file in C

I am using fcntl locks in C on Linux and face a dilemma: I need to delete a file that may be locked by other processes that also use the fcntl locking mechanism. What is the preferred way to handle a file that must be deleted? Should I simply delete it without regard for other processes that may hold read locks, or is there a better way?
Any help would be much appreciated.

On UNIX systems, it is possible to unlink a file while it is still open; doing so decrements the reference count on the file, but the actual file and its inode remain around until the reference count goes to zero.

As others have noted, you are free to delete the file even while you hold the lock.
Now, a cautionary note: you didn't mention why processes are locking this file, but you should be aware that if you are using that file for interprocess synchronization, deleting it is a good way to introduce subtle race conditions into your system, basically because there's no way to atomically create AND lock the file in a single operation.
For example, process A might create the file, with the intention of locking it immediately to do whatever updates it needs to do. However, there's nothing to prevent process B from grabbing the lock on the file first, then deleting the file, leaving process A with a handle to the now-deleted file. Process A will still be able to lock and update that file, but those updates will effectively be "lost" because the file has already been deleted.

Moreover, locks on UNIX systems are advisory by default, not mandatory, so locking a file does not prevent it from being opened or unlinked, only from being locked again.


Is it possible to have a shared global variable for inter-process communication?

I need to solve a concurrency assignment for my operating systems class. I don't want the solution here, but I am missing one part.
We should write a process that writes to a file, reads from it, and then deletes it. We should run this process twice in two different shells. No fork here, for simplicity. Process A should write, Process B should then read, and then the file should be deleted. Afterwards they switch roles.
I understand that you can achieve atomicity easily by locking. With while loops around the read and write sections, etc., you can also get further control. But when I run process A and then process B, process B will spin before the write section until it acquires the lock, and will not get to read when process A releases the lock. So my best guess is to have a read lock and a write lock. This information must be shared somehow between the processes. The only way I can think of is some global variable, but since both processes hold copies of the variables, I think this is not possible. Another way would be to have a read-lock file and a write-lock file, but that seems overly complicated to me.
Is there a better way?
You can use semaphores to ensure the writer and deleter wait for the previous process to finish its job (see man sem_init for details).
When multiple processes share semaphores, the semaphores should be created in shared memory (see man shm_open for details).
You will need as many semaphores as there are stages in the pipeline.
You can use a file as a lock. Two processes try to create a file with a previously agreed-upon name using the O_EXCL flag; only one will succeed, and the one that succeeds gets access to the resource. So in this case process A should try to create a file named, say, foo, with the O_EXCL flag and, if successful, go ahead and write the information to the data file. After its work is complete, process A should unlink foo. Process B should likewise try to create foo with O_EXCL and, if successful, read the file written by process A, then unlink foo once its attempt is over. That way only one process will be accessing the file at any time.
Your problem (with files and alternating roles in the creation/deletion of files) seems to be a candidate for the O_EXCL flag when opening/creating the file. This flag makes the open(2) system call succeed in creating the file only if the file doesn't already exist, so the file acts as a semaphore itself. Either process can release the lock, and whichever one does simply makes the owner role available again.
You will see that both processes try to take one of the roles, but if they both try to take the owner role, only one of them will succeed and the other will fail.
Just install a SIGINT signal handler in the owning process so that it can delete the file if it gets signalled; otherwise you will leave the file behind and no process will be able to assume the owner role afterwards (at least until you delete it manually).
This was the first form of locking in UNIX, long before semaphores, shared memory or other ways to block processes existed. It relies on the atomicity of system calls (two system calls cannot manipulate the same file simultaneously).

Linux: Reading file while other program might modify it

A program Foo periodically updates a file and calls my C program Bar to process the file.
The issue is that Foo might update the file, call Bar to process it, and while Bar reads the file, Foo might update it again.
Is it possible for Bar to read the file in inconsistent state, e.g. read first half of the file as written by first Foo and the other half as written by the second Foo? If so, how would I prevent that, assuming I can modify only Bar's code?
Typically, Foo should not simply rewrite the contents of the file again and again, but create a new temporary file and replace the old file with it when done (using rename()). In this case, simply opening the file (at any point in time) will give the reader a consistent snapshot of the contents, because of how typical POSIX filesystems work. (After opening the file, the file descriptor will refer to the same inode/contents even if the file gets deleted or replaced; the disk space will be released only after the last open file descriptor to a deleted/replaced file is closed.)
If Foo does rewrite the same file (without a temporary file) over and over, the recommended solution would be for both Foo and Bar to use fcntl()-based advisory locking. (However, using a temporary file and renaming/linking it over the actual file when complete, would be even better.)
(While flock()-based locking might seem easier, it is actually a bit of a guessing game whether it works on NFS mounts or not. fcntl() works, unless the NFS server is configured not to support locking. Which is a bit of an issue on some commercial web hosts, actually.)
If you cannot modify the behaviour of Foo, and it does not use advisory locking, there are still some options in Linux.
If Foo closes the file -- i.e., Bar is the only one to open it -- then taking an exclusive file lease (using fcntl(descriptor, F_SETLEASE, F_WRLCK)) is a workable solution. You can only get an exclusive file lease if descriptor is the only open descriptor on the file, and the owning user of the file is the same as the process UID (or the process has the CAP_LEASE capability). If any other process tries to open or truncate the file, the lease holder gets signaled (SIGIO by default) and has up to /proc/sys/fs/lease-break-time seconds to downgrade or release the lease. The opener is blocked for the duration, which allows Bar to either cancel the processing or copy the file for later processing.
The other option for Bar is rather violent. It can monitor the file say once per second, and when the file is old enough -- say, a few seconds --, pause Foo by sending it a SIGSTOP signal, checking /proc/FOOPID/stat until it gets stopped, and rechecking the file statistics to verify it's still old, until making a temporary copy of it (either in memory, or on disk) for processing. After the file is read/copied, Bar can let Foo continue by sending it a SIGCONT signal.
Some filesystems may support file snapshots, but in my opinion either of the above is much saner than relying on nonstandard filesystem support to function correctly. If Foo cannot be modified to cooperate, it is time to refactor it out of the picture. You do not want to be held hostage by a black box outside your control, so the sooner you replace it with something more user/administrator-friendly, the better off you'll be in the long term.
This is difficult to do robustly without Foo's cooperation.
Unixes have two main kinds of file locking:
range locking with fcntl(2)
always-whole-file locking with flock(2)
Ideally, you use either of these in cooperative mode (advisory locking), where all participants attempt to acquire the lock and only one will get it at a time.
Without the other program's cooperation, your only recourse, as far as I know, is mandatory locking, which you can get with fcntl if you enable it on the filesystem; however, the manpage notes that the Linux implementation is unreliable.
In all UN*X systems, what is guaranteed to happen atomically is an individual write(2) or read(2) system call. The kernel locks the file's inode in memory, so while you are read(2)ing or write(2)ing it, it will not change.
For atomicity over larger regions, you have to lock the file. You can use the file-locking tools available to lock different regions of a file. Some locks are advisory (a process can skip over them) and others are mandatory (you are blocked until the other side unlocks the file region).
See fcntl(2) and the commands F_GETLK (query an existing lock), F_SETLK (set a lock without blocking) and F_SETLKW (set a lock and wait for it).

Allow opening a file with open() only in one process, in c unix programming

I'm creating a client/server application. Registered users each have their own file. I need only one process of my client to be able to log in with a given username, so I think the best way to handle this is to forbid opening a file if it is already open in another process, but I don't know how to do it. Suggestions? Thanks!
I have thought about semaphores, but I don't think that is the best solution...
OK, I'll use flock(), thanks! ^^ But what kind of error will it give me after open()?
Check out the system command/shell command called flock.
So, as you want only one process accessing the open file, you'll use the LOCK_EX operation on the file descriptor (assuming you're using the system call).
Please go through the man pages: man flock for the shell command and man 2 flock for the system call.
If you have one multi-threaded server (but one process) that handles all the users, then it's best to just keep track of logged in users in memory. In that case, you can use mutexes (a type of semaphore) to make sure that one connection locks access to a particular user profile, and every time a new user connects, you can query your data structure. For instance, if you're using pthreads, you can define an array as follows, assuming each user has a sequential integer ID:
pthread_mutex_t YourServer::accountLocks[numberOfUsers]
If you have multiple separate processes that for some reason can't share memory, then lock files are an option. In that case, you'll have to be careful not to introduce race conditions, and you can use something like flock.
You can use flock() and set an exclusive lock on that file. Note, though, that flock() locks are advisory: they prevent other cooperating processes from acquiring the lock, not from opening the file, so every process that accesses the file must take the lock first.

Accessing a file by several processes

This is a design question more than a coding problem. I have a parent process that will fork many children. Each of the children is supposed to read and write the same text file.
How can we achieve this safely?
My thoughts:
Create the file pointer in the parent, then create a binary semaphore on it. Processes will compete to obtain the file pointer and write to the file. In the read case I don't need a semaphore.
Please tell me if I got it wrong.
I am using C under linux.
Thank you.
POSIX systems have kernel level file locks using fcntl and/or flock. Their history is a bit complicated and their use and semantics not always obvious but they do work, especially in simple cases. For locking an entire file, flock is easier to use IMO. If you need to lock only parts of a file, fcntl provides that ability.
As an aside, file locking over NFS is not safe on all (most?) platforms.
man 2 flock
man 2 fcntl
http://en.wikipedia.org/wiki/File_locking#In_Unix-like_systems
Also, keep in mind that file locks are "advisory" only. They don't actually prevent you from writing/reading/etc to a file if you bypass acquiring the lock.
If writers are appending data to the file, your approach seems fine (at least up until the file becomes too large for the file system).
If writers are doing file replacement, then I would approach it something like this:
The reading API would check the time of last modification (with fstat()) against a cached value. If the time has changed, the file is re-opened, and the cached modification time updated, before the read is performed.
The writing API would acquire a lock, and write to a temporary file. Then, the actual data file is replaced by calling rename(), after which the lock is released.
If writers can write anywhere in the file, then you probably want a more structured file than plain text, something closer to a database. In such a case, some kind of reader-writer lock should be used to manage data consistency and data integrity.

How to use a file as a mutex in Linux and C?

I have different processes concurrently accessing a named pipe in Linux and I want to make this access mutually exclusive.
I know is possible to achieve that using a mutex placed in a shared memory area, but being this a sort of homework assignment I have some restrictions.
Thus, what I thought about is to use locking primitives on files to achieve mutual exclusion; I made some try but I can't make it work.
This is what i tried:
flock(lock_file, LOCK_EX);
// critical section
flock(lock_file, LOCK_UN);
Different processes will use different file descriptors, but all referring to the same file.
Is it possible to achieve something like that? Can you provide some example?
The standard lock-file technique uses options such as O_EXCL on the open() call to try and create the file. You store the PID of the process using the lock, so you can determine whether the process still exists (using kill() to test). You have to worry about concurrency - a lot.
Steps:
Determine name of lock file based on name of FIFO
Open lock file if it exists
Check whether process using it exists
If other process exists, it has control (exit with error, or wait for it to exit)
If other process is absent, remove lock file
At this point, lock file did not exist when last checked.
Try to create it with open() and O_EXCL amongst the other options.
If that works, your process created the file - you have permission to go ahead.
Write your PID to the file; close it.
Open the FIFO - use it.
When done (atexit()?) remove the lock file.
Worry about what happens if you open the lock file and read no PID...is it that another process just created it and hasn't yet written its PID into it, or did it die before doing so? Probably best to back off - close the file and try again (possibly after a randomized nanosleep()). If you get the empty file multiple times (say 3 in a row) assume that the process is dead and remove the lock file.
You could consider having the process that owns the file maintain an advisory lock on the file while it has the FIFO open. If the lock is absent, the process has died. There is still a TOCTOU (time of check, time of use) window of vulnerability between opening the file and applying the lock.
Take a good look at the open() man page on your system to see whether there are any other options to help you. Sometimes, processes use directories (mkdir()) instead of files because even root can't create a second instance of a given directory name, but then you have issues with how to know the PID of the process with the resource open, etc.
I'd definitely recommend using an actual mutex (as has been suggested in the comments); for example, the pthread library provides an implementation. But if you want to do it yourself using a file for educational purposes, I'd suggest taking a look at this answer I posted a while ago which describes a method for doing so in Python. Translated to C, it should look something like this (Warning: untested code, use at your own risk; also my C is rusty):
// each instance of the process should have a different filename here
const char *process_lockfile = "/path/to/hostname.pid.lock";
// all processes should have the same filename here
const char *global_lockfile = "/path/to/lockfile";
// create the file if necessary (only once, at the beginning of each process)
FILE *f = fopen(process_lockfile, "w");
fprintf(f, "\n"); // or maybe write the hostname and pid
fclose(f);
// now, each time you have to lock the file:
int lock_acquired = 0;
while (!lock_acquired) {
    int r = link(process_lockfile, global_lockfile);
    if (r == 0) {
        lock_acquired = 1;
    } else {
        // on NFS, link() can report failure even when it succeeded,
        // so check the link count of the source file instead
        struct stat buf;
        stat(process_lockfile, &buf);
        lock_acquired = (buf.st_nlink == 2);
    }
}
// do your writing
unlink(global_lockfile);
lock_acquired = 0;
Your example is as good as you're going to get using flock(2) (which is, after all, merely an "advisory" lock, which is to say not a lock at all, really). The man page for it on my Mac OS X system has a couple of possibly important provisos:
Locks are on files, not file descriptors. That is, file descriptors duplicated through dup(2) or fork(2) do not result in multiple instances of a lock, but rather multiple references to a single lock. If a process holding a lock on a file forks and the child explicitly unlocks the file, the parent will lose its lock.
and
Processes blocked awaiting a lock may be awakened by signals.
both of which suggest ways it could fail.
// would have been a comment, but I wanted to quote the man page at some length
