I'm writing an app, and the specification says I need to lock
a file every time I write to it (this file will be read by other apps
that another team is working on):
I wrote the following function:
int lock_file(int fd)
{
    if (fd == -1)
        return -1;

    struct flock file_locker;
    file_locker.l_type = F_WRLCK;
    file_locker.l_whence = SEEK_SET;
    file_locker.l_start = 0;
    file_locker.l_len = 0; // 0 means: lock the entire file

    int locked = fcntl(fd, F_SETLK, &file_locker);
    if (locked == -1) {
        /* handle errors */
        return 0;
    }
    return 1;
}
I can get the 1 return (meaning everything is OK), but when I wrote a test case
I could still write to the locked file.
The test code was:
char *file = "lock_test_ok";
int fd = open(file, O_RDWR);
int locked = lock_file(fd);
/* call popen and try to write 'ERROR' to the file */
/* if the file contains ERROR, the test fails */
Locking in Unix is advisory: only programs that test for the lock will refrain from writing. (Some systems offer mandatory locking, but not this way; it usually involves setting special properties on the locked file.)
The lock is released when the first process exits and all of its file descriptors are closed.
Edit: I think I misunderstood the test scenario -- the popen() call won't be following the locking protocol (which is only advisory, and not enforced by the OS), so the write happens even though the process that called lock_file() still exists and is holding the lock.
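For illustration, here is a minimal sketch of how a cooperating process would test for the lock before writing, using F_GETLK (the file name is just the one from the test case above):

/* Sketch: probe for a conflicting lock without acquiring one. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("lock_test_ok", O_RDWR);
    if (fd == -1)
        return 1;

    struct flock probe;
    probe.l_type = F_WRLCK;    /* "would a write lock succeed?" */
    probe.l_whence = SEEK_SET;
    probe.l_start = 0;
    probe.l_len = 0;           /* the whole file */

    if (fcntl(fd, F_GETLK, &probe) == -1)
        return 1;

    if (probe.l_type == F_UNLCK)
        printf("no conflicting lock; safe to lock and write\n");
    else
        printf("file is write-locked by pid %d\n", (int)probe.l_pid);

    close(fd);
    return 0;
}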
In addition to what Jim said, fcntl locks are advisory. They do not prevent anyone from opening and writing to the file; the only thing they prevent is other processes acquiring their own fcntl locks.
If you control all writers to the file, this is fine, because you can just have every writer try to lock the file first (see the sketch below). Otherwise you're hosed: Unix does not, in general, offer "mandatory" locks (locks that cause open or write to fail).
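A minimal sketch of such a cooperating writer, assuming every writer in the system agrees to take the fcntl lock first (F_SETLKW blocks until the lock becomes available):

/* Sketch: a writer that honors the advisory-locking protocol.
   Only works if every other writer does the same. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int locked_write(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_APPEND);
    if (fd == -1)
        return -1;

    struct flock lk;
    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;                         /* lock the whole file */

    if (fcntl(fd, F_SETLKW, &lk) == -1) { /* wait for the lock */
        close(fd);
        return -1;
    }

    ssize_t written = write(fd, msg, strlen(msg));

    close(fd);  /* closing the descriptor releases the lock */
    return written == -1 ? -1 : 0;
}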
Related
I'm trying to trigger some concurrency conflicts by having several processes write to the same file, but I couldn't:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/wait.h>
void concurrent_write()
{
    // O_WRONLY added: O_CREAT | O_TRUNC alone leaves the access mode unspecified
    int create_fd = open("bar.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    close(create_fd);
    int repeat = 20;
    int num = 4;
    for (int process = 0; process < num; process++)
    {
        int rc = fork();
        if (rc == 0)
        {
            // child
            int write_fd = open("bar.txt", O_WRONLY | O_APPEND, 0644);
            for (int idx = 0; idx < repeat; idx++)
            {
                sleep(1);
                write(write_fd, "child writing\n", strlen("child writing\n"));
            }
            close(write_fd);
            exit(0);
        }
    }
    for (int process = 0; process < num; process++)
    {
        wait(NULL); // wait for all children to exit
    }
    printf("write to `bar.txt`\n%d lines written by %d processes\n", repeat * num, num);
    printf("wc:");
    if (fork() == 0)
    {
        // child
        char *args[3];
        args[0] = strdup("wc");
        args[1] = strdup("bar.txt");
        args[2] = NULL;
        execvp(args[0], args);
    }
}
int main(int argc, char *argv[])
{
    concurrent_write();
    return 0;
}
This program forks num children and then has each of them write repeat lines to a file. But every time (however I change repeat and num), I get the same result: the length of bar.txt (the output file) matches the total number of lines written. Why are no concurrency conflicts triggered?
Writing to a file can be divided into a two-step process:
Locate where you want to write.
Write data into the file.
Opening the file with the O_APPEND flag makes this two-step process atomic, so the number of lines in the file always matches the count you expect.
See the open(2) man page:
O_APPEND
The file is opened in append mode. Before each write(2),
the file offset is positioned at the end of the file, as
if with lseek(2). The modification of the file offset and
the write operation are performed as a single atomic step.
In essence, one of the major design features of O_APPEND is precisely to prevent the sort of "concurrent conflicts" you mention. The typical example would be a log file that several processes must write to. Using O_APPEND ensures their messages do not overwrite each other.
Moreover, all data written by a single write call is written atomically, so provided that your write("child writing\n") successfully writes all its bytes (which for a regular file it usually would), they will not be interleaved with the bytes of any other such message.
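For contrast, a sketch of what the child loop from the question would look like without O_APPEND (this reuses bar.txt and the repeat variable from the question's program). Here locating and writing are separate steps, so two children can end up positioned at the same offset and overwrite each other:

/* Sketch: the racy two-step alternative. Another process can write
   between the lseek() and the write(), so both may end up writing
   at the same offset and clobber each other. */
int write_fd = open("bar.txt", O_WRONLY);  /* no O_APPEND */
for (int idx = 0; idx < repeat; idx++)
{
    lseek(write_fd, 0, SEEK_END);          /* step 1: locate the end */
    sleep(1);                              /* widen the race window */
    write(write_fd, "child writing\n", strlen("child writing\n"));  /* step 2 */
}
close(write_fd);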
First, write() calls with the O_APPEND flag should be atomic. Per POSIX write():
If the O_APPEND flag of the file status flags is set, the file offset shall be set to the end of the file prior to each write and no intervening file modification operation shall occur between changing the file offset and the write operation.
But that alone is not enough when multiple threads or processes make parallel write() calls on the same file: it guarantees where each write starts, not that the writes themselves are atomic with respect to each other.
POSIX does, however, also guarantee that parallel write() calls are atomic:
All of the following functions shall be atomic with respect to each
other in the effects specified in POSIX.1-2017 when they operate on
regular files or symbolic links:
...
write()
...
See also Is file append atomic in UNIX?
Beware, though. Reading that question and its answers shows that Linux filesystems such as ext3 are not POSIX-compliant once you get past a relatively small write size, or possibly if you cross page and/or filesystem sector boundaries. I suspect XFS and ZFS will support write() atomicity much better, given their origins.
And none of this applies to Windows.
I'm writing a C Linux program for college, using low-level file I/O (I have to use fcntl, basically).
I need to lock 8 bytes past the end of the file, to append some new data. Trying this as below returns an Invalid argument error.
struct flock field_lock;
field_lock.l_type = F_WRLCK;
field_lock.l_whence = SEEK_CUR;
field_lock.l_start = 0;
field_lock.l_len = 2 * sizeof(int);
// ...
lseek(stocks_fd, 0, SEEK_END);
// ...
fcntl(stocks_fd, F_SETLKW, &field_lock);
Any idea how else I could achieve this, or what I'm doing wrong?
edit 1: https://gist.github.com/limelier/5a7ba8ab166a1f586a3c4feec355b83b
Entire program with an improvement applied to the EOF locks and writing logic, as recommended in the comments below. Same problem still exists, though, and a sample output has been appended.
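For what it's worth, here is a sketch of one variant to try (an assumption, not a confirmed fix for the EINVAL): zero-initialize the struct and use l_whence = SEEK_END so the kernel computes the range relative to end-of-file, with no separate lseek() needed. POSIX allows locks that start at or extend past EOF:

/* Sketch: lock the 2*sizeof(int) bytes immediately past current EOF.
   memset avoids passing garbage in fields the code never sets. */
struct flock field_lock;
memset(&field_lock, 0, sizeof field_lock);
field_lock.l_type = F_WRLCK;
field_lock.l_whence = SEEK_END;  /* offset is relative to end of file */
field_lock.l_start = 0;          /* start exactly at EOF */
field_lock.l_len = 2 * sizeof(int);

if (fcntl(stocks_fd, F_SETLKW, &field_lock) == -1)
    perror("fcntl(F_SETLKW)");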
I have two functions. The first function opens a file in write mode, writes some content to it, and then closes it.
FILE *fp = fopen("file.txt", "w");
// writing into the file using fwrite
fclose(fp);
The second function opens the file in read mode, parses the content and then closes the file.
FILE *fp = fopen("file.txt", "r");
//parsing logic
fclose(fp);
In main, I am calling function1 and function2 sequentially.
int main()
{
    function1();
    function2();
    return 1;
}
Sometimes, the fopen in function1 fails with error number 13, i.e. Permission Denied. I observe this only occasionally. I introduced a 2-second sleep in function1 after fclose and it started working fine without any issues.
So I suspect the file is not released immediately after fclose. Sleep is not the right solution. Can anyone suggest how to resolve this problem? The example I have given here is a simplified use case; the actual code runs in a threaded environment.
Draft N1570 for C11 says in 7.21.5.1 The fclose function:
A successful call to the fclose function causes the stream pointed to by stream to be
flushed and the associated file to be closed. Any unwritten buffered data for the stream
are delivered to the host environment to be written to the file; any unread buffered data
are discarded. Whether or not the call succeeds, the stream is disassociated from the file
and any buffer set by the setbuf or setvbuf function is disassociated from the stream
(and deallocated if it was automatically allocated).
It makes no guarantee about what happens at the host environment level, that is, whether the function returns only once the whole operation has finished, or as soon as a request has been queued.
Since race conditions can happen in your environment, you should retry a failed open a number of times, possibly with a delay between attempts. If portability is not a concern and your system supports the POSIX sync function, you can also force a disk synchronization of the file after closing it:
Close part:
    ...
    fclose(fp);
    sync();  // force synchronization of I/O buffers to disk
Re-open part:
    ntries = ...;  // number of open tries
    while (ntries-- > 0) {
        fp = fopen(...);
        if (fp != NULL) break;  // open was successful
        // optionally add a delay here
    }
In an environment and with a C implementation where you must accommodate such behavior, the best approach is probably to implement some fault tolerance around the fopen()s. Although an unconditional sleep() is not the right answer, short, conditional delays via sleep() or a similar function may indeed be part of such a strategy. For example, you might do something along these lines:
#include <stdio.h>
#include <errno.h>
#include <unistd.h>  // for sleep()

#define MAX_ATTEMPTS 3

FILE *tolerant_fopen(const char *path, const char *mode) {
    FILE *result;
    int attempts = 0;

    while (!(result = fopen(path, mode))) {
        if (errno != EACCES || attempts >= MAX_ATTEMPTS) {
            break;
        }
        if (sleep(1) == 0) {
            attempts += 1;
        }
    }
    return result;
}
That attempts to open the file immediately, and in the event that it fails on account of access permissions, it waits a short time and then makes another attempt. Overall it may make up to four attempts to open the file (the initial try plus MAX_ATTEMPTS retries), spaced up to a second apart or perhaps slightly more. (Note that sleep() can be interrupted early; in that case it returns the number of seconds left to sleep, and the attempt is not counted.)
You can of course implement a different strategy for the timing and duration of the retries if you prefer, retry on more error conditions, etc.
If a child process is spawned (even from another thread) while a file is open, then the child process will inherit the file handle and the file will not be fully closed until after the child process has terminated.
To prevent this behavior, pass the "N" flag to fopen (this is a Microsoft C runtime extension; with glibc the closest equivalent is the "e" flag, which sets O_CLOEXEC):
FILE *fp = fopen("file.txt", "wN");
The man pages ( https://linux.die.net/man/2/flock ) are not clear regarding whether a LOCK_UN operation on flock() is allowed if unlock has already been called in another thread. In my case, multiple threads could be reading the same file via multiple file descriptors, each of which would call flock(fd, LOCK_SH) and then flock(fd, LOCK_UN) after reading. Would this produce undefined behavior?
For example:
// Two duplicate threads do:
FILE *fp = fopen("path", "r");
if (fp) {
    int fd = fileno(fp); // there is much more complexity in-between that makes fopen easier to use
    flock(fd, LOCK_SH);
    my_read_function(fp); // does something with a global (not the case in actual code)
    fclose(fp); // apparently the file might already be unlocked here if only this thread has opened it, but assume that I may use C++ to create some safer wrapper to handle exceptions and decide to use explicit flock calls outside the destructor
    flock(fd, LOCK_UN);
}
Also, I am not convinced that flock is necessary at all, given that the threads only read. Is it good practice to use flock() in this way nonetheless?
Thank you in advance.
I am working on a multithreaded system where a file can be shared among different threads based on the file access permissions.
How can I check if file is already opened by another thread?
To find out if a named file is already opened on Linux, you can scan the /proc/self/fd directory to see if the file is associated with an open file descriptor. The program below sketches out a solution:
DIR *d = opendir("/proc/self/fd");
if (d) {
    struct dirent *entry;
    struct dirent *result;
    entry = malloc(sizeof(struct dirent) + NAME_MAX + 1);
    result = 0;
    while (readdir_r(d, entry, &result) == 0) {
        if (result == 0) break;
        if (isdigit(result->d_name[0])) {
            char path[NAME_MAX + 1];
            char buf[NAME_MAX + 1];
            snprintf(path, sizeof(path), "/proc/self/fd/%s",
                     result->d_name);
            ssize_t bytes = readlink(path, buf, sizeof(buf) - 1);
            if (bytes < 0) continue;  // unreadable link: skip it
            buf[bytes] = '\0';
            if (strcmp(file_of_interest, buf) == 0) break;
        }
    }
    free(entry);
    closedir(d);
    if (result) return FILE_IS_FOUND;
}
return FILE_IS_NOT_FOUND;
From your comment, it seems what you want to do is to retrieve an existing FILE * if one has already been created by a previous call to fopen() on the file. There is no mechanism provided by the standard C library to iterate through all currently opened FILE *. If there was such a mechanism, you could derive its file descriptor with fileno(), and then query /proc/self/fd/# with readlink() as shown above.
This means you will need to use a data structure to manage your open FILE *s. Probably a hash table using the file name as the key would be the most useful for you.
If you want to do it from the shell, you can simply use lsof $filename.
You can use int flock(int fd, int operation); to mark a file as locked and also to check if it is locked.
Apply or remove an advisory lock on the open file specified by fd.
The argument operation is one of the following:
LOCK_SH Place a shared lock. More than one process may hold a
shared lock for a given file at a given time.
LOCK_EX Place an exclusive lock. Only one process may hold an
exclusive lock for a given file at a given time.
LOCK_UN Remove an existing lock held by this process.
flock should work in a threaded app if you open the file separately in each thread:
multiple threads able to get flock at the same time
There's more information about flock and its potential weaknesses here.
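A minimal sketch of that pattern, assuming each thread calls open() itself so that each gets its own open file description for flock() to act on (the file name threads.txt is just illustrative):

/* Sketch: per-thread open() + flock(). Each open() creates a separate
   open file description, so LOCK_EX in one thread blocks the other. */
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

static void *writer(void *arg)
{
    const char *msg = arg;
    int fd = open("threads.txt", O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd == -1)
        return NULL;

    flock(fd, LOCK_EX);   /* blocks until the other thread unlocks */
    write(fd, msg, strlen(msg));
    flock(fd, LOCK_UN);

    close(fd);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, writer, "first thread\n");
    pthread_create(&t2, NULL, writer, "second thread\n");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}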
I don't know much in the way of multithreading on Windows, but you have a lot of options if you're on Linux. Here is a FANTASTIC resource. You might also take advantage of any file-locking features offered inherently or explicitly by the OS (e.g. fcntl). More on Linux locks here. Creating and manually managing your own mutexes offers you more flexibility than you would otherwise have. user814064's comment about flock() looks like the perfect solution, but it never hurts to have options!
Added a code example:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

FILE *fp;
int counter;
pthread_mutex_t fmutex = PTHREAD_MUTEX_INITIALIZER;

void *foo(void *arg) {
    // pthread_mutex_trylock() checks if the mutex is
    // locked without blocking:
    //int busy = pthread_mutex_trylock(&fmutex);

    // this blocks until the lock is released
    pthread_mutex_lock(&fmutex);
    fprintf(fp, "counter = %d\n", counter);
    printf("counter = %d\n", counter);
    counter++;
    pthread_mutex_unlock(&fmutex);
    return NULL;
}

int main() {
    counter = 0;
    fp = fopen("threads.txt", "w");

    pthread_t thread1, thread2;
    if (pthread_create(&thread1, NULL, &foo, NULL))
        printf("Error creating thread 1");
    if (pthread_create(&thread2, NULL, &foo, NULL))
        printf("Error creating thread 2");

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    fclose(fp);
    return 0;
}
If you need to determine whether another thread opened a file instead of knowing that a file was already opened, you're probably doing it the wrong way.
In a multithreaded application, you want to manage resources used in common in a list accessible by all the threads. That list needs to be managed in a multithread safe manner. This just means you need to lock a mutex, do things with the list, then unlock the mutex. Further, reading/writing to the files by more than one thread can be very complicated. Again, you need locking to do that safely. In most cases, it's much easier to mark the file as "busy" (a.k.a. a thread is using that file) and wait for the file to be "ready" (a.k.a. no thread is using it).
So, assuming you have some form of linked list implementation, you can search the list in a way similar to:
my_file *my_file_find(const char *filename)
{
    my_file *l, *result = NULL;

    pthread_mutex_lock(&fmutex);
    l = my_list_of_files;
    while (l != NULL)
    {
        if (strcmp(l->filename, filename) == 0)
        {
            result = l;
            break;
        }
        l = l->next;
    }
    pthread_mutex_unlock(&fmutex);

    return result;
}
If the function returns NULL, then no other thread had the file open during the search (since the mutex is unlocked on return, another thread could still open the file right after the function returns). If you need to open the file in a safe manner (i.e. only one thread can open file filename), then you need a my_file_open() function which locks the mutex, searches, adds a new my_file if it was not found, and returns that newly added my_file pointer; a sketch follows below. If the file already exists in the list, my_file_open() probably returns NULL, meaning it could not open the file for that one thread's exclusive use (i.e. another thread is already using it).
Just remember that you can't unlock the mutex between the search and the add. So you can't use the my_file_find() function above without first getting a lock on your mutex (in which case you probably want to have recursive mutexes).
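A minimal sketch of that my_file_open(), under the same assumptions as my_file_find() above (the my_file type with filename and next fields, the my_list_of_files head, and fmutex; the allocation details are simplified):

/* Sketch: atomically "claim" a filename. Returns the new node, or NULL
   if some thread already has the file in the list. Search and add
   happen under a single mutex hold, so no other thread can slip in. */
my_file *my_file_open(const char *filename)
{
    my_file *l, *result = NULL;

    pthread_mutex_lock(&fmutex);
    for (l = my_list_of_files; l != NULL; l = l->next)
    {
        if (strcmp(l->filename, filename) == 0)
            break;  /* already in use by another thread */
    }
    if (l == NULL)
    {
        /* not found: add it while still holding the mutex */
        result = malloc(sizeof(my_file));
        if (result != NULL)
        {
            result->filename = strdup(filename);
            result->next = my_list_of_files;
            my_list_of_files = result;
        }
    }
    pthread_mutex_unlock(&fmutex);

    return result;
}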
In other words, you can search the existing list, grow it, and shrink it (a.k.a. close a file) only if you first lock the mutex, do ALL THE WORK, then unlock the mutex.
This is valid for any kind of resources, not just files. It could be memory buffers, a graphical interface widget, a USB port, etc.