When does file stream locking occur in glibc?

Reading the glibc documentation, I recently learned that calls to getc may have to wait to acquire a lock to read a file. I wanted to verify that when using buffering a lock is only acquired when the actual file needs to be read to replenish the buffer.
Thanks!

The lock invoked by getc provides application-level locking of the stdio FILE object, to allow thread-safe access to the same FILE object by multiple threads in the same application. As such, it will need to be acquired every time a character is read, not just when the buffer is replenished.
But, if you aren't accessing the FILE from multiple threads, you'll never have to wait for the lock. If the overhead of acquiring/releasing the lock is too much (measure this; don't just assume), you also have the option of manually locking/unlocking using flockfile and funlockfile, then using getc_unlocked.
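A minimal sketch of the manual-locking option, assuming a hypothetical helper named count_bytes_locked_once: it takes the stream lock once with flockfile, reads with getc_unlocked inside the loop, and releases the lock with funlockfile.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining in fp while holding the
 * stream lock once, instead of locking and unlocking on every getc(). */
long count_bytes_locked_once(FILE *fp)
{
    long n = 0;
    int c;

    assert(fp != NULL);

    flockfile(fp);                          /* take the FILE lock one time */
    while ((c = getc_unlocked(fp)) != EOF)  /* no per-character locking */
        n++;
    funlockfile(fp);                        /* release it after the loop */

    return n;
}
```

Whether this is measurably faster than plain getc depends on the workload, so it is worth timing both versions before committing to it.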

Related

about locking in fread/fwrite and called from different processes

It seems that on Linux, calls to fread and fwrite take a lock, since the man pages (man fwrite) mention unlocked variants of these functions (unlocked_stdio).
As far as you are aware, do these locks apply across processes, or do they lock only within the same process?
They don't even lock within the process. They only lock the actual stream object on which you call them. If, for example, you have two FILE* objects that reference the same underlying file or terminal, fread and fwrite will happily allow them to trample each other, even in the same process.

Making all the children sleep from another child thread

I am trying to develop a program with POSIX threads in which I have a child thread that updates the contents of a file and a database at certain intervals, and other child threads that read data from the file and database all the time. I don't want any thread to read the file or database while the single updater thread is writing them. My idea is to have the updater thread put all the other child threads to sleep while it updates the file and database, but sleep() only makes the calling thread sleep. Is there any way the above scenario can be implemented?
EDIT:
I have two different functions for reading and writing the file. Most of the threads access the read method so they aren't vulnerable but they might be if they try to read in between while the periodic thread which accesses the write method is updating the file's contents.
You do not want to use sleep for this at all. Instead, use a reader/writer lock. The updater thread must acquire the lock (in write mode) before it modifies the data. And the other threads must acquire the lock (in read mode) before reading the data.
Note that if your reader threads are reading continuously, the writer will get starved and never acquire the lock. So you will need some separate mechanism such as a flag the updater can set that tells the readers to please stop reading and release their locks. If the readers only read occasionally this shouldn't be such an issue (unless there are tons of readers in which case you may have an architectural problem).
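A minimal sketch of the reader/writer lock approach with pthreads. The struct shared_record type and the record_* function names are hypothetical, chosen only for illustration; the file and database from the question are stood in for by two integer fields.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state guarded by a reader/writer lock. */
struct shared_record {
    pthread_rwlock_t lock;
    int version;
    int payload;
};

int record_init(struct shared_record *r)
{
    assert(r != NULL);
    r->version = 0;
    r->payload = 0;
    return pthread_rwlock_init(&r->lock, NULL);
}

/* Updater thread: exclusive access while both fields change together. */
int record_update(struct shared_record *r, int payload)
{
    int rc = pthread_rwlock_wrlock(&r->lock);
    if (rc != 0)
        return rc;
    r->payload = payload;
    r->version++;
    return pthread_rwlock_unlock(&r->lock);
}

/* Reader threads: shared access; several readers may hold the lock at once. */
int record_read(struct shared_record *r, int *version, int *payload)
{
    int rc = pthread_rwlock_rdlock(&r->lock);
    if (rc != 0)
        return rc;
    *version = r->version;
    *payload = r->payload;
    return pthread_rwlock_unlock(&r->lock);
}
```

Readers never see a half-applied update, and no thread has to be put to sleep explicitly: a reader simply blocks in pthread_rwlock_rdlock while the updater holds the write lock.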

Unlocking a mutex after calling trylock()

I have a threaded server that can add/append/read files and relay data to the client.
If a file is being added, no other thread can append/read it. If a file is being appended, no other thread can append/read it. If a file is being read, no other thread can append to it. However, if a file is being read, other threads can read it.
Currently I have a mutex system that will do this, except it won't allow multiple reads.
To fix this, in the read method, I will change:
pthread_mutex_lock(&(fm->mutex));//LOCK
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
to
pthread_mutex_trylock(&(fm->mutex));//TRYLOCK [NonBlocking, so the thread can continue the read]
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
Question
How can I unlock the file without allowing the other methods (just append really) to begin writing to the file before all the other read()'s have finished?
Example
For example, if the reading thread that originally locked the file completes and unlocks the file and there are still other threads trying to read the file, then an appending thread gets the chance to lock the file and begin appending while the others are still reading, which is a no-no.
Idea
I want to keep a count of the number of threads currently reading a file. When a thread finishes, reduce the count. If the count is 0, meaning no threads are still reading, unlock the file. But, I'm worried that this would not be thread safe. If this is a viable solution, how could I make it thread safe? Another but, I believe only the original thread can successfully unlock the mutex.
It sounds like you may be looking for a read-write lock, which is provided by pthreads. It allows two modes of locking: a shared/read-lock mode, which can be locked by multiple threads at once, and an exclusive/write-lock mode, where the lock call won't return until all other threads (readers and writers) have given up their hold on the lock.
You could use a semaphore instead of the mutex (see this link about the differences). The semaphore does thread-safe synchronized counting for you.
You can live without an additional mutex to lock the file for writing if you limit the number of simultaneous read accesses to a (sufficiently large) number N and require a writer to acquire all N units of the semaphore for write access. This way you can only gain write access if the number of readers is zero, and all other readers will be locked out until your writer has finished.
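A rough sketch of the counting-semaphore idea, assuming Linux and unnamed POSIX semaphores. MAX_READERS (the N above), slots, writer_gate, and all function names are hypothetical. One thing this sketch adds that the description above does not mention: a small mutex that serializes writers while they collect units, because two writers each holding part of the N units would otherwise block each other forever.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define MAX_READERS 64   /* hypothetical N: upper bound on concurrent readers */

static sem_t slots;      /* counts free reader slots */
static pthread_mutex_t writer_gate = PTHREAD_MUTEX_INITIALIZER;

/* sem_wait() can be interrupted by a signal; retry in that case. */
static int take_slot(void)
{
    int rc;
    do {
        rc = sem_wait(&slots);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

int slots_init(void) { return sem_init(&slots, 0, MAX_READERS); }

/* A reader occupies one slot for the duration of the read. */
int read_begin(void) { return take_slot(); }
int read_end(void)   { return sem_post(&slots); }

/* A writer must collect every slot, so it proceeds only with zero readers. */
int write_begin(void)
{
    pthread_mutex_lock(&writer_gate);   /* one writer collects at a time */
    for (int i = 0; i < MAX_READERS; i++) {
        if (take_slot() != 0) {
            while (i-- > 0)             /* give back what was collected */
                sem_post(&slots);
            pthread_mutex_unlock(&writer_gate);
            return -1;
        }
    }
    pthread_mutex_unlock(&writer_gate);
    return 0;
}

int write_end(void)
{
    for (int i = 0; i < MAX_READERS; i++)
        if (sem_post(&slots) != 0)
            return -1;
    return 0;
}

/* Number of free slots right now (for diagnostics only). */
int slots_available(void)
{
    int value = 0;
    int rc = sem_getvalue(&slots, &value);
    assert(rc == 0);
    (void)rc;
    return value;
}
```

A pthread read-write lock expresses the same policy with less code; this version is mainly useful when the reader limit itself is wanted.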
Note that the POSIX documentation for pthread_mutex_lock() says:
If successful, the pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() functions shall return zero; otherwise, an error number shall be returned to indicate the error.
Since you don't show your code testing the return values, you don't know whether your lock operations (in particular) succeeded or not.
Separately, since you want a read/write lock, why not use one:
pthread_rwlock_rdlock()
pthread_rwlock_wrlock()
pthread_rwlock_unlock()
pthread_rwlock_init()
pthread_rwlock_destroy()
There are four pthread_rwlockattr_*() functions and a total of 9 pthread_rwlock_*() functions; I only listed the most important functions in the family.
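To illustrate the point about return values, here is a small sketch using the non-blocking pthread_rwlock_trywrlock. The wrapper name try_write_lock is hypothetical. The caller must look at the result: a non-zero return means the lock was not acquired, so the protected work and the matching unlock must both be skipped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: try to take the write lock without blocking and
 * report why it failed. Returns 0 only if the caller now holds the lock. */
int try_write_lock(pthread_rwlock_t *lock)
{
    assert(lock != NULL);

    int rc = pthread_rwlock_trywrlock(lock);
    if (rc == EBUSY)
        fprintf(stderr, "lock is held elsewhere, not writing\n");
    else if (rc != 0)
        fprintf(stderr, "pthread_rwlock_trywrlock: %s\n", strerror(rc));
    return rc;
}
```

The same check applies to pthread_mutex_trylock in the question: calling it, ignoring EBUSY, and then unlocking means unlocking a mutex the thread never owned.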

Accessing a file by several processes

This is a design question more than a coding problem. I have a parent process that will fork many children. Each of the children is supposed to read and write the same text file.
How can we achieve this safely?
My thoughts:
Create the file pointer in the parent, then create a binary semaphore on it. The processes will compete to obtain the file pointer and write to the file. In the read case I don't need a semaphore.
Please tell me if I got it wrong.
I am using C under Linux.
Thank you.
POSIX systems have kernel level file locks using fcntl and/or flock. Their history is a bit complicated and their use and semantics not always obvious but they do work, especially in simple cases. For locking an entire file, flock is easier to use IMO. If you need to lock only parts of a file, fcntl provides that ability.
As an aside, file locking over NFS is not safe on all (most?) platforms.
man 2 flock
man 2 fcntl
http://en.wikipedia.org/wiki/File_locking#In_Unix-like_systems
Also, keep in mind that file locks are "advisory" only. They don't actually prevent you from writing/reading/etc to a file if you bypass acquiring the lock.
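A minimal sketch of whole-file locking with flock, assuming Linux/glibc. The function name append_line_locked is hypothetical. Each caller opens the file itself rather than reusing a descriptor inherited from the parent, because flock locks belong to the open file description and descriptors duplicated by fork() share one lock.

```c
#define _DEFAULT_SOURCE 1   /* flock() is a BSD extension in glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: append one line to path while holding an exclusive
 * advisory lock on the whole file. Returns 0 on success, -1 on error. */
int append_line_locked(const char *path, const char *line)
{
    assert(path != NULL && line != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) {
        perror("open");
        return -1;
    }
    if (flock(fd, LOCK_EX) == -1) {   /* blocks until the lock is granted */
        perror("flock");
        close(fd);
        return -1;
    }

    size_t len = strlen(line);
    ssize_t written = write(fd, line, len);

    flock(fd, LOCK_UN);               /* close() would also release it */
    close(fd);
    return (written == (ssize_t)len) ? 0 : -1;
}
```

Because the lock is advisory, this only protects the file if every writer goes through the same locking path.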
If writers are appending data to the file, your approach seems fine (at least up until the file becomes too large for the file system).
If writers are doing file replacement, then I would approach it something like this:
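The reading API would check the time of last modification against a cached value, using stat() on the path rather than fstat() on a descriptor it already holds open, because after rename() the old descriptor still refers to the replaced file. If the time has changed, the file is re-opened, and the cached modification time updated, before the read is performed.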
The writing API would acquire a lock, and write to a temporary file. Then, the actual data file is replaced by calling rename(), after which the lock is released.
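The two halves of the replacement scheme might look roughly like this. All names (struct file_stamp, replace_file, file_changed) are hypothetical. Two assumptions of this sketch: the writer lock is left to the caller, and the reader checks the path with stat() and also compares the inode number, since a replaced file is a different inode even when two replacements land on the same timestamp.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical cache of what the reader last saw at the path. */
struct file_stamp {
    ino_t ino;
    struct timespec mtime;
};

/* Writer side: write the new contents to tmp_path, then rename() it over
 * path. tmp_path must be on the same filesystem as path. The caller is
 * assumed to hold whatever lock serializes writers. */
int replace_file(const char *path, const char *tmp_path, const char *contents)
{
    assert(path != NULL && tmp_path != NULL && contents != NULL);

    FILE *fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return -1;
    if (fputs(contents, fp) == EOF) {
        fclose(fp);
        remove(tmp_path);
        return -1;
    }
    if (fclose(fp) == EOF) {          /* flush errors surface here */
        remove(tmp_path);
        return -1;
    }
    return rename(tmp_path, path);    /* readers see old or new, never a mix */
}

/* Reader side: returns 1 if the file at path differs from *cached (and
 * updates *cached), 0 if unchanged, -1 on error. */
int file_changed(const char *path, struct file_stamp *cached)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    if (st.st_ino == cached->ino &&
        st.st_mtim.tv_sec == cached->mtime.tv_sec &&
        st.st_mtim.tv_nsec == cached->mtime.tv_nsec)
        return 0;

    cached->ino = st.st_ino;
    cached->mtime = st.st_mtim;
    return 1;
}
```

A reader would call file_changed before each read and re-open the file whenever it returns 1.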
If writers can write anywhere in the file, then you probably want a more structured file than just plain text, similar to a database. In such a case, some kind of reader-writer lock should be used to manage data consistency and data integrity.

flock(), then fgets(): low-level locks, then stdio read/write library functions. Is it possible?

I'm writing a web server.
Each connection is served by a separate thread, so I don't know in advance the number of threads.
There is also a group of text files (I don't know how many of those either), and each thread can read/write each file.
A file can be written by just one thread a time, but different threads can write on different files at the same time.
If a file is read by one or more threads (reads can be concurrent), no thread can write on THAT file.
Now, I noticed this (Thread safe multi-file writing) solution, but I'd like also to use functions as fgets(), for example.
So, can I flock() a file, and then use a fgets() or another stdio read/write library function?
First of all, use fcntl, not flock. The latter is a non-standard, deprecated BSD function and does not work with NFS and possibly other filesystems. fcntl locking on the other hand is POSIX standard and is intended to work everywhere.
Now if you want to use file-level reader-writer locking mixed with stdio, it will work, but you have to take some care to ensure that buffering does not break your assumptions about locks. The method I'm about to explain is not the only one, but I believe it's the clearest/simplest:
When you want to operate on one of your files with stdio, obtaining the correct type of lock (read or write, aka shared or exclusive) should be the first thing you do after fopen. Use fileno to get the file descriptor number and apply the lock to it. After that, perform your entire read or write operation. Do not make any attempt to unlock the file; instead, call fclose to close the file and let it be implicitly unlocked when it's closed. Otherwise you may release the lock while buffered data is still unwritten, or later read data that was buffered before the lock was released and is no longer valid after the lock is released.
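The ordering described above (fopen, lock via fileno, do all the I/O, fclose) can be sketched as follows. The names lock_stream, read_first_line, and append_line are hypothetical. One caveat: traditional fcntl record locks are owned by the process, so this sketch shows the lock-then-stdio ordering and exclusion between processes; it does not by itself exclude threads of the same process from each other.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: lock the whole file behind fp. type is F_RDLCK
 * (shared) or F_WRLCK (exclusive). Blocks until the lock is granted. */
static int lock_stream(FILE *fp, short type)
{
    struct flock fl = {0};

    assert(fp != NULL);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                 /* 0 means "through end of file" */
    return fcntl(fileno(fp), F_SETLKW, &fl);
}

/* Read the first line under a shared lock. Returns 0 on success. */
int read_first_line(const char *path, char *buf, int size)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (lock_stream(fp, F_RDLCK) == -1) {   /* lock before any stdio read */
        fclose(fp);
        return -1;
    }
    int ok = (fgets(buf, size, fp) != NULL);
    fclose(fp);                   /* closing the file drops the lock */
    return ok ? 0 : -1;
}

/* Append a line under an exclusive lock. Returns 0 on success. */
int append_line(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");  /* "a" does not truncate before the lock */
    if (fp == NULL)
        return -1;
    if (lock_stream(fp, F_WRLCK) == -1) {
        fclose(fp);
        return -1;
    }
    int ok = (fputs(line, fp) != EOF);
    if (fclose(fp) == EOF)        /* flushes the buffer, then unlocks */
        ok = 0;
    return ok ? 0 : -1;
}
```

Opening with "w" would truncate the file before the lock is held, which is why the writer here uses append mode.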
