I am trying to develop a program with POSIX threads in which one child thread updates the contents of a file and a database at certain intervals, while other child threads read data from the file and database all the time. I don't want any thread to read the file or database while they are being written by the single updater thread. My idea is to have the updater thread put all the other child threads to sleep while it updates the file and database, but sleep() only makes the calling thread sleep. Is there any way to implement this scenario?
EDIT:
I have two different functions for reading and writing the file. Most of the threads access the read method so they aren't vulnerable but they might be if they try to read in between while the periodic thread which accesses the write method is updating the file's contents.
You do not want to use sleep for this at all. Instead, use a reader/writer lock. The updater thread must acquire the lock (in write mode) before it modifies the data. And the other threads must acquire the lock (in read mode) before reading the data.
Note that if your reader threads are reading continuously, the writer will get starved and never acquire the lock. So you will need some separate mechanism such as a flag the updater can set that tells the readers to please stop reading and release their locks. If the readers only read occasionally this shouldn't be such an issue (unless there are tons of readers in which case you may have an architectural problem).
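A minimal sketch of the reader/writer lock combined with the "please stop reading" flag described above, assuming a single updater thread. The names read_shared, update_shared, writer_waiting and shared_value are hypothetical, and shared_value merely stands in for the file and database.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_rwlock_t data_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_bool writer_waiting;  /* hypothetical flag: updater wants the lock */
static int shared_value;            /* stand-in for the file/database contents */

/* Reader: back off while the updater is waiting, then read under a read lock. */
int read_shared(void)
{
    while (atomic_load(&writer_waiting))
        sched_yield();
    int rc = pthread_rwlock_rdlock(&data_lock);
    assert(rc == 0);
    int v = shared_value;
    rc = pthread_rwlock_unlock(&data_lock);
    assert(rc == 0);
    return v;
}

/* Updater: announce intent, take the write lock, modify, release. */
void update_shared(int v)
{
    atomic_store(&writer_waiting, true);
    int rc = pthread_rwlock_wrlock(&data_lock);
    assert(rc == 0);
    shared_value = v;
    rc = pthread_rwlock_unlock(&data_lock);
    assert(rc == 0);
    atomic_store(&writer_waiting, false);
}
```

With more than one updater, a plain boolean flag would not be enough and a counter of waiting writers would be needed instead.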
Related
I'm experimenting with a fictional server/client application where the client side launches request threads over a (possibly very long) period of time, with small delays in between. Each request thread writes the contents of the request to the 'public' fifo (known by all client and server threads) and receives the server's answer in a 'private' fifo that is created by the server with a name that is implicitly known (in my case, it's 'tmp/processId.threadId').
The public fifo is opened once in the main (request thread spawner) thread so that all request threads may write to it.
Since I don't care about the return value of my request threads and I can't make sure how many request threads I create (so that I store their ids and join them later), I opted to create the threads in a detached state, exit the main thread when the specified timeout expires and let the already spawned threads live on their own.
All of this is fine, however, I'm not closing the public fifo anywhere after all spawned request threads finish: after all, I did exit the main thread without waiting. Is this a small kind of disaster, in which case I absolutely need to count the active threads (perhaps with a condition variable) and close the fifo when it's 0? Should I just accept that the file is not explicitly getting closed, and let the OS do it?
Supposing that you genuinely mean a FIFO, such as might be created via mkfifo(), no, it's not a particular issue that the process does not explicitly close it. If any open handles on it remain when the process terminates, they will be closed. Depending on the nature of the termination, it might be that pending data are not flushed, but that is of no consequence if the FIFO is used only for communication among the threads of one process.
But it possibly is an issue that the process does not remove the FIFO. A FIFO has filesystem persistence. Once you create one, it lives until it no longer has any links to the filesystem and is no longer open in any process (like any other file). Merely closing it does not cause it to be removed. Aside from leaving clutter on your filesystem, this might cause issues for concurrent or future runs of the program.
If indeed you are using your FIFOs only for communication among the threads of a single process, then you would probably be better served by pipes.
I managed to solve this issue by setting up a cleanup routine with atexit, which is called when the process terminates, i.e. when all threads have finished their work.
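A rough sketch of such an atexit cleanup routine, which also unlinks the FIFO so that it does not persist on the filesystem. The function names and the idea of passing the path and open flags as parameters are assumptions for illustration, not the poster's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

static char fifo_path[256];  /* remembered so the handler can unlink it */
static int fifo_fd = -1;

/* Runs at process exit: close the descriptor and remove the FIFO. */
static void cleanup_fifo(void)
{
    if (fifo_fd != -1) {
        close(fifo_fd);
        fifo_fd = -1;
    }
    if (fifo_path[0] != '\0')
        unlink(fifo_path);
}

/* Create (if needed) and open the FIFO, registering the cleanup once. */
int open_public_fifo(const char *path, int flags)
{
    static int registered;

    if (mkfifo(path, 0600) == -1 && errno != EEXIST)
        return -1;
    snprintf(fifo_path, sizeof fifo_path, "%s", path);
    if (!registered) {
        if (atexit(cleanup_fifo) != 0)
            return -1;
        registered = 1;
    }
    fifo_fd = open(path, flags);
    return fifo_fd;
}
```

The handler runs when the process exits normally (exit() or a return from main), not when it is killed by a signal.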
I'm writing a piece of software that does a single very long task. To allow interruption, we have added a check-pointing function that periodically (on the order of minutes) dumps an image of the program state to disk. This takes some time, however, so I would like to switch to a model where the checkpoints are written on a separate thread rather than blocking the primary worker. (Yes, I know I need to keep it thread-safe.)
As I see it, there are two primary methods of accomplishing this task:
For each checkpoint, I pthread_create() a thread which will execute the checkpointing function once and then terminate.
For each checkpoint, I pthread_cond_signal() a single waiting thread that executes the checkpointing function and then returns to waiting.
Both methods require making an atomic copy of my working state and passing it to the checkpoint thread, as well as ensuring that the checkpoint completes successfully before I try another.
My question is if there is a compelling reason to use one method over the other.
I would argue that pthreads are a bad fit for your requirements: regardless of whether you spawn a new thread for each backup or use a thread pool, you need to make a deep copy of your working set, which is expensive. Also, you may need extensive synchronization if you go with the thread pool.
Instead, there's a much easier way to do it: fork(). The child process inherits the entire memory space of the parent, but on modern OSs the copy is lazy (copy-on-write). Also, you don't need to worry about cleaning up the thread you started, because the fork()ed child releases its resources when it terminates. If your original program is already multithreaded, make sure to use only async-signal-safe functions in the child; thankfully write() is async-signal-safe (as are open() and unlink()).
To avoid your child turning into a zombie, call waitid(P_ALL, 0, infop, WEXITED | WNOHANG), where infop points to a siginfo_t, in a loop until it returns nonzero or the siginfo_t indicates that no child has exited yet. This avoids stalling the parent in case the child is not done with the backup before the next backup point is reached.
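The fork() approach might look roughly like the following. This is a sketch under assumptions: the state is treated as one flat buffer, and start_checkpoint and reap_checkpoints are hypothetical names. The child uses only open(), write(), close() and _exit().

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes a copy-on-write snapshot of `state` to `path`.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t start_checkpoint(const char *path, const void *state, size_t len)
{
    pid_t pid = fork();
    if (pid == 0) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1)
            _exit(1);
        const char *p = state;
        while (len > 0) {
            ssize_t n = write(fd, p, len);
            if (n <= 0)
                _exit(1);
            p += n;
            len -= (size_t)n;
        }
        close(fd);
        _exit(0);
    }
    return pid;
}

/* Reap every child that has already exited, without blocking.
 * Returns the number of children reaped. */
int reap_checkpoints(void)
{
    int reaped = 0;
    for (;;) {
        siginfo_t info;
        info.si_pid = 0;
        if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG) != 0)
            break;  /* error, or no children at all */
        if (info.si_pid == 0)
            break;  /* children exist, but none has exited yet */
        reaped++;
    }
    return reaped;
}
```

The parent could call reap_checkpoints() just before starting the next checkpoint. A result of 0 while a child is still outstanding means the previous dump has not finished yet.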
Don't go with continually creating/terminating/destroying/joining threads if you can possibly avoid it. It's expensive in terms of latency and cycles, has the risk of unwanted multiple threads doing overlapping work and is difficult to debug.
Just create one thread once, at app startup, and don't terminate it. Loop it around some synchronization object and signal it when you need to, or run a timer or sleep loop to perform your image dumps.
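A minimal sketch of the single long-lived thread, here waiting on a condition variable. All names are hypothetical, dump_image() is only a placeholder for the real checkpoint function, and the snapshot copy is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

static pthread_mutex_t ck_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t work_cv = PTHREAD_COND_INITIALIZER;  /* wakes the worker */
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;  /* wakes waiters */
static int requested, quit, completed;

static void dump_image(void) { /* placeholder: write the snapshot to disk */ }

/* Created once at startup; sleeps until a checkpoint is requested. */
void *checkpoint_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ck_mutex);
    for (;;) {
        while (!requested && !quit)
            pthread_cond_wait(&work_cv, &ck_mutex);
        if (!requested)
            break;                      /* quit was set and nothing is pending */
        requested = 0;
        pthread_mutex_unlock(&ck_mutex);
        dump_image();                   /* slow work done without the lock */
        pthread_mutex_lock(&ck_mutex);
        completed++;
        pthread_cond_broadcast(&done_cv);
    }
    pthread_mutex_unlock(&ck_mutex);
    return NULL;
}

void request_checkpoint(void)
{
    pthread_mutex_lock(&ck_mutex);
    requested = 1;
    pthread_cond_signal(&work_cv);
    pthread_mutex_unlock(&ck_mutex);
}

/* Block until at least n checkpoints have finished. */
void wait_for_checkpoints(int n)
{
    pthread_mutex_lock(&ck_mutex);
    while (completed < n)
        pthread_cond_wait(&done_cv, &ck_mutex);
    pthread_mutex_unlock(&ck_mutex);
}

void stop_checkpoint_thread(void)
{
    pthread_mutex_lock(&ck_mutex);
    quit = 1;
    pthread_cond_signal(&work_cv);
    pthread_mutex_unlock(&ck_mutex);
}
```

The worker calls wait_for_checkpoints() only if it must be sure the previous dump has completed before requesting another.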
I have a threaded server that can add/append/read files and relay data to the client.
If a file is being added, no other thread can append/read it. If a file is being appended, no other thread can append/read it. If a file is being read, no other thread can append to it. However, if a file is being read, other threads can read it.
Currently I have a mutex system that will do this, except it won't allow multiple reads.
To fix this, in the read method, I will change:
pthread_mutex_lock(&(fm->mutex));//LOCK
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
to
pthread_mutex_trylock(&(fm->mutex));//TRYLOCK [NonBlocking, so the thread can continue the read]
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
Question
How can I unlock the file without allowing the other methods (just append really) to begin writing to the file before all the other read()'s have finished?
Example
For example, if the reading thread that originally locked the file completes and unlocks the file and there are still other threads trying to read the file, then an appending thread gets the chance to lock the file and begin appending while the others are still reading, which is a no-no.
Idea
I want to keep a count of the number of threads currently reading a file. When a thread finishes, reduce the count. If the count is 0, meaning no threads are still reading, unlock the file. But, I'm worried that this would not be thread safe. If this is a viable solution, how could I make it thread safe? Another but, I believe only the original thread can successfully unlock the mutex.
It sounds like you may be looking for a read-write lock, which is provided by pthreads. It allows two modes of locking: a shared/read-lock mode, which can be locked by multiple threads at once, and an exclusive/write-lock mode, where the lock call won't return until all other threads (readers and writers) have given up their hold on the lock.
You could use a semaphore instead of the mutex (see this link about the differences). The semaphore does thread-safe synchronized counting for you.
You can live without an additional mutex to lock the file for writing if you limit the number of simultaneous read accesses to a (sufficiently large) number N and require a writer to take all N counts of the semaphore for write access. This way you can only gain write access when the number of readers is zero, and all other readers are locked out until your writer has finished.
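A sketch of the counting-semaphore idea with POSIX unnamed semaphores, assuming N = 64 (MAX_READERS, a hypothetical constant). POSIX has no call that takes N counts at once, so the writer loops. This sketch also adds a small mutex so that two writers cannot each grab part of the counts and deadlock, which is an addition to the scheme as described.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define MAX_READERS 64  /* hypothetical upper bound on simultaneous readers */

static sem_t slots;
/* Assumption: serializes writers while they collect all the slots. */
static pthread_mutex_t writer_gate = PTHREAD_MUTEX_INITIALIZER;

/* Take one slot, retrying if a signal interrupts the wait. */
static void take_slot(void)
{
    while (sem_wait(&slots) == -1 && errno == EINTR)
        ;
}

int file_sem_init(void) { return sem_init(&slots, 0, MAX_READERS); }

void read_begin(void) { take_slot(); }        /* a reader holds one slot */
void read_end(void)   { sem_post(&slots); }

/* A writer holds every slot, so no reader can be active. */
void write_begin(void)
{
    pthread_mutex_lock(&writer_gate);
    for (int i = 0; i < MAX_READERS; i++)
        take_slot();
    pthread_mutex_unlock(&writer_gate);
}

void write_end(void)
{
    for (int i = 0; i < MAX_READERS; i++)
        sem_post(&slots);
}
```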
Note that the POSIX documentation for pthread_mutex_lock() says:
If successful, the pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() functions shall return zero; otherwise, an error number shall be returned to indicate the error.
Since you don't show your code testing the return values, you don't know whether your lock operations (in particular) succeeded or not.
Separately, since you want a read/write lock, why not use one:
pthread_rwlock_rdlock()
pthread_rwlock_wrlock()
pthread_rwlock_unlock()
pthread_rwlock_init()
pthread_rwlock_destroy()
There are four pthread_rwlockattr_*() functions and a total of 9 pthread_rwlock_*() functions; I only listed the most important functions in the family.
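As a rough illustration of both points (checking return values and using a read/write lock), here is a sketch built around a hypothetical struct file_entry standing in for the question's fm object. A length field plays the role of the file contents.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-file record; `length` stands in for the file data. */
struct file_entry {
    pthread_rwlock_t lock;
    size_t length;
};

/* Every function returns 0 on success or the pthread error number. */
int file_entry_init(struct file_entry *fe)
{
    fe->length = 0;
    return pthread_rwlock_init(&fe->lock, NULL);
}

/* Shared mode: any number of readers may hold the lock at once. */
int file_entry_read(struct file_entry *fe, size_t *out)
{
    int rc = pthread_rwlock_rdlock(&fe->lock);
    if (rc != 0)
        return rc;
    *out = fe->length;
    return pthread_rwlock_unlock(&fe->lock);
}

/* Exclusive mode: waits until all readers and writers have released. */
int file_entry_append(struct file_entry *fe, size_t nbytes)
{
    int rc = pthread_rwlock_wrlock(&fe->lock);
    if (rc != 0)
        return rc;
    fe->length += nbytes;
    return pthread_rwlock_unlock(&fe->lock);
}
```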
I know that this isn't a "homework helper website", but I have been going insane for the last few days because I have to implement access to a resource while avoiding starvation, and I can't figure out how to do that. Can anyone help me with some application examples or documentation? The assignment is: a resource may be used by 2 types of processes: black and white. When the resource is used by the white processes, it cannot be used by the black processes and vice versa. Implement the access to the resource avoiding starvation. Is this a producer-consumer case?
Let's make a few assumptions (for the sake of discussion):
Our processes will be threads -- not actual software processes, there's a difference which may be important in your assignment.
White processes are Readers.
Black processes are Writers.
Our common resource is a particular variable.
Mutual exclusion locks (mutex):
A mutex is a type of exclusive lock, it has a binary state, it's either locked or unlocked. You can lock it, unlock it or check to see if it's locked or not.
Threads can lock each other out using mutex (mutual exclusion locks) just as processes can lock each other out using semaphores.
When you want to protect a variable from being used by two threads at once you create a mutex for that variable and write every thread so that it attempts to lock the mutex before attempting to use the variable and unlock it after they're done.
This makes any first thread lock the mutex and any subsequent thread block until the first thread unlocks the mutex basically forcing all of these threads to line up and operate on that particular variable sequentially.
This is a bit inefficient when you just want to read the variable, not change its value, because two threads reading the same content don't create any conflict or invalid data. Two threads writing at the same time might, however, corrupt the data.
Readers/Writers locks (RWL):
Most implementations of Readers/Writers locks will use a shared lock and an exclusive lock, but they expose a simple usage approach: if you want to read, grab a "read lock"; if you want to write, grab a "write lock".
"Read locks" are not exclusive and they allow multiple readers to be reading at one particular time (without blocking).
"Write locks" are exclusive and only one writer can be writing at one particular time (without blocking).
Starvation:
First step: with a Readers/Writers lock, a first (read) thread grabs a "read lock" on the variable; a second (write) thread then tries to grab a "write lock" but is blocked until all readers finish reading.
Second step: before the first thread finishes reading, a third (read) thread grabs a "read lock" on the variable; this means the second (write) thread has to wait for this third thread to finish.
Repeat the second step until starvation is achieved.
Avoiding starvation with Seqlock:
A seqlock is implemented with one mutex and some counters. It always allows reading, even while the writers are writing to the variable, but it gives the readers a means of checking whether the data was written to while it was being read. If so, the data may be corrupt, so the readers have to reread it and check for consistency again.
The "read & consistency check" phase runs in a loop until the check confirms consistency of the data, at which point the reader can continue with its usual task.
The writers use the mutex to grab exclusive access so they never overlap their operations.
This is good for high read low write situations. If there would be too many writers the readers would continuously loop rereading the data.
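A minimal seqlock sketch along the lines described above, protecting a hypothetical pair of integers. It assumes C11 atomics with the default sequentially consistent ordering, chosen for simplicity over speed. The payload fields are atomic so that concurrent reads and writes are not a data race.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>

struct seq_pair {
    pthread_mutex_t writer_mutex;  /* writers exclude each other */
    atomic_uint seq;               /* even: stable, odd: write in progress */
    atomic_int a, b;               /* the protected data */
};

void seq_init(struct seq_pair *s)
{
    pthread_mutex_init(&s->writer_mutex, NULL);
    atomic_init(&s->seq, 0u);
    atomic_init(&s->a, 0);
    atomic_init(&s->b, 0);
}

void seq_write(struct seq_pair *s, int a, int b)
{
    pthread_mutex_lock(&s->writer_mutex);
    atomic_fetch_add(&s->seq, 1u);  /* counter becomes odd */
    atomic_store(&s->a, a);
    atomic_store(&s->b, b);
    atomic_fetch_add(&s->seq, 1u);  /* counter becomes even again */
    pthread_mutex_unlock(&s->writer_mutex);
}

/* Readers never block; they retry if a write overlapped the read. */
void seq_read(struct seq_pair *s, int *a, int *b)
{
    unsigned before, after;
    do {
        before = atomic_load(&s->seq);
        *a = atomic_load(&s->a);
        *b = atomic_load(&s->b);
        after = atomic_load(&s->seq);
    } while ((before & 1u) != 0 || before != after);
}
```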
Your particular situation:
If black processes need to be able to share the resource among themselves and white processes need to be able to share the resource among themselves but white processes can't share the resource with black processes then the solution will not be either RWL or Seqlock.
A variation on the Seqlock algorithm might be your solution.
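For the case where each color shares the resource among its own kind, here is one possible sketch. It uses a mutex, a condition variable and a "turn" variable rather than a seqlock, so it swaps in a different technique from the one suggested above. All names are hypothetical. Once the other color is waiting, newcomers of the active color queue up, and the turn passes to the other color when the resource empties.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

enum color { WHITE = 0, BLACK = 1 };

struct color_lock {
    pthread_mutex_t m;
    pthread_cond_t cv;
    int active_color;  /* color currently using the resource */
    int active_count;  /* how many threads of that color are inside */
    int waiting[2];    /* threads of each color waiting to enter */
    int turn;          /* color preferred when the resource is idle */
};

void color_lock_init(struct color_lock *l)
{
    pthread_mutex_init(&l->m, NULL);
    pthread_cond_init(&l->cv, NULL);
    l->active_color = WHITE;
    l->active_count = 0;
    l->waiting[WHITE] = l->waiting[BLACK] = 0;
    l->turn = WHITE;
}

/* Caller holds l->m. */
static int may_enter(const struct color_lock *l, int c)
{
    int other = 1 - c;
    if (l->active_count == 0)  /* idle: yield to the other color on its turn */
        return l->waiting[other] == 0 || l->turn == c;
    /* busy: join only our own color, and only if nobody else is waiting */
    return l->active_color == c && l->waiting[other] == 0;
}

void color_enter(struct color_lock *l, enum color c)
{
    pthread_mutex_lock(&l->m);
    l->waiting[c]++;
    while (!may_enter(l, (int)c))
        pthread_cond_wait(&l->cv, &l->m);
    l->waiting[c]--;
    l->active_color = (int)c;
    l->active_count++;
    pthread_mutex_unlock(&l->m);
}

void color_exit(struct color_lock *l)
{
    pthread_mutex_lock(&l->m);
    if (--l->active_count == 0) {
        l->turn = 1 - l->active_color;  /* hand the next turn to the other color */
        pthread_cond_broadcast(&l->cv);
    }
    pthread_mutex_unlock(&l->m);
}
```

Under contention this admits only one thread per turn, so it trades throughput for fairness between the two colors. It does not guarantee any order among threads of the same color.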
Generally, this is a problem of controlling access to a shared resource (mutual exclusion, or mutex).
If you have two objects of the same class, both running as threads:
pseudo-code:
loop
    if shared_resource is free
        lock shared_resource
        do something
        free shared_resource
This in VERY broad terms!
I'm trying to understand how asynchronous file operations are emulated using threads. I've found next to nothing to read about the subject.
Is it possible that:
a process uses a thread to open a regular file (HDD).
the parent gets the file descriptor from the thread, now it may close the thread.
the parent uses the file descriptor with a new thread, reading X bytes from the file.
the parent gets the file descriptor with the seek-position of the current file state.
the parent may repeat these operations, without the need to open, or seek, every time it wishes to "continue" reading a new chunk of the file?
This is just a wild guess of mine; I would appreciate it if anybody could shed more light on how this is emulated efficiently.
UPDATE:
By efficient I actually mean that I don't want the thread to "wait" from the moment the file has been opened. Think of a non-blocking HTTP daemon which serves a client with a huge file: you want to use the thread to read chunks of the file without blocking the daemon, but you don't want to keep the thread busy while "waiting" for the actual transfer to take place; you want to use the thread for other blocking operations of other clients.
To understand asynchronous I/O better, it may be helpful to think in terms of overlapping operations. That is, the number of pending operations (operations that have been started but not yet completed) can simultaneously go above one.
A diagram that explains asynchronous I/O might look like this: http://msdn.microsoft.com/en-us/library/aa365683(VS.85).aspx
If you are using the asynchronous I/O capabilities provided by the underlying Operating System, then it is possible to asynchronously read from multiple files without spawning an equal number of threads.
If your underlying Operating System does not provide asynchronous I/O, or if you decide not to use it, in other words, you wish to emulate asynchronous operation by only using blocking I/O (the regular Read/Write provided by the Operating System), then it is necessary to spawn as many threads as the number of simultaneous I/O operations. This is because when a thread is making a function call to blocking I/O, the thread cannot continue its execution until the operation finishes. In order to start another blocking I/O operation, that operation has to be issued from another thread that is not already occupied.
When you open/create a file fire up a thread. Now store that thread id/ptr as your file handle.
Basically the thread will do nothing except sit in a loop waiting for an "event". A semaphore would be good here. When you want to do a read, you add the read command to a queue (remember to protect the queue insertion with a critical section), return a unique id, and then increment the semaphore. If the thread is asleep it will now wake up, grab the first message off the queue and process it. When it has completed, you remove the command from the queue.
To poll whether a file read has completed you can simply check to see if it's in the command queue. If it's not there, the command has completed.
Furthermore, if you want to allow synchronous reads as well, then after sending the message through you can wait for an "event" to get triggered by the completion. You then check to see if the unique id is in the queue, and if it isn't, you return control. If it still is, you go back to a wait state until the relevant unique id has been processed.
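A compact sketch of such a worker thread, with a semaphore counting queued commands and a mutex protecting the queue. It differs from the description in one labeled respect: instead of looking up a unique id in the queue, each request carries its own done flag. The struct and function names are hypothetical, and a request with fd = -1 is used as a stop sentinel.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* open() flags for callers */
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <unistd.h>

struct read_req {
    int fd; void *buf; size_t len; off_t off;  /* what to read */
    ssize_t result;                            /* pread() return value */
    int done;                                  /* set once completed */
    struct read_req *next;
};

static pthread_mutex_t qm = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static sem_t pending;                          /* number of queued commands */
static struct read_req *head, *tail;
static struct read_req stop_req = { .fd = -1 };

void submit_read(struct read_req *r)
{
    r->done = 0;
    r->next = NULL;
    pthread_mutex_lock(&qm);                   /* critical section: queue add */
    if (tail) tail->next = r; else head = r;
    tail = r;
    pthread_mutex_unlock(&qm);
    sem_post(&pending);                        /* wake the worker */
}

static void *io_thread(void *arg)
{
    (void)arg;
    for (;;) {
        while (sem_wait(&pending) == -1)
            ;                                  /* retry if interrupted */
        pthread_mutex_lock(&qm);
        struct read_req *r = head;
        head = r->next;
        if (!head) tail = NULL;
        pthread_mutex_unlock(&qm);
        if (r->fd < 0)
            return NULL;                       /* stop sentinel */
        ssize_t n = pread(r->fd, r->buf, r->len, r->off);  /* blocking read */
        pthread_mutex_lock(&qm);
        r->result = n;
        r->done = 1;
        pthread_cond_broadcast(&done_cv);
        pthread_mutex_unlock(&qm);
    }
}

int io_start(pthread_t *t)
{
    if (sem_init(&pending, 0, 0) != 0) return -1;
    return pthread_create(t, NULL, io_thread, NULL);
}

void io_stop(void) { submit_read(&stop_req); }

/* Poll: nonzero once the read has completed. */
int read_is_done(struct read_req *r)
{
    pthread_mutex_lock(&qm);
    int d = r->done;
    pthread_mutex_unlock(&qm);
    return d;
}

/* Synchronous variant: sleep until this request has been processed. */
void wait_read(struct read_req *r)
{
    pthread_mutex_lock(&qm);
    while (!r->done)
        pthread_cond_wait(&done_cv, &qm);
    pthread_mutex_unlock(&qm);
}
```

Using pread() with an explicit offset means the shared file position is never touched, so the caller can continue from any offset without seeking.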