Suppose there are two types of concurrent threads, let's say writer and reader (where the reader thread reads the different elements after they are written by the writer).
The writer has the following functions:
+create element (suppose there are 3 types of elements)
+increase element stock (it should be done separately after the creation phase)
The reader has the following function:
+Iterate over the whole database, reducing each element's stock by one unit until every inserted element has stock 0 (including the stock added during the increase-stock phase)
Each element has two variables:
+stock (integer)
+internal data (void pointer) --> can be used however the programmer wishes in order to achieve synchronization
In order to avoid race conditions, pthread_mutex and pthread_cond_wait functions are used.
My approach to solve this problem concurrently is the following:
write
pthread_mutex_lock(&mutex)
set_internal_data(element_id, 1)
create_element(element_id)
pthread_cond_signal(&inserted,&mutex)
pthread_mutex_unlock(&mutex)
pthread_mutex_lock(&mutex)
set_internal_data(element_id, 1)
get_stock(element_id, prev_element_stock)
update_stock(element_id, prev_element_stock+ element_stock)
pthread_cond_signal(&inserted,&mutex)
pthread_mutex_unlock(&mutex)
read
get_internal_data(element_id, element_internal_data)
while(element_internal_data)
pthread_cond_wait(&inserted,&mutex)
read operation
Note: every created element has 1 unit of stock. Before update_stock, the reader could reduce this element by one unit, but this would not imply that the element is deleted from the database.
My questions are:
1) Do you guys think this is the most efficient way to use the internal data variable in order to synchronize the operations?
2) The write operations are inside mutexes but the cond_wait operation is not inside a mutex. Would it be strictly necessary to have this cond_wait operation inside a mutex?
It is necessary that the read function hold a mutex if you want the thread to block until there is something to read, or if you intend to change the shared state in any way. The goal of pthread_cond_wait is to block a thread and release the mutex it holds until a specific condition is met (in your case, something is written), which is notified by using pthread_cond_signal or pthread_cond_broadcast; afterwards it will reacquire the mutex and go on with the read.
For the write operation I think you have typos in the function names, but the functions should look like:
pthread_mutex_lock(&mutex)
write_to_the_shared_state
pthread_cond_signal(&cond) / pthread_cond_broadcast(&cond)
pthread_mutex_unlock(&mutex)
and for the read
pthread_mutex_lock(&mutex)
while(!(data = try_to_read()))
pthread_cond_wait(&cond, &mutex)
pthread_mutex_unlock(&mutex)
return data
This does not include error checking. And if you want both write and read to be blocking, you will have to mix both examples above.
Related
How can I implement a binary semaphore using the POSIX counting semaphore API? I am using an unnamed semaphore and need to limit its count to 1. I believe I can't use a mutex because I need to be able to unlock from another thread.
If you actually want a semaphore that "absorbs" multiple posts without allowing multiple waits to succeed, and especially if you want to be strict about that, POSIX semaphores are not a good underlying primitive to use to implement it. The right set of primitives to implement it on top of is a mutex, a condition variable, and a bool protected by the mutex. When changing the bool from 0 to 1, you signal the condition variable.
With that said, what you're asking for is something of a smell; it inherently has ambiguous orderings. For example, if threads A and B both post the semaphore one after another, and threads X and Y are both just starting to wait, it's possible with your non-counting semaphore that either both waits succeed or that only one does, depending on the order of execution: ABXY or AXBY (or another comparable permutation). Thus, the pattern is likely erroneous unless either there's only one thread that could possibly post at any given time (in which case, why would it post more than once? maybe this is a non-issue) or the ability to post is controlled by holding some sort of lock (again, in which case why would it post more than once?). So if you don't have a design flaw here, it's likely that just using a counting semaphore, but not posting it more than once, gives the behavior you want.
If that's not the case, then there's probably some other data associated with the semaphore that's not properly synchronized, and you're trying to use the semaphore like a condition variable for it. If that's the case, just put a proper mutex and condition variable around it and use them, and forget the semaphore.
One comment for addressing your specific situation:
I believe I can't use a mutex because I need to be able to unlock from another thread.
This becomes a non-issue if you use a combination of mutex and condition variable, because you don't keep the mutex locked while working. Instead, the fact that the combined system is in-use is part of the state protected by the mutex (e.g. the above-mentioned bool) and any thread that can obtain the mutex can change it (to return it to a released state).
For example, we have 5 pieces of data. (Assume we have a lot of space, so different versions of the data will not overlap each other.)
DATA0, DATA1, DATA2, DATA3, DATA4.
We have 3 threads (fewer than 5) working on those data.
Thread 1, working on DATA1 (version 0), has accessed some data from both DATA0(version 0) and DATA2(version 0), and create DATA1(version 1).
Thread 2, working on DATA3 (version 0), has accessed some data from both DATA2(version 0) and DATA4(version 0), and create DATA3(version 1).
Thread 3, working on DATA2 (version 0), has accessed some data from both DATA1(version 0) and DATA3(version 0), and create DATA2(version 1).
Now, if thread 1 finishes first, it has several choices. It can work on DATA0 (to create DATA0 version 1), since DATA1 (version 0) and DATA4 (version 0) are available (assume DATA0 & DATA4 are neighbors). It can also work on DATA2 if it finds that both DATA1 (version 1) and DATA3 (version 1) are available, and create DATA2 (version 2).
The requirement is that the next version of a piece of data can be processed once its neighbor data is ready (one version lower).
At last, I want all threads to exit when all data arrive at version 10.
Question: How to implement this scheme using pthread library.
Note: I want to have data in different versions at the same time, so to create a barrier and make sure all data reach the same version is not an option.
Let's discuss the implementation. To have all versions (0~10) stored we would need 5*11*sizeof(data) space. Let us create two arrays of size 5 x 11. The first array is DATA, such that DATA[i][j] is the j-th version of data i. The second array is an 'Access Matrix' A; it denotes the state of an index, which can be:
Not started
In Progress
Completed
Algorithm: Each thread searches for an index [i][j] in the matrix such that indices [i-1][j-1] and [i+1][j-1] are 'Completed'. It sets A[i][j] to 'In Progress' while working on it. In case i=0, i-1 refers to n-1; if i=n-1, i+1 refers to 0 (like a circular queue). When all entries in the last column are 'Completed', the thread terminates. Otherwise it searches for new data which is not completed.
Using pthread library to realize this:
Important variables: mutex, conditional variables.
pthread_mutex_t mutex= PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t condvar= PTHREAD_COND_INITIALIZER;
mutex is a 'lock'. We use it when we need to make an operation atomic. An atomic operation here is one that is completed in a single step, without other threads' execution interleaving with it. 'condvar' is a condition variable. Using it, a thread can sleep until a condition is reached; when it is, the thread is woken up. This avoids busy waiting in a loop.
Here, our atomic operation is updating A. Reason: if threads update A simultaneously, it may lead to race conditions such as more than one thread working on the same data in parallel.
To realize this, we search and set A inside the lock. Once A is set, we release the lock and work on the data. But if no available data was found which could be worked on, we wait on the condition variable condvar. When we call wait on condvar, we also pass the mutex. The wait function releases the mutex lock and waits for the condition variable to be signaled. Once it is signaled, it reacquires the lock and proceeds with execution. While waiting, the thread is sleeping and hence does not waste CPU time.
Whenever a thread finishes working on a piece of data, it may make one or more other pieces workable. Hence after a thread finishes work, it signals all other threads to check for 'workable' data before continuing the algorithm. Pseudo code for this is as follows:
Read the comments and function names; they describe the workings of the pthread library in detail. When compiling with gcc, add the -lpthread flag; for further details of the library, the man pages of these functions are more than sufficient.
void *thread(void *arg)
{
    // Note: If several threads reach pthread_mutex_lock(&mutex) at once,
    // each will sleep until the lock is released and it can acquire it.
    pthread_mutex_lock(&mutex); // Do the searching inside the lock
    while(lastColumnNotDone){ // This avoids previously searched indices being updated
        // Search for a workable index
        if(found)
        {
            // A has been updated and the index set to 'In Progress', so there is
            // no need to hold the lock while working: release it so other threads can search.
            pthread_mutex_unlock(&mutex);
            // WORK ON DATA
            pthread_mutex_lock(&mutex); // Reacquire the lock to continue safely.
            pthread_cond_broadcast(&condvar); // Wake up all threads waiting on 'condvar'.
        }
        else // No workable data found
            pthread_cond_wait(&condvar, &mutex); // Releases the mutex while waiting and
                                                 // reacquires it once condvar is signaled.
    }
    pthread_mutex_unlock(&mutex);
    return arg;
}
The search, and the check in the while loop condition for whether all data is completed, can be optimized, but that is a different algorithms question. The key idea here is the use of the pthread library and the thread concept.
A is a shared access matrix. Do NOT update it outside the lock.
While checking anything with respect to A, such as searching for workable data or checking whether all data is done, the lock must be held. Otherwise A could be changed by a different thread at the same time another thread is reading it.
We acquire and release locks using the functions pthread_mutex_lock and pthread_mutex_unlock. Remember, these functions take a pointer to the mutex, not its value: it is a variable that needs to be accessed and updated.
Avoid holding the lock for long periods of time. That would make threads wait a long time for small accesses.
When calling wait, be sure that the lock is held. Wait unlocks the mutex passed as the second parameter for the duration of its wait. After receiving the wake-up signal, it tries to acquire the lock once again.
I'm working on a project using pthreads and I made my own implementation of a readers-writer lock, which has the following methods:
Lock for reader (several can read simultaneously)
Lock for writer (only one can write)
Unlock (for both reader/writer)
I've tested it and it works well. My problem is more logical:
in my program I want several threads to do some testing on numbers in a specific range, and if a number that answers my criteria is found, I want them to add it to a shared resource, which is an array.
If the number was already found by another thread and exists in the array, continue searching.
Here is a pseudo code for my algorithm:
X = lowest number to search, X' = highest number to search,
func = the test for the number, ARR = the array shared between the threads,
RWL_R = Lock for reader, RWL_W Lock for writer, RWL_U = Unlock.
FOR each NUM from X to X' DO:
IF func(NUM) = true DO:
FOR each cell of ARR DO:
RWL_R // lock for reader since i'm reading from ARR
IF ARR[cell] contains NUM DO:
RWL_U // unlock since no longer reading from ARR
skip to the next iteration of the first FOR loop
ENDIF
END FOR
RWL_U // unlock since no longer reading from ARR
////// PROBLEM HERE ////////
RWL_W // lock to write to array since for loop ended without finding the same NUM
ARR[cell] <- NUM
RWL_U // unlock since finished writing into array
END IF
END FOR
As you can see, the logic is fine until that little line I marked with the ugly caps "PROBLEM HERE". Inside that little gap between the reader unlock and the writer lock, a race condition might (and does) occur.
So i have 2 possible results:
(The Good)
Thread_A finds number N, locks the array for reading.
Thread_B finds the same number N, waiting to check the array but it's currently locked by Thread_A.
Thread_A finishes going through the array and number N is not there; it unlocks the lock and locks it again as a writer, adds N to the array, unlocks the lock and finishes its job.
Thread_B can now check the array; number N is there, so it skips to number N2 and the rest works how it's supposed to.
(The Bad)
Thread_A finds number N, locks the array for reading.
Thread_B finds the same number N, waiting to check the array but it's currently locked by Thread_A.
Thread_A finishes going through the array and the number N is not there, unlocks the lock.
Thread_B takes over the lock, locking it as a reader; it checks the array and number N is still not there (Thread_A hasn't added it yet).
Thread_B unlocks the lock.
Either Thread_A or Thread_B now locks the lock for writing, adding number N, unlocking the lock and finishes.
The thread that waited now locks the lock, adds the same number N, unlocks and finishes.
So I'm now trying to find the best logical way to fix the issue. I can only think of locking as a writer when checking the array and not unlocking until finished writing, or of creating a method that switches "atomically" from reader lock to writer lock, but that's kind of "cheating" and not using the readers-writer lock mechanism as it's supposed to be used.
What's a better logical way to use it here?
The two options you give are suboptimal in most scenarios. One prevents multiple readers from checking at the same time (and presumably, after a while, readers are much more common than writers), while the other is impossible: even if you switch "atomically" from reader lock to writer lock, two different readers could both determine a value X is not present and both request an upgrade to a write lock; the first one writes X then unlocks, while the second waits its turn then writes X again, and you end up violating the uniqueness invariant.
The real answer depends on the common usage patterns:
If writing is more likely to happen than reading in general (concurrent reading is uncommon), then just drop the reader/writer lock and use a simple mutex. Reader/writer locks often have overhead that isn't worth it if you aren't regularly using concurrent readers. And even when the overhead is trivial (sometimes it is; Windows' SRW locks used solely in "exclusive" (writer) mode, where recursive acquisition isn't needed, are as fast as or faster than critical sections), if the program logic becomes more complicated, or you need to constantly switch from read to write locks and back, the additional logic and acquisition/release overhead can cost more than just a single lock and unlock with exclusive access.
If you have other frequently used code which reads but never writes, while the readers that might need to write are less common and regularly do have to write, then use the solution you suggested: keep the reader/writer lock, but have the "might write" code lock for writing from the beginning. This eliminates concurrency in the "might write" code path, but "only read" users can still operate concurrently.
If reading is much more common than writing (which is the usual use case for something like this if the code provided is the only code that accesses the array; otherwise your array will grow out of control), then perform a double check after "upgrading" to write lock; perform the same scan a second time after upgrading to a write lock to make sure no one grabbed the write lock and added the missing value, then write if the value is still missing. This can be optimized to avoid rescanning if the array only has new values added, with existing values never changing or being deleted. You'd just remember where you left off checking, and scan any new entries added between your original scan and when you acquired the write lock.
The pseudo-code for #3 would look like:
FOR each NUM from X to X' DO:
IF func(NUM) = true DO:
RWL_R // lock for reader outside inner loop since repeatedly reading from ARR
cell = 0
WHILE cell < ARR.length DO:
IF ARR[cell] contains NUM DO:
RWL_U // unlock since no longer reading from ARR
skip to the next iteration of the first FOR loop
ENDIF
cell += 1
END WHILE
RWL_U // unlock since no longer reading from ARR
RWL_W // lock to write to array since for loop ended without finding the same NUM
// NEW!!! Check newly added cells
WHILE cell < ARR.length DO:
IF ARR[cell] contains NUM DO:
RWL_U // unlock the write lock since the value is already present
skip to the next iteration of the first FOR loop
ENDIF
cell += 1
END WHILE
// If we got here, not in newly added cells either, so add to end
ARR[cell] <- NUM
RWL_U // unlock since finished writing into array
END IF
END FOR
i can only think about locking as a writer when checking the array and not unlocking it until finishing writing
That would certainly be viable, but it would prevent any concurrency of the array scans. If those consume a significant portion of the program's run time, then making the scans concurrent is highly desirable.
or to create a method that switches "atomically" from reader lock to writer lock, but that's kind of "Cheating" and not using the "Readers Writer Lock" mechanism as it's supposed to be used.
That's not viable, because you cannot promote a read lock to a write lock while more than one thread holds the read lock. A write lock must exclude not just other writers but also all readers. You'll end up with deadlocks in the case you describe, because two or more threads holding the read lock need to promote it to a write lock, but that cannot happen for any of them until all of the others release the lock.
In any event, even if you allowed a write lock to coexist with read locks, that would not reliably prevent two threads considering the same number both scanning the array to its current end, not seeing the number, and, in turn, appending it to the array.
If you want to provide for concurrent array scans but prevent duplicates being added to your array then you need at least a bit more communication between threads.
One relatively simple approach would be to implement a simple transactional system. Your reader / writer lock would support promoting a read lock to a write lock while there are multiple readers, but only for reader locks obtained since the last time a writer lock was released. The thread successfully performing such a promotion could then be confident that the data were not modified while it was reading them, and it could therefore safely update the array. Promotion failure would be analogous to failure to commit a transaction -- the thread experiencing such a failure would need to rescan from the beginning.
That will work reasonably well if there are comparatively many appearances of each number, for then the likelihood of rescans will decline as the cost of rescans increases.
You could also consider more sophisticated mechanisms whereby threads that acquire a write lock somehow inform readers what value they are writing, so that any readers scanning for that value can abort early. That will be trickier to write, however, and although it sounds efficient, it might actually be slower under some circumstances as a result of requiring more synchronization among threads.
Instead of a single readers/writer lock, use two locks: one for reading and one for writing. Each thread would acquire the read-lock and scan the array. If the array should be modified, then the thread would acquire the write-lock and re-scan the array to ensure that the number hasn't already been added.
Here's the pseudo-code:
acquire_read_lock();
if (!is_value_in_array(value)) {
acquire_write_lock();
if (!is_value_in_array(value)) {
add_value_to_array(value);
}
release_write_lock();
}
release_read_lock();
Consider that we have three threads, a bool status_flag[500] array, and working situations as follows:
Two threads only write to the status_flag array, each at a different index, while the third thread only reads, at any index.
All three threads write at different indexes, while all three threads read at any index.
In the writing operation we are just setting the flag, never resetting it again.
status_flag [i] = true;
In the reading operation we are doing something like this:
for(;;){ // spinning to get flag true
    if(status_flag[i] == true){
        // do_something;
        break;
    }
}
What happens if the compiler optimizes the code (branch prediction)?
I have read a lot about locks but am still confused about what to conclude. Please help me conclude.
POSIX is quite clear on this:
Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it.
So without locking you're not allowed to read memory that some other thread may be writing. Furthermore, that part of POSIX describes which functions synchronize memory between the threads. Until both threads have called one of the functions listed there, you have no guarantee that the changes made by one thread will be visible to the other thread.
If all the threads are operating on different index values, then you do not need a lock; basically it is equivalent to using different variables.
In your code the value of the variable i is not set or modified, so each reader reads only one particular index of the flag array, and the writers use different indexes; in that case there is no need to use a lock.
I was reading a text on semaphores and their operations. The author emphasized that the wait() and post() operations of semaphores should be executed atomically, otherwise mutual exclusion of the threads may be violated. Can anybody, please explain to me what he means? I am new to multi-threading by the way
A context switch, where a task/process is replaced by another by the kernel, is asynchronous and nondeterministic.
Let's examine the following code:
x++;
Seems simple, huh?
However, this code is prone to synchronization errors if x is shared among different tasks/processes.
To understand that, you must understand the concept of atomic operation.
Atomic operations are instructions that the processor executes as a single, uninterruptible step.
They usually involve reading a register, writing a register, etc.
Back to the code example: what actually happens behind the scenes (in assembly) when incrementing a variable is that the CPU reads the value of the variable into a register, then increments it, and then saves it back to the original place it took it from (memory).
As you can see, a simple operation like this involves 3 CPU steps, and a context switch can occur between any of these steps.
Let's take an example of two threads that need to increment the same variable x.
Let's examine the pseudo-assembly of an imaginary (yet possible) scenario:
read the value to register (thread 1)
increment the value (thread 1)
CONTEXT SWITCH
read the value to register (thread 2)
increment the value (thread 2)
save the value (thread 2)
CONTEXT SWITCH
save the value (thread 1)
If x was 3, it should now be 5, but it will be 4: thread 1 saves back its stale value and overwrites thread 2's update.
Now, let's refer to your original question.
A semaphore/mutex is actually a variable: when a process wants to take it, it decrements it (wait), and it increments it again on release (post). If those read-modify-write steps are not atomic, they suffer exactly the problem shown above.
Yes, if the wait() and post() operations on a semaphore are not executed atomically, the mutual exclusion of threads can be violated.
For example, consider a semaphore with value S = 1, and processes P1 and P2 that try to execute wait() simultaneously, as below:
At times T0 and T1, processes P1 and P2 respectively find the value of the semaphore to be S = 1; each then decrements the semaphore to acquire the lock, and both enter the critical section simultaneously, violating the mutual exclusion of the threads.
To make wait() and post() atomic, spin-locking (looping until an internal lock is obtained) is the commonly advised approach.