I am not much into scheduling threads, i have like 4-5 threads and each of them will add data to one same buffer at random time.
How i can schedule the threads so there is no case two or more threads to access the buffer at same time ?
I am coding in C on Windows environment.
Thanks in advance.
The shared buffer needs to be protected from concurrent reads/writes by different threads. A synchronization object should be used to prevent this from occuring. Anytime a thread wants to read from or write to the shared buffer it would acquire the lock, perform its operations on the shared buffer and release the lock once it no longers requires the buffer.
An example synchronization object would be CriticalSection:
static CRITICAL_SECTION shared_buffer_lock;
static char shared_buffer[10];
int main()
{
InitializeCriticalSection(&shared_buffer_lock);
/* Start threads */
...
/* Wait for threads */
...
DeleteCriticalSection(&shared_buffer_lock);
return 0;
}
/* In thread.. */
/* Obtain sole access to 'shared_buffer' */
EnterCriticalSection(&shared_buffer_lock);
/* Use 'shared_buffer' ... */
/* Release sole access of 'shared_buffer' */
LeaveCriticalSection(&shared_buffer_lock);
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int sharedData=0;
void *func(void *parm)
{
int rc;
printf("Thread Entered\n");
pthread_mutex_lock(&mutex);
/********** Critical Section *******************/
printf("Start critical section, holding lock\n");
++sharedData;
printf("End critical section, release lock\n");
/********** Critical Section *******************/
pthread_mutex_unlock(&mutex);
}
The example above shows what you are looking for, using pthreads library. Acquire the mutex with pthread_mutex_lock and release it with pthread_mutex_unlock. All threads that request the same lock will be blocked until mutex is released. This guarantees that only one thread has access to your shared data.
You need to implement an exclusive access to the ressource (buffer). Under Windows I would use Mutexes for it (see CreateMutex and WaitForSingleObject/WaitForMultipleObjects in Windows API).
Related
I'm studying on condition variables of Pthread. When I'm reading the explanation of pthread_cond_signal, I see the following.
The pthread_cond_signal() function shall unblock at least one of
the
threads that are blocked on the specified condition variable cond (if
any threads are blocked on cond).
Till now I knew pthread_cond_signal() would make only one thread to wake up at a time. But, the quoted explanation says at least one. What does it mean? Can it make more than one thread wake up? If yes, why is there pthread_cond_broadcast()?
En passant, I wish the following code taken from UNIX Systems Programming book of Robbins is also related to my question. Is there any reason the author's pthread_cond_broadcast() usage instead of pthread_cond_signal() in waitbarrier function? As a minor point, why is !berror checking needed too as a part of the predicate? When I try both of them by changing, I cannot see any difference.
/*
The program implements a thread-safe barrier by using condition variables. The limit
variable specifies how many threads must arrive at the barrier (execute the
waitbarrier) before the threads are released from the barrier.
The count variable specifies how many threads are currently waiting at the barrier.
Both variables are declared with the static attribute to force access through
initbarrier and waitbarrier. If successful, the initbarrier and waitbarrier
functions return 0. If unsuccessful, these functions return a nonzero error code.
*/
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
static pthread_cond_t bcond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t bmutex = PTHREAD_MUTEX_INITIALIZER;
static int count = 0;
static int limit = 0;
int initbarrier(int n) { /* initialize the barrier to be size n */
int error;
if (error = pthread_mutex_lock(&bmutex)) /* couldn't lock, give up */
return error;
if (limit != 0) { /* barrier can only be initialized once */
pthread_mutex_unlock(&bmutex);
return EINVAL;
}
limit = n;
return pthread_mutex_unlock(&bmutex);
}
int waitbarrier(void) { /* wait at the barrier until all n threads arrive */
int berror = 0;
int error;
if (error = pthread_mutex_lock(&bmutex)) /* couldn't lock, give up */
return error;
if (limit <= 0) { /* make sure barrier initialized */
pthread_mutex_unlock(&bmutex);
return EINVAL;
}
count++;
while ((count < limit) && !berror)
berror = pthread_cond_wait(&bcond, &bmutex);
if (!berror) {
fprintf(stderr,"soner %d\n",
(int)pthread_self());
berror = pthread_cond_broadcast(&bcond); /* wake up everyone */
}
error = pthread_mutex_unlock(&bmutex);
if (berror)
return berror;
return error;
}
/* ARGSUSED */
static void *printthread(void *arg) {
fprintf(stderr,"This is the first print of thread %d\n",
(int)pthread_self());
waitbarrier();
fprintf(stderr,"This is the second print of thread %d\n",
(int)pthread_self());
return NULL;
}
int main(void) {
pthread_t t0,t1,t2;
if (initbarrier(3)) {
fprintf(stderr,"Error initilizing barrier\n");
return 1;
}
if (pthread_create(&t0,NULL,printthread,NULL))
fprintf(stderr,"Error creating thread 0.\n");
if (pthread_create(&t1,NULL,printthread,NULL))
fprintf(stderr,"Error creating thread 1.\n");
if (pthread_create(&t2,NULL,printthread,NULL))
fprintf(stderr,"Error creating thread 2.\n");
if (pthread_join(t0,NULL))
fprintf(stderr,"Error joining thread 0.\n");
if (pthread_join(t1,NULL))
fprintf(stderr,"Error joining thread 1.\n");
if (pthread_join(t2,NULL))
fprintf(stderr,"Error joining thread 2.\n");
fprintf(stderr,"All threads complete.\n");
return 0;
}
Due to spurious wake-ups pthread_cond_signal could wake up more than one thread.
Look for word "spurious" in pthread_cond_wait.c from glibc.
In waitbarrier it must wake up all threads when they all have arrived to that point, hence it uses pthread_cond_broadcast.
Can [pthread_cond_signal()] make more than one thread wake up?
That's not guaranteed. On some operating system, on some hardware platform, under some circumstances it could wake more than one thread. It is allowed to wake more than one thread because that gives the implementer more freedom to make it work in the most efficient way possible for any given hardware and OS.
It must wake at least one waiting thread, because otherwise, what would be the point of calling it?
But, if your applicaton needs a signal that is guaranteed to wake all of the waiting threads, then that is what pthread_cond_broadcast() is for.
Making efficient use of a multi-processor system is hard. https://www.e-reading.club/bookreader.php/134637/Herlihy,Shavit-_The_art_of_multiprocessor_programming.pdf
Most programming language and library standards allow similar freedoms in the behavior of multi-threaded programs, for the same reason: To allow programs to achieve high performance on a variety of different platforms.
I would like to know what's the best way to wake up a number of threads at once with a loop
I thinked about using a pthread_cond but I don't need a mutex since I'm not waiting for a ressources
Theres's n threads with :
start of thread
loop
action
wait
end of loop
And then the main thread wwill notice them all to start working.
Is it better to use a pthread cond with a mutex per thread or is it better to recreate the threads each time the task is finished ?
You can indeed use a condition variable for this. The way it works is that your condition is over some shared variable that the main thread sets to indicate that the worker threads should continue. You do require a mutex because you do have a shared resource - the shared variable that the main thread sets and the worker threads read.
Something like:
/* shared variables */
int run_iteration = -1;
pthread_mutex_t iteration_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t iteration_cond = PTHREAD_COND_INITIALIZER;
Worker threads:
int current_iteration = 0;
while (... )
{
/* Wait for main to signal start of current_iteration */
pthread_mutex_lock(&iteration_lock);
while (run_iteration < current_iteration)
pthread_cond_wait(&iteration_cond, &iteration_lock);
pthread_mutex_unlock(&iteration_lock);
/* ... Execute current_iteration ... */
current_iteration++;
}
Main thread:
/* signal start of next iteration */
pthread_mutex_lock(&iteration_lock);
run_iteration++;
pthread_cond_broadcast(&iteration_cond);
pthread_mutex_unlock(&iteration_lock);
Alternatively, if you need your threads to run in lockstep then a pthread barrier (pthread_barrier_t) could be used instead.
I've been working on a project lately, and i need to manage a pair of thread pools.
What the worker threads in the pools do is basically execute some kind of pop operation to each respective queue, eventually wait on a condition variable (pthread_cond_t) if there is no available value in the queue, and once they get an item, parse it and execute operations accordingly.
What i'm concerned about is the fact that i want to have no memory leaks, and to achieve that i noticed that calling a pthread_cancel on each thread when the main process is exiting is definitely a bad idea, as it leaves a lot of garbage around.
The point is, my first thought was to use a exit flag which i can set when the threads need to exit, so that they can easily free memory and call a pthread_exit...
I guess i should set this flag, then send a broadcast signal to the threads waiting on the condition variable and check the flag right after the pop operation...
Is this really the correct way to implement a good thread pool termination? I don't feel that much confident about this...
I'm writing some pseudo-code here to explain what i'm talking about
Each pool thread will run some code structured like this:
/* Worker thread (which will run on each pool thread) */
{ /* Thread initialization */ ... }
loop {
value = pop();
{ /* Mutex lock because of the shared flag */ ... }
if (flag) {{ /* Free memory and unlock mutex */ ... } pthread_exit(); }
{ /* Unlock the mutex */ ... }
{ /* Elaborate value */ ... }
}
return NULL;
And there will be some kind of pool_stopRunning() function which will look like:
/* pool_stopRunning() function code */
{ /* Acquire flag mutex */ ... }
setFlag(flag);
{ /* Unlock flag mutex */ ... }
{ /* Acquire queue mutex */ ... }
pthread_cond_broadcast(...);
{ /* Unlock queue mutex */ ... }
Thanks in advance, i just need to be sure that there isn't a fancy-er way to stop a thread pool... (or get to know a better way, by any chance)
As always, i'm sorry if there is any typo, i'm not and english speaker and it's kind of late right now >:
What you are describing will work, but I would suggest a different approach...
You already have a mechanism for assigning tasks to threads, complete with all appropriate synchronization. So instead of complicating the design with some new parallel mechanism, just define a new type of task called "STOP". If there are N threads servicing a queue and you want to terminate them, push N STOP tasks onto the queue. Then just wait for all of the threads to terminate. (This last can be done via "join", so it should not require any new mechanism, either.)
No muss, no fuss.
With respect to symmetry with setting the flag and reducing serialization, this code:
{ /* Mutex lock because of the shared flag */ ... }
if (flag) {{ /* Free memory and unlock mutex */ ... } pthread_exit(); }
{ /* Unlock the mutex */ ... }
should look like this:
{ /* Mutex lock because of the shared flag */ ... }
flagcopy = readFlag();
{ /* Unlock the mutex */ ... }
if (flagcopy) {{ /* Free memory ... } pthread_exit(); }
Having said that, you can (should?) factor the mutex code into the setFlag and readFlag methods.
There is one more thing. If the flag is only a boolean and it is only changed once before the whole thing shuts down (i.e., it's never unset after being set), then I would argue that protecting the read with a mutex is not required.
I say this because if the above assumptions are true and if the loop's duration is very short and the loop iteration frequency is high, then you would be imposing undue serialization upon the business task and potentially increasing the response time unacceptably.
I am beginner to SO, so please let me know if the question is not clear.
I am using two threads for example A and B. And i have a global variable 'p'.
Thread A is while looping and incrementing the value of 'p'.At the same time B is trying to set the 'p' with some other value(both are two different thread functions).
If I am using mutex for synchronizations, and the thread A get the mutex and incrementation the 'p' in a while loop,but it does not release the mutex.
So my question is that if the thread A doesn’t release the mutex can the thread B access the variable 'p'??
EDIT
The thread B is also protected accses to 'p' using mutex.
If the thread A lock using pthread_mutex_lock(), and doesn’t release it , then what happen if the same thread(A) try to access the lock again(remember the thread A is while looping)
For example
while(1)
{
pthread_mutex_lock(&mutex);
p = 10;
}
Is there any problem with this code if the mutex is never released?
You can still access the variable in thread B as the mutex is a separate object not connected to the variable. If You call mutex lock from thread B before accessing p then the thread B will wait for mutex to be released. In fact the thread A will only execute loop body once as it will wait for the mutex to be released before it can lock it again.
If You don't unlock the mutex then any call to lock the same mutex will wait indefinitely, but the variable will be writable.
In Your example access to variable p is what is called a critical section, or the part of code that is between mutex lock and mutex release.
There is no restriction on mutex, you need to write your program to following the rules of using mutex.
Here is the basic steps to use mutex on shared resource:
Acquire lock first
do job (increase for A, set value for B)
Release lock,
If both A & B follow the rules, then B can't modify it, while A keeps the lock.
Or, if your thread B don't acquire the lock first, it of cause could modify the variable, but that would be a bug for concurrent programming.
And, by the way, you can also use condition together with mutex, so that you can let threads wait & notify each other, instead of looping all the time which is a waste of machine resource.
For your updated question
On linux, in c, there are mainly 3 methods to acquire lock of mutex, what happens when a thread can't get the lock depends on which methods u use.
int pthread_mutex_lock(pthread_mutex_t * mutex );
if it's already locked by another thread, then it block until the lock is unlocked,
int pthread_mutex_trylock(pthread_mutex_t * mutex );
similar to pthread_mutex_lock(), but it won't block, instead return error EBUSY,
int pthread_mutex_timedlock(pthread_mutex_t *restrict mutex, const struct timespec *restrict abs_timeout);
similar to pthread_mutex_lock(), but it will wait for a timeout before return error ETIMEDOUT,
Simple example of statically initialized mutex
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
static int p = 0;
static pthread_mutex_t locker = PTHREAD_MUTEX_INITIALIZER;
static void *
threadFunc(void *arg)
{
int err;
err = pthread_mutex_lock(&locker);
if (err != 0){
perror("pthread_mutex_lock failed");
exit(1);
}
p++;
err = pthread_mutex_unlock(&locker);
if (err != 0){
perror("pthread_mutex_unlock failed");
exit(1);
}
return NULL;
}
int
main(int argc, char *argv[])
{
pthread_t A, B;
pthread_create(&A, NULL, threadFunc, NULL);
pthread_create(&B, NULL, threadFunc, NULL);
pthread_join(A, NULL);
pthread_join(B, NULL);
printf("p = %d\n", p);
return 0;
}
Error checking in main is omitted for brevity but should be used. If you do not release mutex program will never finish, thread B will never get lock.
Consider the following test program:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <strings.h>
#include <unistd.h>
#include <signal.h>
#include <pthread.h>
pthread_mutex_t mutex;
pthread_mutexattr_t mattr;
pthread_t thread1;
pthread_t thread2;
pthread_t thread3;
void mutex_force_unlock(pthread_mutex_t *mutex, pthread_mutexattr_t *mattr)
{
int e;
e = pthread_mutex_destroy(mutex);
printf("mfu: %s\n", strerror(e));
e = pthread_mutex_init(mutex, mattr);
printf("mfu: %s\n", strerror(e));
}
void *thread(void *d)
{
int e;
e = pthread_mutex_trylock(&mutex);
if (e != 0)
{
printf("thr: %s\n", strerror(e));
mutex_force_unlock(&mutex, &mattr);
e = pthread_mutex_unlock(&mutex);
printf("thr: %s\n", strerror(e));
if (e != 0) pthread_exit(NULL);
e = pthread_mutex_lock(&mutex);
printf("thr: %s\n", strerror(e));
}
pthread_exit(NULL);
}
void * thread_deadtest(void *d)
{
int e;
e = pthread_mutex_lock(&mutex);
printf("thr2: %s\n", strerror(e));
e = pthread_mutex_lock(&mutex);
printf("thr2: %s\n", strerror(e));
pthread_exit(NULL);
}
int main(void)
{
/* Setup */
pthread_mutexattr_init(&mattr);
pthread_mutexattr_settype(&mattr, PTHREAD_MUTEX_ERRORCHECK);
//pthread_mutexattr_settype(&mattr, PTHREAD_MUTEX_NORMAL);
pthread_mutex_init(&mutex, &mattr);
/* Test */
pthread_create(&thread1, NULL, &thread, NULL);
pthread_join(thread1, NULL);
if (pthread_kill(thread1, 0) != 0) printf("Thread 1 has died.\n");
pthread_create(&thread2, NULL, &thread, NULL);
pthread_join(thread2, NULL);
pthread_create(&thread3, NULL, &thread_deadtest, NULL);
pthread_join(thread3, NULL);
return(0);
}
Now when this program runs, I get the following output:
Thread 1 has died.
thr: Device busy
mfu: Device busy
mfu: No error: 0
thr: Operation not permitted
thr2: No error: 0
thr2: Resource deadlock avoided
Now I know this has been asked a number of times before, but is there any way to forcefully unlock a mutex? It seems the implementation will only allow the mutex to be unlocked by the thread that locked it as it seems to actively check, even with a normal mutex type.
Why am I doing this? It has to do with coding a bullet-proof network server that has the ability to recover from most errors, including ones where the thread terminates unexpectedly. At this point, I can see no way of unlocking a mutex from a thread that is different than the one that locked it. So the way that I see it is that I have a few options:
Abandon the mutex and create a new one. This is the undesirable option as it creates a memory leak.
Close all network ports and restart the server.
Go into the kernel internals and release the mutex there bypassing the error checking.
I have asked this before but, the powers that be absolutely want this functionality and they will not take no for an answer (I've already tried), so I'm kinda stuck with this. I didn't design it this way, and I would really like to shoot the person who did, but that's not an option either.
And before someone says anything, my usage of pthread_kill is legal under POSIX...I checked.
I forgot to mention, this is FreeBSD 9.3 that we are working with.
Use a robust mutex, and if the locking thread dies, fix the mutex with pthread_mutex_consistent().
If mutex is a robust mutex in an inconsistent state, the
pthread_mutex_consistent() function can be used to mark the state
protected by the mutex referenced by mutex as consistent again.
If an owner of a robust mutex terminates while holding the mutex, the
mutex becomes inconsistent and the next thread that acquires the mutex
lock shall be notified of the state by the return value [EOWNERDEAD].
In this case, the mutex does not become normally usable again until
the state is marked consistent.
If the thread which acquired the mutex lock with the return value
[EOWNERDEAD] terminates before calling either
pthread_mutex_consistent() or pthread_mutex_unlock(), the next thread
that acquires the mutex lock shall be notified about the state of the
mutex by the return value [EOWNERDEAD].
Well, you cannot do what you ask wit a normal pthread mutex, since, as you say, you can only unlock a mutex from the thread that locked it.
What you can do is wrap locking/unlocking of a mutex such that you have a pthread cancel handler that unlocks the mutex if the thread terminates. To give you an idea:
void cancel_unlock_handler(void *p)
{
pthread_mutex_unlock(p);
}
int my_pthread_mutex_lock(pthread_mutex_t *m)
{
int rc;
pthread_cleanup_push(cancel_unlock_handler, m);
rc = pthread_mutex_lock(&m);
if (rc != 0) {
pthread_cleanup_pop(0);
}
return rc;
}
int my_pthread_mutex_unlock(pthread_mutex_t *m)
{
pthread_cleanup_pop(0);
return pthread_mutex_unlock(&m);
}
Now you'll need to use the my_pthread_mutex_lock/my_pthread_mutex_unlock instead of the pthread lock/unlock functions.
Now, threads don't really terminate "unexpectedly", either it calls pthread_exit or it ends, or you pthread_kill it, in which case the above will suffice (also note that threads exit only at certain cancellation points, so there's no race conditions e.g.between pushing the cleanup handler and locking the mutex) , but logical error or undefined behavior might leave erroneous state affecting the whole process, and you're better off re-starting the whole process.
I have come up with a workable method to deal with this situation. As I mentioned before, FreeBSD does not support robust mutexes so that option is out. Also one a thread has locked a mutex, it cannot be unlocked by any means.
So what I have done to solve the problem is to abandon the mutex and place its pointer onto a list. Since the lock wrapper code uses pthread_mutex_trylock and then relinquishes the CPU if it fails, no thread can get stuck on waiting for a permanently locked mutex. In the case of a robust mutex, the thread locking the mutex will be able recover it if it gets EOWNERDEAD as the return code.
Here's some things that are defined:
/* Checks to see if we have access to robust mutexes. */
#ifndef PTHREAD_MUTEX_ROBUST
#define TSRA__ALTERNATE
#define TSRA_MAX_MUTEXABANDON TSRA_MAX_MUTEX * 4
#endif
/* Mutex: Mutex Data Table Datatype */
typedef struct mutex_lock_table_tag__ mutexlock_t;
struct mutex_lock_table_tag__
{
pthread_mutex_t *mutex; /* PThread Mutex */
tsra_daclbk audcallbk; /* Audit Callback Function Pointer */
tsra_daclbk reicallbk; /* Reinit Callback Function Pointer */
int acbkstat; /* Audit Callback Status */
int rcbkstat; /* Reinit Callback Status */
pthread_t owner; /* Owner TID */
#ifdef TSRA__OVERRIDE
tsra_clnup_t *cleanup; /* PThread Cleanup */
#endif
};
/* ******** ******** Global Variables */
pthread_rwlock_t tab_lock; /* RW lock for mutex table */
pthread_mutexattr_t mtx_attrib; /* Mutex attributes */
mutexlock_t *mutex_table; /* Mutex Table */
int tabsizeentry; /* Table Size (Entries) */
int tabsizebyte; /* Table Size (Bytes) */
int initialized = 0; /* Modules Initialized 0=no, 1=yes */
#ifdef TSRA__ALTERNATE
pthread_mutex_t *mutex_abandon[TSRA_MAX_MUTEXABANDON];
pthread_mutex_t mtx_abandon; /* Abandoned Mutex Lock */
int mtx_abandon_count; /* Abandoned Mutex Count */
int mtx_abandon_init = 0; /* Initialization Flag */
#endif
pthread_mutex_t mtx_recover; /* Mutex Recovery Lock */
And here's some code for the lock recovery:
/* Attempts to recover a broken mutex. */
int tsra_mutex_recover(int lockid, pthread_t tid)
{
int result;
/* Check Prerequisites */
if (initialized == 0) return(EDOOFUS);
if (lockid < 0 || lockid >= tabsizeentry) return(EINVAL);
/* Check Mutex Owner */
result = pthread_equal(tid, mutex_table[lockid].owner);
if (result != 0) return(0);
/* Lock Recovery Mutex */
result = pthread_mutex_lock(&mtx_recover);
if (result != 0) return(result);
/* Check Mutex Owner, Again */
result = pthread_equal(tid, mutex_table[lockid].owner);
if (result != 0)
{
pthread_mutex_unlock(&mtx_recover);
return(0);
}
/* Unless the system supports robust mutexes, there is
really no way to recover a mutex that is being held
by a thread that has terminated. At least in FreeBSD,
trying to destory a mutex that is held will result
in EBUSY. Trying to overwrite a held mutex results
in a memory fault and core dump. The only way to
recover is to abandon the mutex and create a new one. */
#ifdef TSRA__ALTERNATE /* Abandon Mutex */
pthread_mutex_t *ptr;
/* Too many abandoned mutexes? */
if (mtx_abandon_count >= TSRA_MAX_MUTEXABANDON)
{
result = TSRA_PROGRAM_ABORT;
goto error_1;
}
/* Get a read lock on the mutex table. */
result = pthread_rwlock_rdlock(&tab_lock);
if (result != 0) goto error_1;
/* Perform associated data audit. */
if (mutex_table[lockid].acbkstat != 0)
{
result = mutex_table[lockid].audcallbk();
if (result != 0)
{
result = TSRA_PROGRAM_ABORT;
goto error_2;
}
}
/* Allocate New Mutex */
ptr = malloc(sizeof(pthread_mutex_t));
if (ptr == NULL)
{
result = errno;
goto error_2;
}
/* Init new mutex and abandon the old one. */
result = pthread_mutex_init(ptr, &mtx_attrib);
if (result != 0) goto error_3;
mutex_abandon[mtx_abandon_count] = mutex_table[lockid].mutex;
mutex_abandon[mtx_abandon_count] = mutex_table[lockid].mutex;
mtx_abandon_count++;
mutex_table[lockid].mutex = ptr;
#else /* Recover Mutex */
/* Try locking the mutex and see what we get. */
result = pthread_mutex_trylock(mutex_table[lockid].mutex);
switch (result)
{
case 0: /* No error, unlock and return */
pthread_unlock_mutex(mutex_table[lockid].mutex);
return(0);
break;
case EBUSY: /* No error, return */
return(0);
break;
case EOWNERDEAD: /* Error, try to recover mutex. */
if (mutex_table[lockid].acbkstat != 0)
{
result = mutex_table[lockid].audcallbk();
if (result != 0)
{
if (mutex_table[lockid].rcbkstat != 0)
{
result = mutex_table[lockid].reicallbk();
if (result != 0)
{
result = TSRA_PROGRAM_ABORT;
goto error_2;
}
}
else
{
result = TSRA_PROGRAM_ABORT;
goto error_2;
}
}
}
else
{
result = TSRA_PROGRAM_ABORT;
goto error_2;
}
break;
case EDEADLK: /* Error, deadlock avoided, abort */
case ENOTRECOVERABLE: /* Error, recovery failed, abort */
/* NOTE: We shouldn't get this, but if we do... */
abort();
break;
default:
/* Ambiguous situation, best to abort. */
abort();
break;
}
pthread_mutex_consistant(mutex_table[lockid].mutex);
pthread_mutex_unlock(mutex_table[lockid].mutex);
#endif
/* Housekeeping */
mutex_table[lockid].owner = pthread_self();
pthread_mutex_unlock(&mtx_recover);
/* Return */
return(0);
/* We only get here on errors. */
#ifdef TSRA__ALTERNATE
error_3:
free(ptr);
error_2:
pthread_rwlock_unlock(&tab_lock);
#else
error_2:
pthread_mutex_unlock(mutex_table[lockid].mutex);
#endif
error_1:
pthread_mutex_unlock(&mtx_recover);
return(result);
}
Because FreeBSD is an evolving operating system like Linux is, I have made provisions to allow for the use of robust mutexes in the future. Since without robust mutexes, there really is no way to do enhanced error checking which is available if robust mutexes are supported.
For a robust mutex, enhanced error checking is performed to verify the need to recover the mutex. For systems that do not support robust mutexes, we have to trust the caller to verify that the mutex in question needs to be recovered. Besides, there is some checking to make sure that there is only one thread performing the recovery. All other threads blocking on the mutex are blocked. I have given some thought about how to signal other threads that a recovery is in progress, so that aspect of the routine still needs work. In a recovery situation, I'm thinking about comparing pointer values to see if the mutex was replaced.
In both cases, an audit routine can be set as a callback function. The purpose of the audit routine is to verify and correct any data discrepancies in the protected data. If the audit fails to correct the data, then another callback routine, the data reinitialize routine, is invoked. The purpose of this is to reinitialize the data that is protected by the mutex. If that fail, then abort() is called to terminate program execution and drop a core file for debugging purposes.
For the abandoned mutex case, the pointer is not thrown away, but is placed on a list. If too many mutexes are abandoned, then the program is aborted. As mentioned above, in the mutex lock routine, pthread_mutex_trylock is used instead of pthread_mutex_lock. This way, no thread can be permanently blocked on a dead mutex. So once the pointer is switched in the mutex table to point to the new mutex, all threads waiting on the mutex will immediately switch to the new mutex.
I am sure there are bugs/errors in this code, but this is a work in progress. Although not quite finished and debugged, I feel that there is enough here to warrant an answer to this question.
Well as you probably aware, a thread which locks a mutex, has the sole ownership of that resource. So it has got all the rights to unlock it. There is no way, atleast till now, to force a thread, give up its resource, without having to do a round about way, that you had did in your code.
However, this would be my approach.
Have a single thread, that owns a mutex, called as Resource thread. Make sure that, this thread receives & responds events to other worker thread.
When a worker thread, wanna enter into critical section, it registers with Resource thread to lock a mutex on it's behalf. When done, the worker thread assumes that, it has got exclusive access to critical section. The assumption is valid because, any other worker thread, which needs to get access to critical section, has to go through the same step.
Now assume that, there is another thread, who wants to force the former worker thread, to unlock, then he can make a special call, maybe a flag or with high priority thread to grant access. The resource thread, on comparing the flag / priority of the requesting thread, will unlock the mutex and lock again for the requesting thread.
I don't know for sure your use-case fully, but just my 2 cents. If you like it, don't forget vote my answer.
You could restart just the process with the crashed thread using function from the exec family to change the process image. I assume that it will be faster to reload the process than to reboot the sever.