This is an algorithm that does not use OS synchronization primitives until two or more threads actually contend for the critical section. Even for recursive "locks" by the same thread, no real lock is taken until a second thread gets involved.
http://home.comcast.net/~pjbishop/Dave/QRL-OpLocks-BiasedLocking.pdf
There are two functions:
int qrlgeneric_acquire(qrlgeneric_lock *L, int id);
void qrlgeneric_release(qrlgeneric_lock *L, int acquiredquickly);
qrlgeneric_acquire: called when a thread wants to lock; id is the thread's id
qrlgeneric_release: called when the thread wants to unlock
Example:
Thread_1, which already holds the lock, calls qrlgeneric_acquire again, so a recursive lock will be performed. At the same time, Thread_2 calls qrlgeneric_acquire, so there is contention (two threads want to lock, and a real OS sync primitive will be used).
Thread_1 will reach this condition on line 4.
04 if (BIASED(id) == status) // SO: this means this thread already has this lock
05 {
06 L->lockword.h.quicklock = 1;
07 if (BIASED(id) == HIGHWORD(L->lockword.data))
08 return 1;
09 L->lockword.h.quicklock = 0; /* I didn’t get the lock, so be sure */
10 /* not to block the process that did */
11 }
Thread_2 will reach this condition on line 35. (CAS is the compare-and-swap atomic operation.)
34 unsigned short biaslock = L->lockword.h.quicklock;
35 if (CAS(&L->lockword,
36 MAKEDWORD(biaslock, status),
37 MAKEDWORD(biaslock, REVOKED)))
38 {
39 /* I’m the revoker. Set up the default lock. */
40 /* *** INITIALIZE AND ACQUIRE THE DEFAULT LOCK HERE *** */
41 /* Note: this is an uncontended acquire, so it */
42 /* can be done without use of atomics if this is */
43 /* desirable. */
44 L->lockword.h.status = DEFAULT;
45
46 /* Wait until quicklock is free */
47 while (LOWWORD(L->lockword.data))
48 ;
49 return 0; /* And then it’s mine */
50 }
From the comments on lines 9 and 47, you can see that the statement at line 9 exists to support the wait on line 47, so that Thread_2 doesn't spin there forever.
QUESTION: It seems from those comments that the two conditions above should never both succeed; otherwise Thread_2 would spin forever on line 47, because the statement on line 9 would never be executed. THE PROBLEM is that I need help understanding why it can never happen that both succeed, because I still think it can:
1. Thread_1: 06 L->lockword.h.quicklock = 1;
2. Thread_2: 34 unsigned short biaslock = L->lockword.h.quicklock;
3. Thread_1: 07 if (BIASED(id) == HIGHWORD(L->lockword.data))
4. Thread_2: 35 if (CAS(&L->lockword, MAKEDWORD(biaslock, status), MAKEDWORD(biaslock, REVOKED)))
The condition in step 3 will succeed, because Thread_2 hasn't changed anything yet.
The CAS in step 4 will also succeed, because steps 1 and 3 didn't affect it.
The result, I think, is that both can succeed, which means Thread_2 will spin on line 47 until Thread_1 releases the lock. That seems clearly wrong and shouldn't happen, so I probably don't understand it. Can anyone help?
Whole algorithm:
/* statuses for qrl locks */
#define BIASED(id) ((int)(id) << 2)
#define NEUTRAL 1
#define DEFAULT 2
#define REVOKED 3
#define ISBIASED(status) (0 == ((status) & 3))
/* word manipulation (big-endian versions shown here) */
#define MAKEDWORD(low, high) (((unsigned int)(low) << 16) | (high))
#define HIGHWORD(dword) ((unsigned short)dword)
#define LOWWORD(dword) ((unsigned short)(((unsigned int)(dword)) >> 16))
typedef volatile struct tag_qrlgeneric_lock
{
volatile union
{
volatile struct
{
volatile short quicklock;
volatile short status;
}
h;
volatile int data;
}
lockword;
/* *** PLUS WHATEVER FIELDS ARE NEEDED FOR THE DEFAULT LOCK *** */
}
qrlgeneric_lock;
int qrlgeneric_acquire(qrlgeneric_lock *L, int id)
{
int status = L->lockword.h.status;
/* If the lock’s mine, I can reenter by just setting a flag */
if (BIASED(id) == status)
{
L->lockword.h.quicklock = 1;
if (BIASED(id) == HIGHWORD(L->lockword.data))
return 1;
L->lockword.h.quicklock = 0; /* I didn’t get the lock, so be sure */
/* not to block the process that did */
}
if (DEFAULT != status)
{
/* If the lock is unowned, try to claim it */
if (NEUTRAL == status)
{
if (CAS(&L->lockword, /* By definition, if we saw */
MAKEDWORD(0, NEUTRAL), /* neutral, the lock is unheld */
MAKEDWORD(1, BIASED(id))))
{
return 1;
}
/* If I didn’t bias the lock to me, someone else just grabbed
it. Fall through to the revocation code */
status = L->lockword.h.status; /* resample */
}
/* If someone else owns the lock, revoke them */
if (ISBIASED(status))
{
do
{
unsigned short biaslock = L->lockword.h.quicklock;
if (CAS(&L->lockword,
MAKEDWORD(biaslock, status),
MAKEDWORD(biaslock, REVOKED)))
{
/* I’m the revoker. Set up the default lock. */
/* *** INITIALIZE AND ACQUIRE THE DEFAULT LOCK HERE *** */
/* Note: this is an uncontended acquire, so it */
/* can be done without use of atomics if this is */
/* desirable. */
L->lockword.h.status = DEFAULT;
/* Wait until quicklock is free */
while (LOWWORD(L->lockword.data))
;
return 0; /* And then it’s mine */
}
/* The CAS could have failed and we got here for either of
two reasons. First, another process could have done the
revoking; in this case we need to fall through to the
default path once the other process is finished revoking.
Secondly, the bias process could have acquired or released
the biaslock field; in this case we need merely retry. */
status = L->lockword.h.status;
}
while (ISBIASED(L->lockword.h.status));
}
/* If I get here, the lock has been revoked by someone other
than me. Wait until they’re done revoking, then fall through
to the default code. */
while (DEFAULT != L->lockword.h.status)
;
}
/* Regular default lock from here on */
assert(DEFAULT == L->lockword.h.status);
/* *** DO NORMAL (CONTENDED) DEFAULT LOCK ACQUIRE FUNCTION HERE *** */
return 0;
}
void qrlgeneric_release(qrlgeneric_lock *L, int acquiredquickly)
{
if (acquiredquickly)
L->lockword.h.quicklock = 0;
else
{
/* *** DO NORMAL DEFAULT LOCK RELEASE FUNCTION HERE *** */
}
}
I'm writing an extremely simple program to demonstrate a Pthreads implementation I ported from C++ back to C.
I create two lock-step threads and give them two jobs
One increments a1 once per step
One decrements a2 once per step
During the synchronized phase (when the mutexes are locked for both t1 and t2) I compare a1 and a2 to see if we should stop stepping.
I'm wondering if I'm going crazy here, because not only do the variables not always change after stepping and locking, but they sometimes change at different rates, as if the threads were still running even after the locks.
EDIT: Yes, I did research this. Yes, the C++ implementation works. Yes, the C++ implementation is nearly identical to this one, but I had to cast PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER in C and pass this as the first argument to every function. I spent a while trying to debug this (short of whipping out gdb) to no avail.
#ifndef LOCKSTEPTHREAD_H
#define LOCKSTEPTHREAD_H
#include <pthread.h>
#include <stdio.h>
typedef struct {
pthread_mutex_t myMutex;
pthread_cond_t myCond;
pthread_t myThread;
int isThreadLive;
int shouldKillThread;
void (*execute)();
} lsthread;
void init_lsthread(lsthread* t);
void start_lsthread(lsthread* t);
void kill_lsthread(lsthread* t);
void kill_lsthread_islocked(lsthread* t);
void lock(lsthread* t);
void step(lsthread* t);
void* lsthread_func(void* me_void);
#ifdef LOCKSTEPTHREAD_IMPL
//function declarations
void init_lsthread(lsthread* t){
//pthread_mutex_init(&(t->myMutex), NULL);
//pthread_cond_init(&(t->myCond), NULL);
t->myMutex = (pthread_mutex_t)PTHREAD_MUTEX_INITIALIZER;
t->myCond = (pthread_cond_t)PTHREAD_COND_INITIALIZER;
t->isThreadLive = 0;
t->shouldKillThread = 0;
t->execute = NULL;
}
void destroy_lsthread(lsthread* t){
pthread_mutex_destroy(&t->myMutex);
pthread_cond_destroy(&t->myCond);
}
void kill_lsthread_islocked(lsthread* t){
if(!t->isThreadLive)return;
//lock(t);
t->shouldKillThread = 1;
step(t);
pthread_join(t->myThread,NULL);
t->isThreadLive = 0;
t->shouldKillThread = 0;
}
void kill_lsthread(lsthread* t){
if(!t->isThreadLive)return;
lock(t);
t->shouldKillThread = 1;
step(t);
pthread_join(t->myThread,NULL);
t->isThreadLive = 0;
t->shouldKillThread = 0;
}
void lock(lsthread* t){
if(pthread_mutex_lock(&t->myMutex))
puts("\nError locking mutex.");
}
void step(lsthread* t){
if(pthread_cond_signal(&(t->myCond)))
puts("\nError signalling condition variable");
if(pthread_mutex_unlock(&(t->myMutex)))
puts("\nError unlocking mutex");
}
void* lsthread_func(void* me_void){
lsthread* me = (lsthread*) me_void;
int ret;
if (!me)pthread_exit(NULL);
if(!me->execute)pthread_exit(NULL);
while (!(me->shouldKillThread)) {
ret = pthread_cond_wait(&(me->myCond), &(me->myMutex));
if(ret)pthread_exit(NULL);
if (!(me->shouldKillThread) && me->execute)
me->execute();
}
pthread_exit(NULL);
}
void start_lsthread(lsthread* t){
if(t->isThreadLive)return;
t->isThreadLive = 1;
t->shouldKillThread = 0;
pthread_create(
&t->myThread,
NULL,
lsthread_func,
(void*)t
);
}
#endif
#endif
This is my driver program:
#define LOCKSTEPTHREAD_IMPL
#include "include/lockstepthread.h"
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
unsigned char a1, a2;
void JobThread1(){
unsigned char copy = a1;
copy++;
a1 = copy;
}
void JobThread2(){
unsigned char copy = a2;
copy--;
a2 = copy;
}
int main(){
char inputline[2048];
inputline[2047] = '\0';
lsthread t1, t2;
init_lsthread(&t1);
init_lsthread(&t2);
t1.execute = JobThread1;
t2.execute = JobThread2;
printf(
"\nThis program demonstrates threading by having"
"\nTwo threads \"walk\" toward each other using unsigned chars."
"\nunsigned Integer overflow guarantees the two will converge."
);
printf("\nEnter a number for thread 1 to process: ");
fgets(inputline, 2047,stdin);
a1 = (unsigned char)atoi(inputline);
printf("\nEnter a number for thread 2 to process: ");
fgets(inputline, 2047,stdin);
a2 = (unsigned char)atoi(inputline);
start_lsthread(&t1);
start_lsthread(&t2);
unsigned int i = 0;
lock(&t1);
lock(&t2);
do{
printf("\n%u: a1 = %d, a2 = %d",i++,(int)a1,(int)a2);
fflush(stdout);
step(&t1);
step(&t2);
lock(&t1);
lock(&t2);
}while(a1 < a2);
kill_lsthread_islocked(&t1);
kill_lsthread_islocked(&t2);
destroy_lsthread(&t1);
destroy_lsthread(&t2);
return 0;
}
Example program usage:
Enter a number for thread 1 to process: 5
Enter a number for thread 2 to process: 10
0: a1 = 5, a2 = 10
1: a1 = 5, a2 = 10
2: a1 = 5, a2 = 10
3: a1 = 5, a2 = 10
4: a1 = 5, a2 = 10
5: a1 = 5, a2 = 10
6: a1 = 6, a2 = 9
7: a1 = 6, a2 = 9
8: a1 = 7, a2 = 9
9: a1 = 7, a2 = 9
10: a1 = 7, a2 = 9
11: a1 = 7, a2 = 9
12: a1 = 8, a2 = 9
So, what's the deal?
Generally speaking, it sounds like what you're really looking for is a barrier. Nevertheless, I answer the question as posed.
Yes, the C++ implementation works. Yes, the C++ implementation is
nearly identical to this one, but I had to cast
PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER in C and pass
this as the first argument to every function. I spent a while trying
to debug this (short of whipping out gdb) to no avail.
That seems unlikely. There are data races and undefined behavior all over the code presented, whether interpreted as C or as C++.
General design
Since you provide an explicit lock() function, which seems reasonable, you should provide an explicit unlock() function as well. Any other functions that expect to be called with the mutex locked should return with the mutex locked, so that the caller can explicitly pair lock() calls with unlock() calls. Failure to adhere to this pattern invites bugs.
In particular, step() should not unlock the mutex unless it also locks it, but I think a non-locking version will suit the purpose.
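For illustration, a matching unlock() and a step() that no longer unlocks might look like this. This is a sketch of what this answer assumes, not code from the question, and the kill_* helpers would need corresponding adjustments:

void unlock(lsthread* t){
    if (pthread_mutex_unlock(&t->myMutex))
        puts("\nError unlocking mutex.");
}

/* step() now only signals; the caller keeps the mutex locked and pairs
   its own lock()/unlock() calls around the whole stepping sequence. */
void step(lsthread* t){
    if (pthread_cond_signal(&t->myCond))
        puts("\nError signalling condition variable");
}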
Initialization
I had to cast PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER in C
No, you didn't, because you can't, at least not if pthread_mutex_t and pthread_cond_t are structure types. Initializers for structure types are not values. They do not have types and cannot be cast. But you can form compound literals from them, and that is what you have inadvertently done. This is not a conforming way to assign a value to a pthread_mutex_t or a pthread_cond_t.* The initializer macros are designated for use only in initializing variables in their declarations. That's what "initializer" means in this context.
Example:
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
Example:
struct {
pthread_mutex_t mutex;
pthread_cond_t cv;
} example = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER };
To initialize a mutex or condition variable object in any other context requires use of the corresponding initialization function, pthread_mutex_init() or pthread_cond_init().
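Applied to the question's init_lsthread(), that means going back to the initialization calls that were commented out; a sketch, with error checking omitted:

void init_lsthread(lsthread* t){
    pthread_mutex_init(&t->myMutex, NULL);   /* conforming run-time initialization */
    pthread_cond_init(&t->myCond, NULL);
    t->isThreadLive = 0;
    t->shouldKillThread = 0;
    t->execute = NULL;
}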
Data races
Non-atomic accesses to shared data by multiple concurrently-running threads must be protected by a mutex or other synchronization object if any of the accesses are writes (exceptions apply for accesses to mutexes and other synchronization objects themselves). Shared data in your example include file-scope variables a1 and a2, and most of the members of your lsthread instances. Your lsthread_func and driver both sometimes fail to lock the appropriate mutex before accessing those shared data, and some of the accesses involved are indeed writes, so undefined behavior ensues. Observing unexpected values of a1 and a2 is an entirely plausible manifestation of that undefined behavior.
Condition variable usage
The thread that calls pthread_cond_wait() must do so while holding the specified mutex locked. Your lsthread_func() does not adhere to that requirement, so more undefined behavior ensues. If you're very lucky, that might manifest as an immediate spurious wakeup.
And speaking of spurious wakeups, you do not guard against them. If one does occur then lsthread_func() blithely goes on to perform another iteration of its loop. To avoid this, you need shared data somewhere upon which the condition of the condition variable is predicated. The standard usage of a CV is to check that predicate before waiting, and to loop back and check it again after waking up, repeatedly if necessary, not proceeding until the predicate evaluates true.
Synchronized stepping
The worker threads do not synchronize directly with each other, so only the driver can ensure that one does not run ahead of the other. But it doesn't. The driver does nothing at all to ensure that either thread has completed a step before signaling both threads to perform another. Condition variables do not store signals, so if, by some misfortune of scheduling or by the nature of the tasks involved, one thread should get a step ahead of the other, it will remain ahead until and unless the error happens to be spontaneously balanced by a misstep on the other side.
Probably you want to add a lsthread_wait() function that waits for the thread to complete a step. That would involve use of the CV from the opposite direction.
Overall, you could provide (better) for single-stepping by
Adding a member to type lsthread to indicate whether the thread should be or is executing a step vs. whether it is between steps and should wait.
typedef struct {
// ...
_Bool should_step;
} lsthread;
Adding the aforementioned lsthread_wait(), maybe something like this:
// The calling thread must hold t->myMutex locked
void lsthread_wait(lsthread *t) {
// Wait, if necessary, for the thread to complete a step
while (t->should_step) {
pthread_cond_wait(&t->myCond, &t->myMutex);
}
assert(!t->should_step);
// Prepare to perform another step
t->should_step = 1;
}
That would be paired with a revised version of lsthread_func():
void* lsthread_func(void* me_void){
lsthread* me = (lsthread*) me_void;
int ret;
if (!me) pthread_exit(NULL);
lock(me); // needed to protect access to *me members and to globals
while (!me->shouldKillThread && me->execute) {
while (!me->should_step && !me->shouldKillThread) {
ret = pthread_cond_wait(&(me->myCond), &(me->myMutex));
if (ret) {
unlock(me); // mustn't forget to unlock
pthread_exit(NULL);
}
}
assert(me->should_step || me->shouldKillThread);
if (!me->shouldKillThread && me->execute) {
me->execute();
}
// Mark and signal step completed
me->should_step = 0;
ret = pthread_cond_broadcast(&me->myCond);
if (ret) break;
}
unlock(me);
pthread_exit(NULL);
}
Modifying step() so that it does not unlock the mutex (a sketch of such a step(), together with an unlock(), appears above under General design).
Modifying the driver loop to use the new wait function appropriately
lock(&t1);
lock(&t2);
do {
printf("\n%u: a1 = %d, a2 = %d", i++, (int) a1, (int) a2);
fflush(stdout);
step(&t1);
step(&t2);
lsthread_wait(&t1);
lsthread_wait(&t2);
} while (a1 < a2); // both locks must be held when this condition is evaluated
kill_lsthread_islocked(&t2);
kill_lsthread_islocked(&t1);
unlock(&t2);
unlock(&t1);
That's not necessarily all the changes that would be required, but I think I've covered all the key points.
Final note
The above suggestions are based on the example program, in which different worker threads do not access any of the same shared data. That makes it feasible to use per-thread mutexes to protect the shared data they do access. If the workers accessed any of the same shared data, and those or any thread running concurrently with them modified that same data, then per-worker-thread mutexes would not offer sufficient protection.
* And if pthread_mutex_t and pthread_cond_t were pointer or integer types, which is allowed, then the compiler would have accepted the assignments in question without a cast (which would actually be a cast in that case), but those assignments still would be non-conforming as far as pthreads is concerned.
The summary of the problem is the following: given a global resource of size N and M threads, each with its own resource requirement Xi (i = 1..M), synchronize the threads so that a thread gets its allocation, does its work, and is then deallocated.
The main problem is the case where not enough resources are available and the thread has to wait until there is enough. I tried to "block" it with a while statement, but I realized that two threads can both get past the while loop, and the first one to be allocated can change the global resource so that the second thread no longer has enough space, even though it has already passed the conditional section.
//piece of pseudocode
...
int MAXRES = 100;
// in thread function
{
while (thread_res > MAXRES);
lock();
allocate_res(); // MAXRES-=thread_res;
unlock();
// do own stuff
lock()
deallocate(); // MAXRES +=thread_res;
unlock();
}
To make a robust solution you need something more. As you noted, you need a way to wait until the condition "there are enough resources available for me" holds, and a plain mutex has no such mechanism. At any rate, you likely want to bury that in the allocator, so your mainline thread code looks like:
....
n = Allocate()
do stuff
Release(n)
....
and deal with the contention:
int navail = N;
int Acquire(int n) {
lock();
while (navail < n) {
unlock();
lock();
}
navail -= n;
unlock();
return n;
}
void Release(int n) {
lock();
navail += n;
unlock();
}
But such a spin waiting system may fail because of priorities -- if the highest priority thread is spinning, a thread trying to Release may not be able to run. Spinning isn't very elegant, and wastes energy. You could put a short nap in the spin, but if you make the nap too short it wastes energy, too long and it increases latency.
You really want a signalling primitive like semaphores or condvars rather than a locking primitive. With a semaphore, it could look something like:
Semaphore *sem;
int Acquire(int n) {
senter(sem, n);
return n;
}
int Release(int n) {
sleave(sem, n);
return n;
}
void init(int N) {
sem = screate(N);
}
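Note that senter(), sleave() and screate() above are placeholder names, not a real API. A portable way to get the same blocking, multi-unit Acquire/Release with plain pthreads is a mutex plus condition variable; a minimal sketch, assuming navail starts at N:

#include <pthread.h>

static int navail;                               /* free units, protected by m */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t freed = PTHREAD_COND_INITIALIZER;

int Acquire(int n) {
    pthread_mutex_lock(&m);
    while (navail < n)                           /* sleep instead of spinning */
        pthread_cond_wait(&freed, &m);
    navail -= n;
    pthread_mutex_unlock(&m);
    return n;
}

void Release(int n) {
    pthread_mutex_lock(&m);
    navail += n;
    pthread_cond_broadcast(&freed);              /* waiters re-check the condition */
    pthread_mutex_unlock(&m);
}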
Update: revised to use System V semaphores, which provide the ability to apply an arbitrary 'checkout' and 'checkin' to the semaphore value. The overall logic is the same.
Disclaimer: I have not used System V in a few years; test before using, in case I missed some details.
int semid ;
// Call from main
void do_init() {
key_t semkey = ftok(...) ;
semid = semget(semkey, 1, ...) ;
// Setup 100 as max resources.
struct sembuf so ;
so.sem_num = 0 ;
so.sem_op = 100 ;
so.sem_flg = 0 ;
semop(semid, &so, 1) ;
}
// Thread work
void do_thread_work() {
int N = <number of resources>
struct sembuf so ;
so.sem_num = 0;
so.sem_op = -N ;
so.sem_flg = 0 ;
semop(semid, &so, 1) ;
... Do thread work
// Return resources
so.sem_op = +N ;
semop(semid, &so, 1) ;
}
As per https://pubs.opengroup.org/onlinepubs/009695399/functions/semop.html, this results in an atomic checkout of multiple items.
Note: the ... placeholders stand for parts related to IPC_NOWAIT and SEM_UNDO, which are not relevant to this case.
If sem_op is a negative integer and the calling process has alter permission, one of the following shall occur:
If semval is greater than or equal to the absolute value of sem_op, the absolute value of sem_op is subtracted from semval. ...
...
If semval is less than the absolute value of sem_op ..., semop() shall increment the semncnt associated with the specified semaphore and suspend execution of the calling thread until one of the following conditions occurs:
The value of semval becomes greater than or equal to the absolute value of sem_op. When this occurs, the value of semncnt associated with the specified semaphore shall be decremented, the absolute value of sem_op shall be subtracted from semval and, ... .
The racers should have an equal chance of winning. When I run the program the results seem correct, and both racers win about half the time, but I don't think I am using pthread_mutex_trylock correctly. Is it actually doing anything the way I have implemented it? I am new to C, so I don't know a lot about this.
Program Description:
We assume two racers, at two diagonally opposite corners of a rectangular region. They have to traverse the roads along the periphery of the region. There are two bridges on two opposite sides of the rectangle. In order to complete one round of traversal, a racer has to hold the pass for both bridges at the same time. The conditions of the race are:
1) Only one racer can get a pass at a time.
2) Before starting a round, a racer has to request and get both passes, and after finishing that round has to release the passes and make a new try to get them for the next round.
3) Racer1 (R1) will acquire bridge-pass B1 first, then B0. R0 will acquire B0 and then B1.
4) There is a maximum number of rounds, fixed in advance. Whoever reaches that number first is the winner and the race stops.
This is how the situation looks before starting.
B0
R0-------- ~ -------------
| |
| |
| |
| |
--------- ~ ------------- R1
B1
#include<stdio.h>
#include<pthread.h>
#include<stdlib.h>
#define THREAD_NUM 2
#define MAX_ROUNDS 200000
#define TRUE 1
#define FALSE 0
/* mutex locks for each bridge */
pthread_mutex_t B0, B1;
/* racer ID */
int r[THREAD_NUM]={0,1};
/* number of rounds completed by each racer */
int numRounds[THREAD_NUM]={0,0};
void *racer(void *); /* prototype of racer routine */
int main()
{
pthread_t tid[THREAD_NUM];
void *status;
int i,j;
/* create 2 threads representing 2 racers */
for (i = 0; i < THREAD_NUM; i++)
{
/*Your code here */
pthread_create(&tid[i], NULL, racer, &r[i]);
}
/* wait for the join of 2 threads */
for (i = 0; i < THREAD_NUM; i++)
{
/*Your code here */
pthread_join(tid[i], &status);
}
printf("\n");
for(i=0; i<THREAD_NUM; i++)
printf("Racer %d finished %d rounds!!\n", i, numRounds[i]);
if(numRounds[0]>=numRounds[1]) printf("\n RACER-0 WINS.\n\n");
else printf("\n RACER-1 WINS..\n\n");
return (0);
}
void *racer(void *arg)
{
int index = *(int*)arg, NotYet;
while( (numRounds[0] < MAX_ROUNDS) && (numRounds[1] < MAX_ROUNDS) )
{
NotYet = TRUE;
/* RACER 0 tries to get both locks before she makes a round */
if(index==0){
/*Your code here */
pthread_mutex_trylock(&B0);
pthread_mutex_trylock(&B1);
}
/* RACER 1 tries to get both locks before she makes a round */
if(index==1){
/*Your code here */
pthread_mutex_trylock(&B1);
pthread_mutex_trylock(&B0);
}
numRounds[index]++; /* Make one more round */
/* unlock both locks */
pthread_mutex_unlock(&B0);
pthread_mutex_unlock(&B1);
/* random yield to another thread */
}
printf("racer %d made %d rounds !\n", index, numRounds[index]);
pthread_exit(0);
}
If the first thread locks B0 and the second thread gets scheduled and locks B1, blocking for the remaining lock would cause deadlock. So: if the first mutex is locked but the second cannot be taken, release the first mutex and loop again. The loop can be made smaller by taking the first mutex with pthread_mutex_lock and using trylock only for the second.
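In code, the back-off pattern described above could look like this for racer 0 (a sketch, not necessarily the assignment's intended solution; racer 1 does the mirror image, trying B1 first):

/* keep retrying until both passes are held at the same time;
   never hold one bridge while blocking on the other */
NotYet = TRUE;
while (NotYet) {
    if (pthread_mutex_trylock(&B0) == 0) {       /* got B0 */
        if (pthread_mutex_trylock(&B1) == 0) {
            NotYet = FALSE;                      /* got both passes */
        } else {
            pthread_mutex_unlock(&B0);           /* give B0 back and retry */
        }
    }
}
/* ... make the round, then unlock B1 and B0 as in the original code */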
I encountered a very bizarre bug when testing the interrupt module of my OS class project, which is based on the HOCA system.
My main function runs from line 66 to line 101, but when I set a breakpoint at line 92, gdb says:
No line 92 in the current file. Make breakpoint pending on future shared library load?
Do you guys know what's going on here?
Furthermore, when I set the breakpoint at line 92 and continue in GDB, it reports: "
trap: nonexistant memory
address: -1
memory size: 131072
ERROR: address greater than MEMORYSIZE
Program received signal SIGSEGV, Segmentation fault.
0x0000002e in ?? ()
"
Source code is as follow:
/* This module coordinates the initialization of the nucleus and it starts the execution
* of the first process, p1(). It also provides a scheduling function. The module contains
* two functions: main() and init(). init() is static. It also contains a function that it
* exports: schedule().
*/
#include "../../h/const.h"
#include "../../h/types.h"
#include "../../h/util.h"
#include "../../h/vpop.h"
#include "../../h/procq.e"
#include "../../h/asl.e"
#include "../../h/trap.h"
#include "../../h/int.h"
proc_link RQ; /* pointer to the tail of the Ready Queue */
state_t st; /* the starting state_t */
extern int p1();
/* This function determines how much physical memory there is in the system.
* It then calls initProc(), initSemd(), trapinit() and intinit().
*/
static void init(){
STST(&st);
if(st.s_sp%PAGESIZE != 0){
st.s_sp -= st.s_sp%PAGESIZE;
}
initProc();
initSemd();
trapinit();
intinit();
}
/* If the RQ is not empty this function calls intschedule() and loads the state of
* the process at the head of the RQ. If the RQ is empty it calls intdeadlock().
*/
void schedule(){
proc_t *front;
front = headQueue(RQ);
if (checkPointer(front)) {
intschedule();
LDST(&(front->p_s));
}
else {
intdeadlock();
}
}
/* This function calls init(), sets up the processor state for p1(), adds p1() to the
* RQ and calls schedule().
*/
void main(){
proc_t *pp1; // pointer to process table entry
state_t pp1state; //process state
long curr_time; // to store the time
init(); // initialize the process table, semaphore...
/*setup the processor state for p1(), adds p1() to the ReadyQueue */
RQ.next = (proc_t *) ENULL;
RQ.index = 1;
pp1 = allocProc();
if(!checkPointer(pp1)){
return;
}
pp1->parent = (proc_t *) ENULL; // ENULL is set to -1
pp1->child = (proc_t *) ENULL;
pp1->sibling_next = pp1;
pp1->sibling_prev = pp1;
pp1state.s_sp = st.s_sp - (PAGESIZE*2);
pp1state.s_pc = (int)p1;
pp1state.s_sr.ps_s = 1; // here should be line 92
STCK(&curr_time); //store the CPU time to curr_time
pp1->p_s = pp1state;
pp1->start_time = curr_time;
insertProc(&RQ, pp1);
schedule();
return;
}
Compile without optimizations: use the -O0 gcc flag for that (and make sure debug information is generated with -g, so GDB can map line numbers).
In the case of a one-shot timer, I can use a semaphore to wait for the timer callback to complete.
But if the timer fires several times, that doesn't help. Consider the following code:
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>
#include <pthread.h>
#define N 10
void timer_threaded_function(sigval_t si)
{
uint8_t *shared_resource = si.sival_ptr;
sleep(rand() % 7);
/* ... manipulate with shared_resource */
return;
}
int main()
{
struct sigevent sig_ev = {0};
uint8_t *shared_resource = malloc(123);
timer_t timer_id;
int i;
sig_ev.sigev_notify = SIGEV_THREAD;
sig_ev.sigev_value.sival_ptr = shared_resource;
sig_ev.sigev_notify_function = timer_threaded_function;
sig_ev.sigev_notify_attributes = NULL;
timer_create(CLOCK_REALTIME, &sig_ev, &timer_id);
for (i = 0; i < N; i++) {
/* arm timer for 1 nanosecond */
timer_settime(timer_id, 0,
&(struct itimerspec){{0,0},{0,1}}, NULL);
/* sleep a little bit, so timer will be fired */
usleep(1);
}
/* only disarms timer, but timer callbacks still can be running */
timer_delete(timer_id);
/*
* TODO: safe wait for all callbacks to end, so shared resource
* can be freed without races.
*/
...
free(shared_resource);
return 0;
}
timer_delete() only disarms the timer (if it was armed) and frees the resources associated with it. But timer callbacks may still be running, so we cannot free shared_resource, otherwise a race condition may occur. Is there any way to cope with this situation?
I thought about reference counting, but it doesn't help, because we don't know how many threads will actually try to access the shared resource (because of timer overruns).
It is thoroughly unsatisfactory :-(. I have looked, and there does not seem to be any way to discover whether the sigevent (a) has not been fired, or (b) is pending, or (c) is running, or (d) has completed.
Best I can suggest is an extra level of indirection, and a static to point at the shared resource. So:
static foo_t* p_shared ;
....
p_shared = shared_resource ;
.....
sig_ev.sigev_value.sival_ptr = &p_shared ;
where foo_t is the type of the shared resource.
Now we can use some atomics... in timer_threaded_function():
foo_t** pp_shared ;
foo_t* p_locked ;
foo_t* p_shared ;
pp_shared = si.sival_ptr ;
p_locked = (void*)UINTPTR_MAX ;
p_shared = atomic_swap(pp_shared, p_locked) ;
if (p_shared == p_locked)
return ; // locked already.
if (p_shared == NULL)
return ; // destroyed already.
.... proceed to do the usual work ...
if (atomic_cmp_swap(pp_shared, &p_locked, p_shared))
return ; // was locked and is now restored
assert(p_locked == NULL) ;
... the shared resource needs to be freed ...
And in the controlling thread:
timer_delete(timer_id) ; // no more events, thank you
p_s = atomic_swap(&p_shared, NULL) ; // stop processing events
if (p_s == (void*)UINTPTR_MAX)
// an event is being processed.
if (p_s != NULL)
... the shared resource needs to be freed ...
When the event thread finds that the shared resource needs to be freed, it can do it itself, or signal to the controlling thread that the event has been processed, so that the controlling thread can go ahead and do the free. That's largely a matter of taste.
Basically, this is using atomics to provide a sort of a lock, whose value is tri-state: NULL <=> destroyed ; UINTPTR_MAX <=> locked ; anything else <=> unlocked.
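In C11 terms (atomic_swap and atomic_cmp_swap above are not standard names), the same tri-state scheme could be written with <stdatomic.h>. A sketch, choosing the variant where whoever loses the race does the free; it has the same limitations noted below:

#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct foo foo_t;                    /* stands in for the shared resource type */
#define LOCKED ((foo_t *)UINTPTR_MAX)        /* sentinel: a callback is running */

/* set once at startup: atomic_store(&p_shared, shared_resource); NULL <=> torn down */
static _Atomic(foo_t *) p_shared;

/* body of the timer callback */
static void handle_tick(void)
{
    foo_t *p = atomic_exchange(&p_shared, LOCKED);
    if (p == LOCKED || p == NULL)
        return;                              /* already running, or already torn down */
    /* ... work with *p ... */
    foo_t *expected = LOCKED;
    if (!atomic_compare_exchange_strong(&p_shared, &expected, p))
        free(p);                             /* teardown happened meanwhile: we free it */
}

/* controlling thread, after timer_delete() */
static void teardown(void)
{
    foo_t *p = atomic_exchange(&p_shared, NULL);
    if (p != LOCKED && p != NULL)
        free(p);                             /* no callback in flight: free here */
    /* if p was LOCKED, the in-flight callback sees NULL on its CAS and frees it */
}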
The down-side is the static p_shared, which has to remain in existence until the timer_threaded_function() has finished and will never be called again... and since those are precisely the things that are unknowable, static p_shared is, effectively, a fixture :-(.