I have some doubts after reading my operating systems material on implementing locks:
struct lock {
int locked;
struct queue q;
int sync; /* Normally 0. */
};
void lock_acquire(struct lock *l) {
intr_disable();
while (swap(&l->sync, 1) != 0) {
/* Do nothing */
}
if (!l->locked) {
l->locked = 1;
l->sync = 0;
} else {
queue_add(&l->q, thread_current());
thread_block(&l->sync);
}
intr_enable();
}
void lock_release(struct lock *l) {
intr_disable();
while (swap(&l->sync, 1) != 0) {
/* Do nothing */
}
if (queue_empty(&l->q)) {
l->locked = 0;
} else {
thread_unblock(queue_remove(&l->q));
}
l->sync = 0;
intr_enable();
}
What is the purpose of sync?
My gut feeling is that the solutions are all broken. For a lock to work correctly, lock_acquire needs acquire semantics and lock_release needs release semantics. That way the loads and stores inside the critical section cannot move outside of it, and there is a happens-before edge between a lock release and a subsequent lock acquire on the same lock.
If you take a look at the spinning version:
struct lock {
int locked;
};
void lock_acquire(struct lock *l) {
while (swap(&l->locked, 1)) {
/* Do nothing */
}
}
void lock_release(struct lock *l) {
l->locked = 0;
}
The assignment locked = 0 is just an ordinary store. This means that it can be reordered with the loads and stores before it, and it does not provide a happens-before edge.
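For comparison, here is a minimal sketch of the spinning version written with C11 atomics, so that the acquire and release ordering is explicit. It assumes that the material's swap() is meant to be an atomic exchange; the names spinlock, spin_acquire, spin_release and run_counter_demo are made up for this example and do not come from the course material.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; not taken from the course material. */
struct spinlock {
    atomic_int locked; /* 0 = free, 1 = held */
};

void spin_acquire(struct spinlock *l) {
    /* Atomic exchange with acquire ordering: accesses in the critical
       section cannot be reordered before the lock is taken. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) != 0) {
        /* Spin until the previous value was 0. */
    }
}

void spin_release(struct spinlock *l) {
    /* Release store: earlier accesses cannot move past it, and it
       synchronizes with the next successful acquire of the same lock. */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

static struct spinlock demo_lock = { 0 };
static long demo_counter;

static void *demo_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        spin_acquire(&demo_lock);
        demo_counter++; /* plain access, protected by the lock */
        spin_release(&demo_lock);
    }
    return NULL;
}

/* Runs two workers and returns the final counter value. */
long run_counter_demo(void) {
    pthread_t a, b;
    demo_counter = 0;
    int rc = pthread_create(&a, NULL, demo_worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, demo_worker, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return demo_counter;
}
```

Calling run_counter_demo() from main() should return 200000 if no increment is lost. With a plain int and an ordinary store in the release path, the compiler and the hardware would be free to break that.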
It seems to me that 'sync' is a way for the thread to let the OS know that the lock is in use, since both "lock" and "unlock" wait for the 'sync' value to change before proceeding.
(It's a bit peculiar that interrupts are disabled before checking the 'sync' value)
I'm trying to make a stack that I implemented thread safe using semaphores. It works when I push a single object onto the stack, but the terminal freezes as soon as I try to push a second item onto the stack or pop an item off it. This is what I have so far, and I'm not sure where I'm messing up. Everything compiles fine, but the terminal just freezes as described.
Here's where I create the stack:
sem_t selements, sspace;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
BlockingStack *new_BlockingStack(int max_size)
{
sem_init(&selements, 0, 0);
sem_init(&sspace, 0, max_size);
BlockingStack *newBlockingStack = malloc(sizeof(BlockingStack));
/* Check the allocation before dereferencing the pointer. */
if (newBlockingStack == NULL)
{
return NULL;
}
newBlockingStack->maxSize = max_size;
newBlockingStack->stackTop = -1;
newBlockingStack->element = malloc(max_size * sizeof(void *));
if (newBlockingStack->element == NULL)
{
free(newBlockingStack);
return NULL;
}
return newBlockingStack;
}
And here are the Push and Pop:
bool BlockingStack_push(BlockingStack *this, void *element)
{
sem_wait(&sspace);
pthread_mutex_lock(&m);
if (this->stackTop == this->maxSize - 1)
{
return false;
}
if (element == NULL)
{
return false;
}
this->element[++this->stackTop] = element;
return true;
pthread_mutex_unlock(&m);
sem_post(&selements);
}
void *BlockingStack_pop(BlockingStack *this)
{
sem_wait(&selements);
pthread_mutex_lock(&m);
if (this->stackTop == -1)
{
return NULL;
}
else
{
return this->element[this->stackTop--];
}
pthread_mutex_unlock(&m);
sem_post(&sspace);
}
SUGGESTED CHANGES:
sem_t sem;
...
BlockingStack *new_BlockingStack(int max_size)
{
sem_init(&sem, 0, 1);
...
bool BlockingStack_push(BlockingStack *this, void *element)
{
sem_wait(&sem);
...
sem_post(&sem);
...
Specifically:
I would only initialize one semaphore object unless I was SURE I needed others
I would use the same semaphore for push() and pop()
pshared: 0 should be sufficient for synchronizing different pthreads inside your single process.
Initialize the semaphore to 1, because the first thing you'll do for either "push" or "pop" is sem_wait().
For thread safety you already use a mutex (pthread_mutex_lock(&m) and pthread_mutex_unlock(&m)). That mutual exclusion is enough for the purpose. Once one thread obtains the mutex, any other thread blocks in its pthread_mutex_lock(&m) call.
Only the thread that currently holds the mutex may call pthread_mutex_unlock(&m).
OK, I was working on this and finally cracked the answer after a bit of internet research and debugging. The error was that the pthread_mutex_unlock and the sem_post had to come before the return.
Take my pop for example:
void *BlockingStack_pop(BlockingStack *this)
{
sem_wait(&selements);
pthread_mutex_lock(&m);
if (this->stackTop == -1)
{
return NULL;
}
else
{
return this->element[this->stackTop--];
}
pthread_mutex_unlock(&m);
sem_post(&sspace);
}
Notice how the pthread_mutex_unlock(&m); and the sem_post(&sspace); come after the return. They must actually be placed before every return, like so:
void *BlockingStack_pop(BlockingStack *this)
{
...
pthread_mutex_unlock(&m);
sem_post(&sspace);
return NULL;
...
void *top = this->element[this->stackTop--]; /* read the element while the mutex is still held */
pthread_mutex_unlock(&m);
sem_post(&sspace);
return top;
...
}
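Putting that fix together, a complete push and pop might look like the sketch below. The struct layout is an assumption, since the question never shows the definition of BlockingStack. Moving the semaphores and the mutex into the struct is my own choice, so that each stack gets its own set, and the parameter is named s instead of this.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical layout; the question does not show the struct definition. */
typedef struct BlockingStack {
    void **element;
    int maxSize;
    int stackTop;
    sem_t selements;   /* counts items available to pop */
    sem_t sspace;      /* counts free slots available to push */
    pthread_mutex_t m; /* protects element[] and stackTop */
} BlockingStack;

BlockingStack *new_BlockingStack(int max_size) {
    if (max_size <= 0) return NULL;
    BlockingStack *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->element = malloc((size_t)max_size * sizeof(void *));
    if (s->element == NULL) {
        free(s);
        return NULL;
    }
    s->maxSize = max_size;
    s->stackTop = -1;
    sem_init(&s->selements, 0, 0);
    sem_init(&s->sspace, 0, (unsigned)max_size);
    pthread_mutex_init(&s->m, NULL);
    return s;
}

bool BlockingStack_push(BlockingStack *s, void *element) {
    if (element == NULL) return false;   /* reject before taking any resource */
    sem_wait(&s->sspace);                /* blocks while the stack is full */
    pthread_mutex_lock(&s->m);
    assert(s->stackTop < s->maxSize - 1); /* guaranteed by sspace */
    s->element[++s->stackTop] = element;
    pthread_mutex_unlock(&s->m);
    sem_post(&s->selements);             /* one more item to pop */
    return true;
}

void *BlockingStack_pop(BlockingStack *s) {
    sem_wait(&s->selements);             /* blocks while the stack is empty */
    pthread_mutex_lock(&s->m);
    assert(s->stackTop >= 0);            /* guaranteed by selements */
    void *top = s->element[s->stackTop--];
    pthread_mutex_unlock(&s->m);
    sem_post(&s->sspace);                /* one more free slot */
    return top;
}
```

Because the semaphores already guarantee that push never sees a full stack and pop never sees an empty one, the full and empty checks from the question become assertions, and every path releases exactly what it acquired.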
I'm working on a college assignment where we are to implement parallelized A* search for the 15-puzzle. For this part, we are to use only one priority queue (I suppose the point is to see that contention among multiple threads limits the speedup). The problem I am facing is properly synchronizing popping the next "candidate" from the priority queue.
I tried the following:
while(1) {
// The board I'm trying to pop.
Board current_board;
pthread_mutex_lock(&priority_queue_lock);
// If the heap is empty, wait till another thread adds new candidates.
if (pq->heap_size == 0)
{
printf("Waiting...\n");
pthread_mutex_unlock(&priority_queue_lock);
continue;
}
current_board = top(pq);
pthread_mutex_unlock(&priority_queue_lock);
// Generate the new boards from the current one and add to the heap...
}
I've tried different variants of the same idea, but for some reason there are occasions where the threads get stuck on "Waiting". The code works fine serially (or with two threads), so that leads me to believe this is the offending part of the code. I can post the entire thing if necessary. I feel like it's an issue with my understanding of the mutex lock though. Thanks in advance for help.
Edit:
I've added the full code for the parallel thread below:
// h and p are global pointers initialized in main()
void* parallelThread(void* arg)
{
int thread_id = (int)(long long)(arg);
while(1)
{
Board current_board;
pthread_mutex_lock(&priority_queue_lock);
current_board = top(p);
pthread_mutex_unlock(&priority_queue_lock);
// Move blank up.
if (current_board.blank_x > 0)
{
int newpos = current_board.blank_x - 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[newpos][current_board.blank_y];
new_board.board[newpos][current_board.blank_y] = BLANK;
new_board.blank_x = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
// Move blank down.
if (current_board.blank_x < 3)
{
int newpos = current_board.blank_x + 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[newpos][current_board.blank_y];
new_board.board[newpos][current_board.blank_y] = BLANK;
new_board.blank_x = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
// Move blank right.
if (current_board.blank_y < 3)
{
int newpos = current_board.blank_y + 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[current_board.blank_x][newpos];
new_board.board[current_board.blank_x][newpos] = BLANK;
new_board.blank_y = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
// Move blank left.
if (current_board.blank_y > 0)
{
int newpos = current_board.blank_y - 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[current_board.blank_x][newpos];
new_board.board[current_board.blank_x][newpos] = BLANK;
new_board.blank_y = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
}
return NULL;
}
"I tried the following:"
I don't see anything wrong with the code that follows, assuming that top also removes the board from the queue. It's wasteful (if the queue is empty, it will spin locking and unlocking the mutex), but not wrong.
"I've added the full code"
This is useless without the code for exists, insert and push.
One general observation:
pthread_mutex_lock(&priority_queue_lock);
current_board = top(p);
pthread_mutex_unlock(&priority_queue_lock);
In the code above, your locking is "outside" of the top function. But here:
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
you either do no locking at all (in which case that's a bug), or you do locking "inside" exists, insert and push.
You should not mix "inside" and "outside" locking. Pick one or the other and stick with it.
If you in fact do not lock the queue inside exists, insert, etc. then you have a data race and are thinking of mutexes incorrectly: they protect invariants, and you can't check whether the queue is empty in parallel with another thread executing "remove top element" -- these operations require serialization, and thus must both be done under a lock.
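As a rough illustration of "inside" locking, here is a small sketch of a queue whose push and pop take the lock themselves, and whose pop sleeps on a condition variable instead of spinning on lock/unlock while the queue is empty. It uses a fixed-size array of ints rather than a real heap of Board values, and WorkQueue, wq_init, wq_push and wq_pop_min are names invented for this example.

```c
#include <assert.h>
#include <pthread.h>

#define WQ_CAPACITY 64 /* hypothetical fixed capacity for the sketch */

typedef struct {
    int items[WQ_CAPACITY];
    int size;
    pthread_mutex_t lock;    /* protects items[] and size */
    pthread_cond_t nonempty; /* signalled whenever an item is added */
} WorkQueue;

void wq_init(WorkQueue *q) {
    q->size = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Returns 0 on success, -1 if the queue is full. */
int wq_push(WorkQueue *q, int item) {
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->size < WQ_CAPACITY) {
        q->items[q->size++] = item;
        rc = 0;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Blocks until an item is available, then removes and returns the
   smallest one. The emptiness check and the removal happen under the
   same lock, so no other thread can slip in between them. */
int wq_pop_min(WorkQueue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->size == 0) {
        pthread_cond_wait(&q->nonempty, &q->lock);
    }
    assert(q->size > 0);
    int best = 0;
    for (int i = 1; i < q->size; i++) {
        if (q->items[i] < q->items[best]) best = i;
    }
    int item = q->items[best];
    q->items[best] = q->items[--q->size]; /* fill the hole with the last item */
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

This sketch does not handle termination. In a parallel search, all workers can end up asleep on an empty queue, so a real implementation would also need a "done" flag and a broadcast to wake them up.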
I have the following code, which runs in two threads started by an init call from the main thread: one for writing to a device, one for reading. My app is called by other threads to add items to the queues. pop_queue handles all locking, as does push_queue. Whenever I modify a req r, I lock its mutex. q->process is a function pointer to either write_sector or read_sector. I need to guard against simultaneous calls to the two function pointers, so I'm using a mutex around the actual process call, but this is not working.
According to the test program, I am making parallel calls to the process functions. How is that possible, given that I lock immediately before and unlock immediately afterwards?
The following error from valgrind --tool=helgrind might help?
==3850== Possible data race during read of size 4 at 0xbea57efc by thread #2
==3850== at 0x804A290: request_handler (diskdriver.c:239)
Line 239 is r->state = q->process(*device, &r->sd) +1
void *
request_handler(void *arg)
{
req *r;
queue *q = arg;
int writing = !strcmp(q->name, "write");
for(;;) {
/*
* wait for a request
*/
pop_queue(q, &r, TRUE);
/*
* handle request
* req r is unattached to any lists, but must lock its properties in case it is being redeemed
*/
printf("Info: driver: (%s) handling req %d\n", q->name, r->id);
pthread_mutex_lock(&r->lock);
pthread_mutex_lock(&q->processing);
r->state = q->process(*device, &r->sd) +1;
pthread_mutex_unlock(&q->processing);
/*
* if writing, return the SectorDescriptor
*/
if (writing) {
printf("Info: driver (write thread) has released a sector descriptor.\n");
blocking_put_sd(*sd_store, r->sd);
r->sd = NULL;
}
pthread_mutex_unlock(&r->lock);
pthread_cond_signal(&r->changed);
}
}
EDIT
Here is the one other location where the req's properties are read
int redeem_voucher(Voucher v, SectorDescriptor *sd)
{
int result;
if (v == NULL){
printf("Driver: null voucher redeemed!\n");
return 0;
}
req *r = v;
pthread_mutex_lock(&r->lock);
/* if state = 0 job still running/queued */
while(r->state==0) {
printf("Driver: blocking for req %d to finish\n", r->id);
pthread_cond_wait(&r->changed, &r->lock);
}
sd = &r->sd;
result = r->state-1;
r->sd = NULL;
r->state = WAIT;
//printf("Driver: req %d completed\n", r->id);
pthread_mutex_unlock(&r->lock);
/*
* return req to pool
*/
push_queue(&pool_q, r);
return result;
}
EDIT 2
here's the push_ and pop_queue functions
int
pop_queue(struct queue *q, req **r, int block)
{
pthread_mutex_lock(&q->lock);
while(q->head == NULL) {
if(block) {
pthread_cond_wait(&q->wait, &q->lock);
}
else {
pthread_mutex_unlock(&q->lock);
return FALSE;
}
}
req *got = q->head;
q->head = got->next;
got->next = NULL;
if(!q->head) {
/* just removed last element */
q->tail = q->head;
}
*r = got;
pthread_mutex_unlock(&q->lock);
return TRUE;
}
/*
* perform a standard linked list insertion to the queue specified
* handles all required locking and signals any listeners
* return: int - if insertion was successful
*/
int
push_queue(queue *q, req *r)
{
/*
* push never blocks,
*/
if(!r || !q)
return FALSE;
pthread_mutex_lock(&q->lock);
if(q->tail) {
q->tail->next = r;
q->tail = r;
}
else {
/* was an empty queue */
q->tail = q->head = r;
}
pthread_mutex_unlock(&q->lock);
pthread_cond_signal(&q->wait);
return TRUE;
}
Based on the available information, it seems that a likely possibility then is that another thread is modifying the data pointed to by *device. Perhaps it is being modified while the q->processing mutex is not held.
Your line
pthread_cond_signal(&r->changed);
makes me suspect that you have other code that is also manipulating the structure pointed to by r. In any case, it does not make much sense if nobody is waiting on that condition variable. (And you should swap the unlock and signal lines.)
So your error is probably somewhere else, where you access r simultaneously without holding its mutex. You didn't show us the rest of your code, so saying more would be even more guesswork.
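To make the ordering concrete, here is a minimal sketch of the completion handshake with the signal issued while the mutex is still held. Once the handler unlocks, a waiter that already sees the new state may hand the request back to the pool, so touching r->changed after the unlock can race with whoever reuses the request. The Request type and the request_* functions are hypothetical stand-ins for the req struct and redeem_voucher in the question.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the question's req struct. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    int done;   /* 0 while queued or running */
    int result;
} Request;

void request_init(Request *r) {
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->changed, NULL);
    r->done = 0;
    r->result = 0;
}

/* Called by the handler thread when the work is finished. */
void request_complete(Request *r, int result) {
    pthread_mutex_lock(&r->lock);
    assert(!r->done);
    r->result = result;
    r->done = 1;
    pthread_cond_signal(&r->changed); /* signal before unlocking */
    pthread_mutex_unlock(&r->lock);
    /* r must not be touched after this point: the waiter may reuse it. */
}

/* Called by the thread that redeems the request. */
int request_wait(Request *r) {
    pthread_mutex_lock(&r->lock);
    while (!r->done) {
        pthread_cond_wait(&r->changed, &r->lock);
    }
    int result = r->result;
    pthread_mutex_unlock(&r->lock);
    return result;
}

static void *demo_handler(void *arg) {
    request_complete(arg, 42);
    return NULL;
}

/* Starts a handler thread, waits for completion and returns the result. */
int run_request_demo(void) {
    Request r;
    pthread_t t;
    request_init(&r);
    int rc = pthread_create(&t, NULL, demo_handler, &r);
    assert(rc == 0);
    (void)rc;
    int result = request_wait(&r);
    pthread_join(t, NULL);
    return result;
}
```

Every read and write of done and result happens under r->lock, which is the property helgrind checks for. Whether this matches the actual race in the question depends on code that was not posted.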
I am developing a user-space preemptive thread library (fibers) that uses context switching as the base approach. For this I wrote a scheduler. However, it's not performing as expected. Does anyone have suggestions?
The structure of the thread_t used is :
typedef struct thread_t {
int thr_id;
int thr_usrpri;
int thr_cpupri;
int thr_totalcpu;
ucontext_t thr_context;
void * thr_stack;
int thr_stacksize;
struct thread_t *thr_next;
struct thread_t *thr_prev;
} thread_t;
The scheduling function is as follows:
void schedule(void)
{
thread_t *t1, *t2;
thread_t * newthr = NULL;
int newpri = 127;
struct itimerval tm;
ucontext_t dummy;
sigset_t sigt;
t1 = ready_q;
// Select the thread with highest priority
while (t1 != NULL)
{
if (newpri > t1->thr_usrpri + t1->thr_cpupri)
{
newpri = t1->thr_usrpri + t1->thr_cpupri;
newthr = t1;
}
t1 = t1->thr_next;
}
if (newthr == NULL)
{
if (current_thread == NULL)
{
// No more threads? (stop itimer)
tm.it_interval.tv_usec = 0;
tm.it_interval.tv_sec = 0;
tm.it_value.tv_usec = 0; // ZERO Disable
tm.it_value.tv_sec = 0;
setitimer(ITIMER_PROF, &tm, NULL);
}
return;
}
else
{
// TO DO :: Reenabling of signals must be done.
// Switch to new thread
if (current_thread != NULL)
{
t2 = current_thread;
current_thread = newthr;
timeq = 0;
sigemptyset(&sigt);
sigaddset(&sigt, SIGPROF);
sigprocmask(SIG_UNBLOCK, &sigt, NULL);
swapcontext(&(t2->thr_context), &(current_thread->thr_context));
}
else
{
// No current thread? might be terminated
current_thread = newthr;
timeq = 0;
sigemptyset(&sigt);
sigaddset(&sigt, SIGPROF);
sigprocmask(SIG_UNBLOCK, &sigt, NULL);
swapcontext(&(dummy), &(current_thread->thr_context));
}
}
}
It seems that "ready_q" (the head of the list of ready threads?) never changes, so the search for the highest-priority thread always finds the first suitable element. If two threads have the same priority, only the first one ever gets the CPU. There are many algorithms you can use: some change the priority dynamically, others rotate the ready queue. In your example you could remove the selected thread from its place in the ready queue and put it at the end (it's a doubly linked list, so the operation is trivial and quite inexpensive).
Also, I'd suggest you consider the performance cost of the linear search in ready_q, since it may become a problem when the number of threads is large. In that case a more sophisticated structure may help, with a separate list of threads for each priority level.
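The rotation idea can be sketched as below. This is only an illustration with a stripped-down thread type: fiber, ready_queue, rq_append and rq_pick_next are hypothetical names, and a lower priority value means higher priority, as in the question's scheduler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the question's thread_t. */
typedef struct fiber {
    int id;
    int priority; /* lower value = higher priority */
    struct fiber *next, *prev;
} fiber;

typedef struct {
    fiber *head, *tail;
} ready_queue;

/* Appends f at the tail of the queue. */
void rq_append(ready_queue *q, fiber *f) {
    f->next = NULL;
    f->prev = q->tail;
    if (q->tail) q->tail->next = f; else q->head = f;
    q->tail = f;
}

/* Unlinks f from the queue in O(1) using its prev/next pointers. */
static void rq_unlink(ready_queue *q, fiber *f) {
    if (f->prev) f->prev->next = f->next; else q->head = f->next;
    if (f->next) f->next->prev = f->prev; else q->tail = f->prev;
    f->next = f->prev = NULL;
}

/* Picks the first fiber with the best priority, then moves it to the
   tail so that equal-priority fibers take turns. Returns NULL if the
   queue is empty. */
fiber *rq_pick_next(ready_queue *q) {
    fiber *best = NULL;
    for (fiber *f = q->head; f != NULL; f = f->next) {
        if (best == NULL || f->priority < best->priority) best = f;
    }
    if (best != NULL) {
        rq_unlink(q, best);
        rq_append(q, best);
        assert(q->tail == best);
    }
    return best;
}
```

With three fibers of equal priority appended in the order 1, 2, 3, successive calls to rq_pick_next() return 1, 2, 3, 1, and so on, instead of always returning the first one.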
Bye!
I'm having a hard time figuring out how to manage deallocation of memory in multithreaded environments. Specifically, I'm struggling with using a lock to protect a structure: when it's time to free the structure, you have to unlock the lock in order to destroy the lock itself, which causes problems if a separate thread is waiting on that same lock.
I'm trying to come up with a mechanism that has retain counts, where everything is freed when the object's retain count reaches 0. I've tried a number of different things but just can't get it right. It seems like you can't put the locking mechanism inside the structure that you need to be able to free and destroy, because that requires you to unlock the lock inside it, which could allow another thread to proceed if it was blocked in a lock request for that same structure. That would mean undefined behavior is guaranteed: the lock was destroyed and deallocated, so you either get memory access errors or you lock something that no longer exists.
Would someone mind looking at my code? I was able to put together a sandboxed example that demonstrates what I'm trying without a bunch of files.
http://pastebin.com/SJC86GDp
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
struct xatom {
short rc;
pthread_rwlock_t * rwlck;
};
typedef struct xatom xatom;
struct container {
xatom * atom;
};
typedef struct container container;
#define nr 1
#define nw 2
pthread_t readers[nr];
pthread_t writers[nw];
container * c;
void retain(container * cont);
void release(container ** cont);
short retain_count(container * cont);
void * rth(void * arg) {
short rc;
while(1) {
if(c == NULL) break;
rc = retain_count(c);
}
printf("rth exit!\n");
return NULL;
}
void * wth(void * arg) {
while(1) {
if(c == NULL) break;
release((container **)&c);
}
printf("wth exit!\n");
return NULL;
}
short retain_count(container * cont) {
short rc = 1;
pthread_rwlock_rdlock(cont->atom->rwlck);
printf("got rdlock in retain_count\n");
rc = cont->atom->rc;
pthread_rwlock_unlock(cont->atom->rwlck);
return rc;
}
void retain(container * cont) {
pthread_rwlock_wrlock(cont->atom->rwlck);
printf("got retain write lock\n");
cont->atom->rc++;
pthread_rwlock_unlock(cont->atom->rwlck);
}
void release(container ** cont) {
if(!cont || !(*cont)) return;
container * tmp = *cont;
pthread_rwlock_t ** lock = (pthread_rwlock_t **)&(*cont)->atom->rwlck;
pthread_rwlock_wrlock(*lock);
printf("got release write lock\n");
if(!tmp) {
printf("return 2\n");
pthread_rwlock_unlock(*lock);
if(*lock) {
printf("destroying lock 1\n");
pthread_rwlock_destroy(*lock);
*lock = NULL;
}
return;
}
tmp->atom->rc--;
if(tmp->atom->rc == 0) {
printf("deallocating!\n");
*cont = NULL;
pthread_rwlock_unlock(*lock);
if(pthread_rwlock_trywrlock(*lock) == 0) {
printf("destroying lock 2\n");
pthread_rwlock_destroy(*lock);
*lock = NULL;
}
free(tmp->atom->rwlck);
free(tmp->atom);
free(tmp);
} else {
pthread_rwlock_unlock(*lock);
}
}
container * new_container() {
container * cont = malloc(sizeof(container));
cont->atom = malloc(sizeof(xatom));
cont->atom->rwlck = malloc(sizeof(pthread_rwlock_t));
pthread_rwlock_init(cont->atom->rwlck,NULL);
cont->atom->rc = 1;
return cont;
}
int main(int argc, char ** argv) {
c = new_container();
int i = 0;
int l = 4;
for(i=0;i<l;i++) retain(c);
for(i=0;i<nr;i++) pthread_create(&readers[i],NULL,&rth,NULL);
for(i=0;i<nw;i++) pthread_create(&writers[i],NULL,&wth,NULL);
sleep(2);
for(i=0;i<nr;i++) pthread_join(readers[i],NULL);
for(i=0;i<nw;i++) pthread_join(writers[i],NULL);
return 0;
}
Thanks for any help!
Yes, you can't put the key inside the safe. Your approach with a refcount (create the object when it is requested and doesn't exist, delete it on the last release) is correct. But the lock must exist from at least a moment before the object is created until after it is destroyed, that is, for as long as it is used. You can't delete it from inside the object it protects.
OTOH, you don't need countless locks, such as one for each object you create. One lock that serializes obtaining and releasing of all objects will not cost much performance at all. So just create the lock on init and destroy it at program end. Obtaining or releasing an object should be quick enough that a lock held for variable A almost never blocks access to unrelated variable B. If it does happen, you can still introduce one lock shared by all rarely obtained variables and one for each frequently obtained one.
Also, there seems to be no point in using an rwlock; a plain mutex suffices. The create and destroy operations MUST exclude each other, not just parallel instances of themselves, so use the pthread_mutex_*() family (pthread_mutex_init(), pthread_mutex_lock(), pthread_mutex_unlock(), pthread_mutex_destroy()) instead.
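A minimal sketch of that arrangement follows: one statically initialized mutex that outlives every object and guards both the shared pointer and the refcount. The names object, registry_lock, object_new, object_retain and object_release are invented for this example, and callers are assumed to always go through the shared slot rather than keeping raw copies of the pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical refcounted object; the lock is NOT stored inside it. */
typedef struct {
    int rc;
    int payload;
} object;

/* One lock for all retain/release operations. It is never destroyed
   while any object may still be in use. */
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

object *object_new(int payload) {
    object *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->rc = 1;
    o->payload = payload;
    return o;
}

/* Takes a reference through a shared slot.
   Returns NULL if the object has already been freed. */
object *object_retain(object **slot) {
    pthread_mutex_lock(&registry_lock);
    object *o = *slot;
    if (o != NULL) o->rc++;
    pthread_mutex_unlock(&registry_lock);
    return o;
}

/* Drops one reference. When the count reaches zero, the slot is cleared
   and the object is freed while the lock is still held, so no other
   thread can observe a dangling pointer. Returns 1 if the object was freed. */
int object_release(object **slot) {
    int freed = 0;
    pthread_mutex_lock(&registry_lock);
    object *o = *slot;
    if (o != NULL) {
        assert(o->rc > 0);
        if (--o->rc == 0) {
            *slot = NULL;
            free(o);
            freed = 1;
        }
    }
    pthread_mutex_unlock(&registry_lock);
    return freed;
}
```

The NULL check on the slot happens under the same lock as the free, which is what the unlocked "if(c == NULL) break;" test in the question's reader and writer loops cannot guarantee.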