Mutual exclusion isn't exclusive - c

I have the following code, which runs in two threads started by an init call from the main thread: one for writing to a device, one for reading. Other threads call into my app to add items to the queues. pop_queue and push_queue handle all of their own locking. Whenever I modify a req r, I lock its mutex. q->process is a function pointer to either write_sector or read_sector. I need to guard against simultaneous calls to the two functions, so I take a mutex around the actual process call, but this is not working.
According to the test program, I am making parallel calls to the process functions. How is that possible, given that I lock immediately before the call and unlock immediately afterwards?
The following error from valgrind --tool=helgrind might help:
==3850== Possible data race during read of size 4 at 0xbea57efc by thread #2
==3850== at 0x804A290: request_handler (diskdriver.c:239)
Line 239 is r->state = q->process(*device, &r->sd) +1
void *
request_handler(void *arg)
{
req *r;
queue *q = arg;
int writing = !strcmp(q->name, "write");
for(;;) {
/*
* wait for a request
*/
pop_queue(q, &r, TRUE);
/*
* handle request
* req r is detached from every list, but its fields must be locked in case it is being redeemed
*/
printf("Info: driver: (%s) handling req %d\n", q->name, r->id);
pthread_mutex_lock(&r->lock);
pthread_mutex_lock(&q->processing);
r->state = q->process(*device, &r->sd) +1;
pthread_mutex_unlock(&q->processing);
/*
* if writing, return the SectorDescriptor
*/
if (writing) {
printf("Info: driver (write thread) has released a sector descriptor.\n");
blocking_put_sd(*sd_store, r->sd);
r->sd = NULL;
}
pthread_mutex_unlock(&r->lock);
pthread_cond_signal(&r->changed);
}
}
EDIT
Here is the one other location where the req's properties are read
int redeem_voucher(Voucher v, SectorDescriptor *sd)
{
int result;
if (v == NULL){
printf("Driver: null voucher redeemed!\n");
return 0;
}
req *r = v;
pthread_mutex_lock(&r->lock);
/* if state = 0 job still running/queued */
while(r->state==0) {
printf("Driver: blocking for req %d to finish\n", r->id);
pthread_cond_wait(&r->changed, &r->lock);
}
sd = &r->sd;
result = r->state-1;
r->sd = NULL;
r->state = WAIT;
//printf("Driver: req %d completed\n", r->id);
pthread_mutex_unlock(&r->lock);
/*
* return req to pool
*/
push_queue(&pool_q, r);
return result;
}
EDIT 2
Here are the push_queue and pop_queue functions:
int
pop_queue(struct queue *q, req **r, int block)
{
pthread_mutex_lock(&q->lock);
while(q->head == NULL) {
if(block) {
pthread_cond_wait(&q->wait, &q->lock);
}
else {
pthread_mutex_unlock(&q->lock);
return FALSE;
}
}
req *got = q->head;
q->head = got->next;
got->next = NULL;
if(!q->head) {
/* just removed last element */
q->tail = q->head;
}
*r = got;
pthread_mutex_unlock(&q->lock);
return TRUE;
}
/*
* perform a standard linked list insertion to the queue specified
* handles all required locking and signals any listeners
* return: int - if insertion was successful
*/
int
push_queue(queue *q, req *r)
{
/*
* push never blocks,
*/
if(!r || !q)
return FALSE;
pthread_mutex_lock(&q->lock);
if(q->tail) {
q->tail->next = r;
q->tail = r;
}
else {
/* was an empty queue */
q->tail = q->head = r;
}
pthread_mutex_unlock(&q->lock);
pthread_cond_signal(&q->wait);
return TRUE;
}

Based on the available information, a likely explanation is that another thread is modifying the data pointed to by *device, perhaps at a point where the q->processing mutex is not held.

Your line
pthread_cond_signal(&r->changed);
makes me suspect that you have other code that also manipulates the structure pointed to by r. In any case, the signal makes little sense if nobody is waiting on that condition variable. (And you should swap the unlock and signal lines, so that you signal while still holding the mutex.)
So your error is probably somewhere else, in a place where you access r simultaneously without holding its mutex. You didn't show us the rest of your code, so saying more would be even more guesswork.
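A further possibility worth ruling out, assuming each handler thread is started with its own queue (which the q->name check in request_handler suggests): q->processing would then be a different mutex in each thread, and locking two different mutexes does not make the two process calls exclude each other. The sketch below uses a single mutex shared by both threads; device_lock, fake_process, handler and run_both are hypothetical names, not part of the posted driver.

```c
#include <pthread.h>

/* Hypothetical: ONE mutex shared by the read thread and the write thread. */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;
static int in_process = 0;  /* threads currently inside the call */
static int max_seen = 0;    /* highest value in_process ever reached */

/* Stand-in for q->process(): records whether two calls ever overlap. */
static void fake_process(void)
{
    in_process++;
    if (in_process > max_seen)
        max_seen = in_process;
    in_process--;
}

static void *handler(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&device_lock);   /* same mutex in both threads */
        fake_process();
        pthread_mutex_unlock(&device_lock);
    }
    return NULL;
}

/* Runs two handlers; returns the largest overlap observed (1 means none). */
int run_both(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, handler, NULL);
    pthread_create(&b, NULL, handler, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return max_seen;
}
```

In the posted code this would amount to replacing the per-queue q->processing with one mutex that both queues refer to, if the two queues do turn out to be separate objects.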

Related

Understanding Glib polling system for file descriptors

I'm trying to understand the GLib polling system. As I understand it, polling is a technique for watching file descriptors for events. The function os_host_main_loop_wait runs in a loop. You can see that it calls glib_pollfds_fill, qemu_poll_ns and glib_pollfds_poll. I'm trying to understand what this loop does by looking at each of these calls.
static GArray *gpollfds;
static void glib_pollfds_fill(int64_t *cur_timeout)
{
GMainContext *context = g_main_context_default();
int timeout = 0;
int64_t timeout_ns;
int n;
g_main_context_prepare(context, &max_priority);
glib_pollfds_idx = gpollfds->len;
n = glib_n_poll_fds;
do {
GPollFD *pfds;
glib_n_poll_fds = n;
g_array_set_size(gpollfds, glib_pollfds_idx + glib_n_poll_fds);
//Gets current index's address on gpollfds array
pfds = &g_array_index(gpollfds, GPollFD, glib_pollfds_idx);
//Fills gpollfds's each element (pfds) with the file descriptor to be polled
n = g_main_context_query(context, max_priority, &timeout, pfds,
glib_n_poll_fds);
//g_main_context_query returns the number of records actually stored in fds , or,
//if more than n_fds records need to be stored, the number of records that need to be stored.
} while (n != glib_n_poll_fds);
if (timeout < 0) {
timeout_ns = -1;
} else {
timeout_ns = (int64_t)timeout * (int64_t)SCALE_MS;
}
*cur_timeout = qemu_soonest_timeout(timeout_ns, *cur_timeout);
}
static void glib_pollfds_poll(void)
{
GMainContext *context = g_main_context_default();
GPollFD *pfds = &g_array_index(gpollfds, GPollFD, glib_pollfds_idx);
if (g_main_context_check(context, max_priority, pfds, glib_n_poll_fds)) {
g_main_context_dispatch(context);
}
}
static int os_host_main_loop_wait(int64_t timeout)
{
GMainContext *context = g_main_context_default();
int ret;
g_main_context_acquire(context);
glib_pollfds_fill(&timeout);
qemu_mutex_unlock_iothread();
replay_mutex_unlock();
ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout); //RESOLVES TO: g_poll(fds, nfds, qemu_timeout_ns_to_ms(timeout));
replay_mutex_lock();
qemu_mutex_lock_iothread();
glib_pollfds_poll();
g_main_context_release(context);
return ret;
}
So, as I understand it, g_poll polls the array of file descriptors with a timeout. What does that mean? I take it to mean that it waits for up to the timeout. If something happens in the meantime (for example, there is data to be read on an fd), I don't know what it does.
Then glib_pollfds_poll calls g_main_context_check and then g_main_context_dispatch.
According to GLib's documentation, g_main_context_check "passes the results of polling back to the main loop". What does that mean?
Then g_main_context_dispatch "dispatches all sources", which I also don't understand.
The entire source can be found here: https://github.com/qemu/qemu/blob/14e5526b51910efd62cd31cd95b49baca975c83f/util/main-loop.c
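As a rough illustration of what "polling with a timeout" does, here is a minimal sketch that uses plain POSIX poll() rather than GLib (g_poll is GLib's portable wrapper around the same idea). The call blocks until a descriptor has an event or the timeout expires, and reports what happened in the revents field. wait_readable is a hypothetical helper name, not a GLib or QEMU function.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Waits up to timeout_ms for fd to become readable.
 * Returns 1 if it is readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;   /* poll itself failed */
    if (n == 0)
        return 0;    /* timeout expired, no event on fd */

    /* n > 0: the kernel filled in revents with the events that occurred */
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```

Seen this way, the three steps in the loop would be: the fill step builds the pollfd array and the timeout, the poll step sleeps until an event or the timeout, and the check/dispatch step reads the revents results to decide which event sources are ready and then runs their callbacks.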

Strange deadlock in producer-consumer queue

With C's pthread library, I'm trying to implement a simple producer-consumer scheme.
The producer generates random numbers and puts them into a queue like this
typedef struct {
int q[MAX_QUEUE];
int head;
int tail;
} queue;
The consumer just takes numbers one by one and prints them to standard output. Synchronisation is done with one mutex and two condition variables: empty_queue (to suspend the consumer if the queue is empty) and full_queue (to suspend the producer if the queue is full). The problem is that both threads suspend themselves once MAX_QUEUE elements have been produced/consumed, so they end up deadlocked. I think I have done everything correctly, and I can't figure out what I'm doing wrong.
Producer:
void* producer(void* args) {
unsigned seed = time(NULL);
int random;
queue *coda = (queue *) args;
while(1) {
Pthread_mutex_lock(&queue_lock);
while(coda->head == MAX_QUEUE-1) { // Full Queue
printf("Suspending producer\n");
fflush(stdout);
Pthread_cond_wait(&full_queue, &queue_lock);
}
random = rand_r(&seed) % 21;
enqueue(coda, random);
Pthread_cond_signal(&empty_queue);
Pthread_mutex_unlock(&queue_lock);
sleep(1);
}
return NULL;
}
Consumer:
void* consumer(void* args) {
queue *coda = (queue *) args;
int elem;
while(1) {
Pthread_mutex_lock(&queue_lock);
while(coda->head == coda->tail) { // Empty Queue
printf("Suspending Consumer\n");
fflush(stdout);
Pthread_cond_wait(&empty_queue, &queue_lock);
}
elem = dequeue(coda);
printf("Found %i\n",elem);
Pthread_cond_signal(&full_queue);
Pthread_mutex_unlock(&queue_lock);
}
return NULL;
}
Enqueue/Dequeue routines
static void enqueue(queue *q, int elem) {
q->q[(q->tail)] = elem;
(q->tail)++;
if(q->tail == MAX_QUEUE)
q->tail = 0;
}
static int dequeue(queue *q) {
int elem = q->q[(q->head)];
(q->head)++;
if(q->head == MAX_QUEUE)
q->head = 0;
return elem;
}
Pthread_* are just wrapper functions to the standard pthread_* library functions.
Output (with MAX_QUEUE = 10):
Suspending Consumer
Found 16
Suspending Consumer
Found 7
Suspending Consumer
Found 5
Suspending Consumer
Found 6
Suspending Consumer
Found 17
Suspending Consumer
Found 1
Suspending Consumer
Found 12
Suspending Consumer
Found 14
Suspending Consumer
Found 11
Suspending Consumer
Suspending producer
coda->head == MAX_QUEUE-1
This does not check whether the queue is full. There are two variables that describe the state of the queue, head and tail.
coda->head == coda->tail
This properly checks that the queue is empty. Note how both variables are used in the check.
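A full check for this kind of circular buffer also has to involve both indices. One common convention, sketched below, treats the queue as full when advancing tail by one slot would make it equal to head; this sacrifices one slot so that "full" and "empty" remain distinguishable. queue_empty and queue_full are hypothetical helper names; enqueue and dequeue follow the posted routines.

```c
#include <stdbool.h>

#define MAX_QUEUE 10

typedef struct {
    int q[MAX_QUEUE];
    int head;   /* index of the next element to dequeue */
    int tail;   /* index of the next free slot */
} queue;

/* Empty: both indices meet. */
bool queue_empty(const queue *q)
{
    return q->head == q->tail;
}

/* Full: one more enqueue would make tail collide with head. */
bool queue_full(const queue *q)
{
    return (q->tail + 1) % MAX_QUEUE == q->head;
}

void enqueue(queue *q, int elem)
{
    q->q[q->tail] = elem;
    q->tail = (q->tail + 1) % MAX_QUEUE;
}

int dequeue(queue *q)
{
    int elem = q->q[q->head];
    q->head = (q->head + 1) % MAX_QUEUE;
    return elem;
}
```

With these helpers the producer's wait loop would read while (queue_full(coda)) and the consumer's would read while (queue_empty(coda)).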

Thread pool - handle a case when there are more tasks than threads

I've just started with multithreaded programming and, as an exercise, I'm trying to implement a simple thread pool using pthreads.
I have tried to use a condition variable to signal worker threads that there are jobs waiting in the queue, but for a reason I can't figure out, the mechanism is not working.
Below are the relevant code snippets:
typedef struct thread_pool_task
{
void (*computeFunc)(void *);
void *param;
} ThreadPoolTask;
typedef enum thread_pool_state
{
RUNNING = 0,
SOFT_SHUTDOWN = 1,
HARD_SHUTDOWN = 2
} ThreadPoolState;
typedef struct thread_pool
{
ThreadPoolState poolState;
unsigned int poolSize;
unsigned int queueSize;
OSQueue* poolQueue;
pthread_t* threads;
pthread_mutex_t q_mtx;
pthread_cond_t q_cnd;
} ThreadPool;
static void* threadPoolThread(void* threadPool){
ThreadPool* pool = (ThreadPool*)(threadPool);
for(;;)
{
/* Lock must be taken to wait on conditional variable */
pthread_mutex_lock(&(pool->q_mtx));
/* Wait on condition variable, check for spurious wakeups.
When returning from pthread_cond_wait(), we own the lock. */
while( (pool->queueSize == 0) && (pool->poolState == RUNNING) )
{
pthread_cond_wait(&(pool->q_cnd), &(pool->q_mtx));
}
printf("Queue size: %d\n", pool->queueSize);
/* --- */
if (pool->poolState != RUNNING){
break;
}
/* Grab our task */
ThreadPoolTask* task = osDequeue(pool->poolQueue);
pool->queueSize--;
/* Unlock */
pthread_mutex_unlock(&(pool->q_mtx));
/* Get to work */
(*(task->computeFunc))(task->param);
free(task);
}
pthread_mutex_unlock(&(pool->q_mtx));
pthread_exit(NULL);
return(NULL);
}
ThreadPool* tpCreate(int numOfThreads)
{
ThreadPool* threadPool = malloc(sizeof(ThreadPool));
if(threadPool == NULL) return NULL;
/* Initialize */
threadPool->poolState = RUNNING;
threadPool->poolSize = numOfThreads;
threadPool->queueSize = 0;
/* Allocate OSQueue and threads */
threadPool->poolQueue = osCreateQueue();
if (threadPool->poolQueue == NULL)
{
}
threadPool->threads = malloc(sizeof(pthread_t) * numOfThreads);
if (threadPool->threads == NULL)
{
}
/* Initialize mutex and conditional variable */
pthread_mutex_init(&(threadPool->q_mtx), NULL);
pthread_cond_init(&(threadPool->q_cnd), NULL);
/* Start worker threads */
for(int i = 0; i < threadPool->poolSize; i++)
{
pthread_create(&(threadPool->threads[i]), NULL, threadPoolThread, threadPool);
}
return threadPool;
}
int tpInsertTask(ThreadPool* threadPool, void (*computeFunc) (void *), void* param)
{
if(threadPool == NULL || computeFunc == NULL) {
return -1;
}
/* Check state and create ThreadPoolTask */
if (threadPool->poolState != RUNNING) return -1;
ThreadPoolTask* newTask = malloc(sizeof(ThreadPoolTask));
if (newTask == NULL) return -1;
newTask->computeFunc = computeFunc;
newTask->param = param;
/* Add task to queue */
pthread_mutex_lock(&(threadPool->q_mtx));
osEnqueue(threadPool->poolQueue, newTask);
threadPool->queueSize++;
pthread_cond_signal(&(threadPool->q_cnd));
pthread_mutex_unlock(&threadPool->q_mtx);
return 0;
}
The problem is that when I create a pool with 1 thread and add a lot of jobs to it, it does not execute all the jobs.
[EDIT:]
I have tried running the following code to test basic functionality:
void hello (void* a)
{
int i = *((int*)a);
printf("hello: %d\n", i);
}
void test_thread_pool_sanity()
{
int i;
ThreadPool* tp = tpCreate(1);
for(i=0; i<10; ++i)
{
tpInsertTask(tp,hello,(void*)(&i));
}
}
I expected to get output like the following:
hello: 0
hello: 1
hello: 2
hello: 3
hello: 4
hello: 5
hello: 6
hello: 7
hello: 8
hello: 9
Instead, I sometimes get the following output:
Queue size: 9 //printf added for debugging within threadPoolThread
hello: 9
Queue size: 9 //printf added for debugging within threadPoolThread
hello: 0
And sometimes I don't get any output at all.
What am I missing?
When you call tpInsertTask(tp,hello,(void*)(&i)); you are passing the address of i, which lives on the stack. There are multiple problems with this:
Every task gets the same address. The hello function dereferences that address, so every task reads the same location on the stack and sees whatever value i holds at that moment.
Since i is on the stack, once test_thread_pool_sanity returns the variable is gone and its storage will be overwritten by other code, so the value read is undefined.
Depending on when the worker thread works through the tasks versus when your main test thread schedules them, you will get different results.
The parameter needs to be saved as part of the task (a per-task copy) to guarantee that it is unique to each task.
EDIT: You should also check the return code of pthread_create to see whether it is failing.
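One way to give each task its own parameter, sketched under the assumption that the task function is allowed to take ownership of its argument: allocate a private copy of the value at insertion time and free it inside the task. copy_int and hello_owned are hypothetical names that do not appear in the posted pool.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns a heap copy of value, or NULL if allocation fails.
 * Ownership passes to whoever receives the pointer. */
int *copy_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}

/* Variant of hello() that owns its argument and frees it when done. */
void hello_owned(void *a)
{
    int i = *(int *)a;
    printf("hello: %d\n", i);
    free(a);
}
```

The test loop would then call tpInsertTask(tp, hello_owned, copy_int(i)) after checking the copy for NULL. The test function also needs some way to wait for the workers to finish before it returns, which the posted code does not show.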

Writing a scheduler for a Userspace thread library

I am developing a userspace preemptive thread library (fibre) that uses context switching as its base approach. For this I wrote a scheduler. However, it's not performing as expected. Any suggestions?
The thread_t structure used is:
typedef struct thread_t {
int thr_id;
int thr_usrpri;
int thr_cpupri;
int thr_totalcpu;
ucontext_t thr_context;
void * thr_stack;
int thr_stacksize;
struct thread_t *thr_next;
struct thread_t *thr_prev;
} thread_t;
The scheduling function is as follows:
void schedule(void)
{
thread_t *t1, *t2;
thread_t * newthr = NULL;
int newpri = 127;
struct itimerval tm;
ucontext_t dummy;
sigset_t sigt;
t1 = ready_q;
// Select the thread with highest priority
while (t1 != NULL)
{
if (newpri > t1->thr_usrpri + t1->thr_cpupri)
{
newpri = t1->thr_usrpri + t1->thr_cpupri;
newthr = t1;
}
t1 = t1->thr_next;
}
if (newthr == NULL)
{
if (current_thread == NULL)
{
// No more threads? (stop itimer)
tm.it_interval.tv_usec = 0;
tm.it_interval.tv_sec = 0;
tm.it_value.tv_usec = 0; // ZERO Disable
tm.it_value.tv_sec = 0;
setitimer(ITIMER_PROF, &tm, NULL);
}
return;
}
else
{
// TO DO :: Reenabling of signals must be done.
// Switch to new thread
if (current_thread != NULL)
{
t2 = current_thread;
current_thread = newthr;
timeq = 0;
sigemptyset(&sigt);
sigaddset(&sigt, SIGPROF);
sigprocmask(SIG_UNBLOCK, &sigt, NULL);
swapcontext(&(t2->thr_context), &(current_thread->thr_context));
}
else
{
// No current thread? might be terminated
current_thread = newthr;
timeq = 0;
sigemptyset(&sigt);
sigaddset(&sigt, SIGPROF);
sigprocmask(SIG_UNBLOCK, &sigt, NULL);
swapcontext(&(dummy), &(current_thread->thr_context));
}
}
}
It seems that ready_q (the head of the list of ready threads?) never changes, so the search for the highest-priority thread always finds the first suitable element. If two threads have the same priority, only the first one ever gets a chance at the CPU. There are many algorithms you can use: some are based on dynamically changing the priority, others rotate threads inside the ready queue. In your example you could remove the selected thread from its place in the ready queue and put it at the last place (it's a doubly linked list, so the operation is trivial and quite inexpensive).
Also, I'd suggest you consider the performance cost of the linear search through ready_q, since it may become a problem when the number of threads is large. In that case a more sophisticated structure may help, with separate lists of threads for the different priority levels.
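The rotation idea can be sketched as follows. This is only an illustration: thread_t is cut down to the fields the selection needs, lower values are assumed to mean higher priority (as in the posted loop), and move_to_tail and pick_next are hypothetical helper names.

```c
#include <stddef.h>

/* Reduced, hypothetical version of thread_t: only the fields used here. */
typedef struct thread_t {
    int thr_id;
    int thr_usrpri;
    int thr_cpupri;
    struct thread_t *thr_next;
    struct thread_t *thr_prev;
} thread_t;

thread_t *ready_q = NULL;   /* head of the ready list */

/* Unlinks t from ready_q and re-appends it at the tail. */
void move_to_tail(thread_t *t)
{
    if (t == NULL || t->thr_next == NULL)
        return;                        /* nothing to do, or already last */

    /* unlink t */
    if (t->thr_prev != NULL)
        t->thr_prev->thr_next = t->thr_next;
    else
        ready_q = t->thr_next;         /* t was the head */
    t->thr_next->thr_prev = t->thr_prev;

    /* walk to the current tail and append t there */
    thread_t *last = ready_q;
    while (last->thr_next != NULL)
        last = last->thr_next;
    last->thr_next = t;
    t->thr_prev = last;
    t->thr_next = NULL;
}

/* Picks the first thread with the lowest priority value, then rotates it
 * to the tail so that equal-priority threads take turns. */
thread_t *pick_next(void)
{
    thread_t *best = NULL;
    for (thread_t *t = ready_q; t != NULL; t = t->thr_next) {
        if (best == NULL ||
            t->thr_usrpri + t->thr_cpupri < best->thr_usrpri + best->thr_cpupri)
            best = t;
    }
    move_to_tail(best);
    return best;
}
```

Keeping a tail pointer next to ready_q would avoid the walk to the end of the list on every rotation.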
Bye!

Deallocating memory in multi-threaded environment

I'm having a hard time figuring out how to manage deallocation of memory in multithreaded environments. Specifically, I use a lock to protect a structure, but when it's time to free the structure, I have to unlock the lock in order to destroy the lock itself, which causes problems if a separate thread is waiting on that same lock.
I'm trying to come up with a mechanism that has retain counts, where the object is freed once its retain count reaches 0. I've tried a number of different things but just can't get it right. It seems that you can't put the locking mechanism inside the structure that you need to be able to free and destroy, because that requires unlocking the lock inside it, which could allow another thread to proceed if it was blocked in a lock request for that same structure. That would mean undefined behavior is guaranteed: the lock has been destroyed and deallocated, so you either get memory access errors or you end up locking an object that no longer exists.
Would someone mind looking at my code? I was able to put together a sandboxed example that demonstrates what I'm trying without a bunch of files.
http://pastebin.com/SJC86GDp
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
struct xatom {
short rc;
pthread_rwlock_t * rwlck;
};
typedef struct xatom xatom;
struct container {
xatom * atom;
};
typedef struct container container;
#define nr 1
#define nw 2
pthread_t readers[nr];
pthread_t writers[nw];
container * c;
void retain(container * cont);
void release(container ** cont);
short retain_count(container * cont);
void * rth(void * arg) {
short rc;
while(1) {
if(c == NULL) break;
rc = retain_count(c);
}
printf("rth exit!\n");
return NULL;
}
void * wth(void * arg) {
while(1) {
if(c == NULL) break;
release((container **)&c);
}
printf("wth exit!\n");
return NULL;
}
short retain_count(container * cont) {
short rc = 1;
pthread_rwlock_rdlock(cont->atom->rwlck);
printf("got rdlock in retain_count\n");
rc = cont->atom->rc;
pthread_rwlock_unlock(cont->atom->rwlck);
return rc;
}
void retain(container * cont) {
pthread_rwlock_wrlock(cont->atom->rwlck);
printf("got retain write lock\n");
cont->atom->rc++;
pthread_rwlock_unlock(cont->atom->rwlck);
}
void release(container ** cont) {
if(!cont || !(*cont)) return;
container * tmp = *cont;
pthread_rwlock_t ** lock = (pthread_rwlock_t **)&(*cont)->atom->rwlck;
pthread_rwlock_wrlock(*lock);
printf("got release write lock\n");
if(!tmp) {
printf("return 2\n");
pthread_rwlock_unlock(*lock);
if(*lock) {
printf("destroying lock 1\n");
pthread_rwlock_destroy(*lock);
*lock = NULL;
}
return;
}
tmp->atom->rc--;
if(tmp->atom->rc == 0) {
printf("deallocating!\n");
*cont = NULL;
pthread_rwlock_unlock(*lock);
if(pthread_rwlock_trywrlock(*lock) == 0) {
printf("destroying lock 2\n");
pthread_rwlock_destroy(*lock);
*lock = NULL;
}
free(tmp->atom->rwlck);
free(tmp->atom);
free(tmp);
} else {
pthread_rwlock_unlock(*lock);
}
}
container * new_container() {
container * cont = malloc(sizeof(container));
cont->atom = malloc(sizeof(xatom));
cont->atom->rwlck = malloc(sizeof(pthread_rwlock_t));
pthread_rwlock_init(cont->atom->rwlck,NULL);
cont->atom->rc = 1;
return cont;
}
int main(int argc, char ** argv) {
c = new_container();
int i = 0;
int l = 4;
for(i=0;i<l;i++) retain(c);
for(i=0;i<nr;i++) pthread_create(&readers[i],NULL,&rth,NULL);
for(i=0;i<nw;i++) pthread_create(&writers[i],NULL,&wth,NULL);
sleep(2);
for(i=0;i<nr;i++) pthread_join(readers[i],NULL);
for(i=0;i<nw;i++) pthread_join(writers[i],NULL);
return 0;
}
Thanks for any help!
Yes, you can't put the key inside the safe. Your approach with a refcount (create the object when it is requested and doesn't exist, delete it on the last release) is correct. But the lock must exist from at least a moment before the object is created until after it is destroyed, that is, for as long as it is used. You can't delete it from inside the object it protects.
On the other hand, you don't need countless locks, such as one for each object you create. One lock that serializes obtaining and releasing of all objects will cost very little performance. So just create the lock at init and destroy it at program end. Obtaining or releasing an object should be quick enough that a lock taken for variable A almost never blocks access to an unrelated variable B. If it does happen, you can still introduce one shared lock for all the rarely obtained variables and one lock for each frequently obtained one.
Also, there seems to be no point in using an rwlock; a plain mutex suffices, and the create/destroy operations MUST exclude each other, not just parallel instances of themselves, so use the pthread_mutex_*() family (pthread_mutex_init, pthread_mutex_lock, pthread_mutex_unlock) instead.
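A minimal sketch of the single-lock approach, assuming every thread holds its own reference for as long as it uses an object. The names rc_lock, object, object_new, object_retain and object_release are hypothetical and only illustrate the idea.

```c
#include <pthread.h>
#include <stdlib.h>

/* One process-wide lock guarding every reference count. It lives outside
 * the objects, so it is never destroyed while a thread may be waiting on it. */
static pthread_mutex_t rc_lock = PTHREAD_MUTEX_INITIALIZER;

typedef struct {
    int rc;        /* reference count, protected by rc_lock */
    int payload;   /* placeholder for the real contents */
} object;

/* Creates an object with one reference owned by the caller. */
object *object_new(int payload)
{
    object *o = malloc(sizeof *o);
    if (o != NULL) {
        o->rc = 1;
        o->payload = payload;
    }
    return o;
}

void object_retain(object *o)
{
    pthread_mutex_lock(&rc_lock);
    o->rc++;
    pthread_mutex_unlock(&rc_lock);
}

/* Drops one reference. Returns 1 if this call freed the object, else 0. */
int object_release(object *o)
{
    pthread_mutex_lock(&rc_lock);
    int last = (--o->rc == 0);
    pthread_mutex_unlock(&rc_lock);

    if (last)
        free(o);   /* no other holder remains, so nobody can touch o now */
    return last;
}
```

A shared global pointer such as c in the posted code would itself have to be read and cleared while holding the same lock; otherwise a thread can still pick up the pointer just before another thread frees the object.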

Resources