Using Semaphors to create a thread safe stack in C? - c

I'm trying to make a stack that I implemented thread safe using semaphors. It works when I push a single object onto the stack, but terminal freezes up as soon as I try to push a second item onto the stack or pop an item off of the stack. This is what I have so far and am not sure where I'm messing up. Everything complies right, but the terminal just freezes as previously stated
Heres where I create the stack
sem_t selements, sspace;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
BlockingStack *new_BlockingStack(int max_size)
{
sem_init(&selements, 0, 0);
sem_init(&sspace, 0, max_size);
BlockingStack *newBlockingStack = malloc(sizeof(BlockingStack));
newBlockingStack->maxSize = max_size;
newBlockingStack->stackTop = -1;
newBlockingStack->element = malloc(max_size * sizeof(void *));
if (newBlockingStack == NULL)
{
return NULL;
}
if (newBlockingStack->element == NULL)
{
free(newBlockingStack);
return NULL;
}
return newBlockingStack;
}
And here are the Push and Pop:
bool BlockingStack_push(BlockingStack *this, void *element)
{
sem_wait(&sspace);
pthread_mutex_lock(&m);
if (this->stackTop == this->maxSize - 1)
{
return false;
}
if (element == NULL)
{
return false;
}
this->element[++this->stackTop] = element;
return true;
pthread_mutex_unlock(&m);
sem_post(&selements);
}
void *BlockingStack_pop(BlockingStack *this)
{
sem_wait(&selements);
pthread_mutex_lock(&m);
if (this->stackTop == -1)
{
return NULL;
}
else
{
return this->element[this->stackTop--];
}
pthread_mutex_unlock(&m);
sem_post(&sspace);
}

SUGGESTED CHANGES:
sem_t sem;
...
BlockingStack *new_BlockingStack(int max_size)
{
sem_init(&sem, 0, 1);
...
bool BlockingStack_push(BlockingStack *this, void *element)
{
sem_wait(&sem);
...
sem_post(&sem);
...
Specifically:
I would only initialize one semaphore object unless I was SURE I needed others
I would use the same semaphore for push() and pop()
pshared: 0 should be sufficent for synchronizing different pthreads inside your single process.
Initialize the semaphore to 1, because the first thing you'll do for either "push" or "pop" is sem_wait().

For thread safety you already have mutex used (pthread_mutex_lock(&m) and pthread_mutex_unlock(&m)). Using such mutual exclusion is enough for that purpose. Once one thread obtains the mutex, other thread blocks on pthread_mutex_lock(&m) call.
And only the thread currently obtaining the mutex can call pthread_mutex_unlock(&m).

OK, i was working on this and finally cracked the answer after doing a little bit of internet research and debugging my code. The error was the the mutex_unlock and the sem_post had to come before the return.
Take my pop for example:
void *BlockingStack_pop(BlockingStack *this)
{
sem_wait(&selements);
pthread_mutex_lock(&m);
if (this->stackTop == -1)
{
return NULL;
}
else
{
return this->element[this->stackTop--];
}
pthread_mutex_unlock(&m);
sem_post(&sspace);
}
notice how the pthread_mutex_unlock(&m); and the sem_post(&sspace); come after the return, they actually must be placed before every return like so:
void *BlockingStack_pop(BlockingStack *this)
{
...
pthread_mutex_unlock(&m);
sem_post(&sspace);
return NULL;
...
pthread_mutex_unlock(&m);
sem_post(&sspace);
return this->element[this->stackTop--];
...
}

Related

Keeping track of all threads in a thread pool

I am looking at using the Windows Threading API and the issue it seems to have is you cannot keep track of when all the threads are completed. You can keep track of when the work item has been completed, assuming you kept track of each one. From my research there is no direct way to query the thread pool to see if the work items submitted has all be completed.
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
VOID CALLBACK MyWorkCallback(PTP_CALLBACK_INSTANCE Instance, PVOID Parameter, PTP_WORK Work) {
DWORD threadId = GetCurrentThreadId();
BOOL bRet = FALSE;
printf("%d thread\n", threadId);
return;
}
int main() {
TP_CALLBACK_ENVIRON CallBackEnviron;
PTP_POOL pool = NULL;
PTP_CLEANUP_GROUP cleanupgroup = NULL;
PTP_WORK_CALLBACK workcallback = MyWorkCallback;
PTP_TIMER timer = NULL;
PTP_WORK work = NULL;
InitializeThreadpoolEnvironment(&CallBackEnviron);
pool = CreateThreadpool(NULL);
SetThreadpoolThreadMaximum(pool, 1);
SetThreadpoolThreadMinimum(pool, 3);
SetThreadpoolCallbackPool(&CallBackEnviron, pool);
for (int i = 0; i < 10; ++i) {
work = CreateThreadpoolWork(workcallback, NULL, &CallBackEnviron);
SubmitThreadpoolWork(work);
WaitForThreadpoolWorkCallbacks(work, FALSE); // This waits for the work item to get completed.
}
return 1;
}
Here is a simple example. What happens is on the WaitForThreadpoolWorkCallbacks I am able to wait on that specific work item. Which is no problem if I am doing a few things. However, if I am traversing a directory and have thousands of files that I need to have work done on them, I don't want to keep track of each individual work item. Is it possible to query the Thread Pool queue to see if anything is left for processing? Or to find out if any of the threads are still working?
you need keep track of active tasks ( like pendcnt in comment) +1. but this must not be global variable, but member in some struct. and pass pointer to this struct to work item. increment this counter before call SubmitThreadpoolWork and decrement from callback, before exit. but you also need and event - set this event in signal state, when counter became 0. and wait on event from main thread. if your code inside dll, which can be unloaded - you need also reference dll, before SubmitThreadpoolWork and FreeLibraryWhenCallbackReturns from callback. also important that counter value - was 1 (not 0) ininitally - so this is count_of_active_cb + 1, and decrement it before begin wait (if not do this - counter can became 0 early - for instance first callback exit before you activate second)
class Task
{
HANDLE _hEvent = 0;
ULONG _dwThreadId = 0;
LONG _dwRefCount = 1;
public:
~Task()
{
if (_hEvent) CloseHandle(_hEvent);
}
ULONG Init()
{
if (HANDLE hEvent = CreateEvent(0, 0, 0, 0))
{
_hEvent = hEvent;
return NOERROR;
}
return GetLastError();
}
void AddTask()
{
InterlockedIncrementNoFence(&_dwRefCount);
}
void EndTask()
{
if (!InterlockedDecrement(&_dwRefCount))
{
if (_dwThreadId != GetCurrentThreadId())
{
if (!SetEvent(_hEvent)) __debugbreak();
}
}
}
void Wait()
{
_dwThreadId = GetCurrentThreadId();
EndTask();
if (_dwRefCount && WaitForSingleObject(_hEvent, INFINITE) != WAIT_OBJECT_0) __debugbreak();
}
};
VOID CALLBACK MyWorkCallback(PTP_CALLBACK_INSTANCE Instance, PVOID Parameter, PTP_WORK /*Work*/)
{
// need only if your code in dll which can be unloaded
FreeLibraryWhenCallbackReturns(Instance, (HMODULE)&__ImageBase);
WCHAR sz[32];
swprintf_s(sz, _countof(sz), L"[%x] thread", GetCurrentThreadId());
MessageBoxW(0, 0, sz, MB_ICONINFORMATION);
reinterpret_cast<Task*>(Parameter)->EndTask();
}
void CbDemo()
{
Task task;
if (task.Init() == NOERROR)
{
ULONG n = 2;
do
{
if (PTP_WORK pwk = CreateThreadpoolWork(MyWorkCallback, &task, 0))
{
HMODULE hmod;
// need only if your code in dll which can be unloaded
if (GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS, (PWSTR)&__ImageBase, &hmod))
{
task.AddTask();
SubmitThreadpoolWork(pwk);
}
CloseThreadpoolWork(pwk);
}
} while (--n);
MessageBoxW(0, 0, L"Main Thread", MB_ICONWARNING);
task.Wait();
__nop();
}
}

Implementing locks

Some doubts when reading the operating system material of implementing locks
struct lock {
int locked;
struct queue q;
int sync; /* Normally 0. */
};
void lock_acquire(struct lock *l) {
intr_disable();
while (swap(&l->sync, 1) != 0) {
/* Do nothing */
}
if (!l->locked) {
l->locked = 1;
l->sync = 0;
} else {
queue_add(&l->q, thread_current());
thread_block(&l->sync);
}
intr_enable();
}
void lock_release(struct lock *l) {
intr_disable();
while (swap(&l->sync, 1) != 0) {
/* Do nothing */
}
if (queue_empty(&l->q) {
l->locked = 0;
} else {
thread_unblock(queue_remove(&l->q));
}
l->sync = 0;
intr_enable();
}
What is the purpose of sync?
My gut feeling is that the solutions are all broken. For a lock working correctly the lock_acquire needs to have acquire semantics and the lock_release needs to have release semantics. This way the loads/stores inside the critical section can't move outside of the critical section + you have a happens before edge between a lock release and a subsequent lock acquire on the same lock.
If you take a look at the spinning version:
struct lock {
int locked;
};
void lock_acquire(struct lock *l) {
while (swap(&l->locked, 1)) {
/* Do nothing */
}
}
void lock_release(struct lock *l) {
l->locked = 0;
}
The assignment of locked=0 is just an ordinary store. This means that it can be reordered with other loads and stores before it + it doesn't provide a happens before edge.
It seems to me that 'sync' is a way for the thread to let the OS know that the lock is in use since bot "lock" and "unlock" waits for 'sync" value to be changed before proceeding.
(It's a bit peculiar that interrupts are disabled before checking the 'sync' value)

Thread pool - handle a case when there are more tasks than threads

I'm just entered multithreaded programming and as part of an exercise trying to implement a simple thread pool using pthreads.
I have tried to use conditional variable to signal working threads that there are jobs waiting within the queue. But for a reason I can't figure out the mechanism is not working.
Bellow are the relevant code snippets:
typedef struct thread_pool_task
{
void (*computeFunc)(void *);
void *param;
} ThreadPoolTask;
typedef enum thread_pool_state
{
RUNNING = 0,
SOFT_SHUTDOWN = 1,
HARD_SHUTDOWN = 2
} ThreadPoolState;
typedef struct thread_pool
{
ThreadPoolState poolState;
unsigned int poolSize;
unsigned int queueSize;
OSQueue* poolQueue;
pthread_t* threads;
pthread_mutex_t q_mtx;
pthread_cond_t q_cnd;
} ThreadPool;
static void* threadPoolThread(void* threadPool){
ThreadPool* pool = (ThreadPool*)(threadPool);
for(;;)
{
/* Lock must be taken to wait on conditional variable */
pthread_mutex_lock(&(pool->q_mtx));
/* Wait on condition variable, check for spurious wakeups.
When returning from pthread_cond_wait(), we own the lock. */
while( (pool->queueSize == 0) && (pool->poolState == RUNNING) )
{
pthread_cond_wait(&(pool->q_cnd), &(pool->q_mtx));
}
printf("Queue size: %d\n", pool->queueSize);
/* --- */
if (pool->poolState != RUNNING){
break;
}
/* Grab our task */
ThreadPoolTask* task = osDequeue(pool->poolQueue);
pool->queueSize--;
/* Unlock */
pthread_mutex_unlock(&(pool->q_mtx));
/* Get to work */
(*(task->computeFunc))(task->param);
free(task);
}
pthread_mutex_unlock(&(pool->q_mtx));
pthread_exit(NULL);
return(NULL);
}
ThreadPool* tpCreate(int numOfThreads)
{
ThreadPool* threadPool = malloc(sizeof(ThreadPool));
if(threadPool == NULL) return NULL;
/* Initialize */
threadPool->poolState = RUNNING;
threadPool->poolSize = numOfThreads;
threadPool->queueSize = 0;
/* Allocate OSQueue and threads */
threadPool->poolQueue = osCreateQueue();
if (threadPool->poolQueue == NULL)
{
}
threadPool->threads = malloc(sizeof(pthread_t) * numOfThreads);
if (threadPool->threads == NULL)
{
}
/* Initialize mutex and conditional variable */
pthread_mutex_init(&(threadPool->q_mtx), NULL);
pthread_cond_init(&(threadPool->q_cnd), NULL);
/* Start worker threads */
for(int i = 0; i < threadPool->poolSize; i++)
{
pthread_create(&(threadPool->threads[i]), NULL, threadPoolThread, threadPool);
}
return threadPool;
}
int tpInsertTask(ThreadPool* threadPool, void (*computeFunc) (void *), void* param)
{
if(threadPool == NULL || computeFunc == NULL) {
return -1;
}
/* Check state and create ThreadPoolTask */
if (threadPool->poolState != RUNNING) return -1;
ThreadPoolTask* newTask = malloc(sizeof(ThreadPoolTask));
if (newTask == NULL) return -1;
newTask->computeFunc = computeFunc;
newTask->param = param;
/* Add task to queue */
pthread_mutex_lock(&(threadPool->q_mtx));
osEnqueue(threadPool->poolQueue, newTask);
threadPool->queueSize++;
pthread_cond_signal(&(threadPool->q_cnd));
pthread_mutex_unlock(&threadPool->q_mtx);
return 0;
}
The problem is that when I create a pool with 1 thread and add a lot of jobs to it, it does not executes all the jobs.
[EDIT:]
I have tried running the following code to test basic functionality:
void hello (void* a)
{
int i = *((int*)a);
printf("hello: %d\n", i);
}
void test_thread_pool_sanity()
{
int i;
ThreadPool* tp = tpCreate(1);
for(i=0; i<10; ++i)
{
tpInsertTask(tp,hello,(void*)(&i));
}
}
I expected to have input in like the following:
hello: 0
hello: 1
hello: 2
hello: 3
hello: 4
hello: 5
hello: 6
hello: 7
hello: 8
hello: 9
Instead, sometime i get the following output:
Queue size: 9 //printf added for debugging within threadPoolThread
hello: 9
Queue size: 9 //printf added for debugging within threadPoolThread
hello: 0
And sometimes I don't get any output at all.
What is the thing I'm missing?
When you call tpInsertTask(tp,hello,(void*)(&i)); you are passing the address of i which is on the stack. There are multiple problems with this:
Every thread is getting the same address. I am guessing the hello function takes that address and prints out *param which all point to the same location on the stack.
Since i is on the stack once test_thread_pool_sanity returns the last value is lost and will be overwritten by other code so the value is undefined.
Depending on then the worker thread works through the tasks versus when your main test thread schedules the tasks you will get different results.
You need the parameter passed to be saved as part of the task in order to guarantee it is unique per task.
EDIT: You should also check the return code of pthread_create to see if it is failing.

Mutual exclusion isn't exclusive

I have the following code which runs in 2 threads started by an init call from the main thread. One for writing to a device, one for reading. My app is called by other threads to add items to the queues. pop_queue handles all locking, as does push_queue. Whenever I modify a req r, I lock it's mutex. q->process is a function pointer to one of either write_sector, read_setor. I need to guard against simultaneous calls to the two function pointers, so I'm using a mutex on the actual process call, however this is not working.
According to the text program, I am making parallel calls to the process functions. How is that possible given I lock immediatly before and unlock immediately afterwards?
The following error from valgrind --tool=helgrind might help?
==3850== Possible data race during read of size 4 at 0xbea57efc by thread #2
==3850== at 0x804A290: request_handler (diskdriver.c:239)
Line 239 is r->state = q->process(*device, &r->sd) +1
void *
request_handler(void *arg)
{
req *r;
queue *q = arg;
int writing = !strcmp(q->name, "write");
for(;;) {
/*
* wait for a request
*/
pop_queue(q, &r, TRUE);
/*
* handle request
* req r is unattached to any lists, but must lock it's properties incase being redeemed
*/
printf("Info: driver: (%s) handling req %d\n", q->name, r->id);
pthread_mutex_lock(&r->lock);
pthread_mutex_lock(&q->processing);
r->state = q->process(*device, &r->sd) +1;
pthread_mutex_unlock(&q->processing);
/*
* if writing, return the SectorDescriptor
*/
if (writing) {
printf("Info: driver (write thread) has released a sector descriptor.\n");
blocking_put_sd(*sd_store, r->sd);
r->sd = NULL;
}
pthread_mutex_unlock(&r->lock);
pthread_cond_signal(&r->changed);
}
}
EDIT
Here is the one other location where the req's properties are read
int redeem_voucher(Voucher v, SectorDescriptor *sd)
{
int result;
if (v == NULL){
printf("Driver: null voucher redeemed!\n");
return 0;
}
req *r = v;
pthread_mutex_lock(&r->lock);
/* if state = 0 job still running/queued */
while(r->state==0) {
printf("Driver: blocking for req %d to finish\n", r->id);
pthread_cond_wait(&r->changed, &r->lock);
}
sd = &r->sd;
result = r->state-1;
r->sd = NULL;
r->state = WAIT;
//printf("Driver: req %d completed\n", r->id);
pthread_mutex_unlock(&r->lock);
/*
* return req to pool
*/
push_queue(&pool_q, r);
return result;
}
EDIT 2
here's the push_ and pop_queue functions
int
pop_queue(struct queue *q, req **r, int block)
{
pthread_mutex_lock(&q->lock);
while(q->head == NULL) {
if(block) {
pthread_cond_wait(&q->wait, &q->lock);
}
else {
pthread_mutex_unlock(&q->lock);
return FALSE;
}
}
req *got = q->head;
q->head = got->next;
got->next = NULL;
if(!q->head) {
/* just removed last element */
q->tail = q->head;
}
*r = got;
pthread_mutex_unlock(&q->lock);
return TRUE;
}
/*
* perform a standard linked list insertion to the queue specified
* handles all required locking and signals any listeners
* return: int - if insertion was successful
*/
int
push_queue(queue *q, req *r)
{
/*
* push never blocks,
*/
if(!r || !q)
return FALSE;
pthread_mutex_lock(&q->lock);
if(q->tail) {
q->tail->next = r;
q->tail = r;
}
else {
/* was an empty queue */
q->tail = q->head = r;
}
pthread_mutex_unlock(&q->lock);
pthread_cond_signal(&q->wait);
return TRUE;
}
Based on the available information, it seems that a likely possibility then is that another thread is modifying the data pointed to by *device. Perhaps it is being modified while the q->processing mutex is not held.
Your line
pthread_cond_signal(&r->changed);
let me suspect that you have other code that is also manipulating the structure pointed to by r. In any case it makes not much sense if you have nobody waiting for that condition variable. (And you should invert the unlock and signal lines.)
So, probably your error is just somewhere else, where you access r simultaneaously without taking the lock on the mutex. You didn't show us the rest of your code, so saying more would be even more guess work.

Deallocating memory in multi-threaded environment

I'm having a hard time figuring out how to manage deallocation of memory in multithreaded environments. Specifically what I'm having a hard time with is using a lock to protect a structure, but when it's time to free the structure, you have to unlock the lock to destroy the lock itself. Which will cause problems if a separate thread is waiting on that same lock that you need to destroy.
I'm trying to come up with a mechanism that has retain counts, and when the object's retain count is 0, it's all freed. I've been trying a number of different things but just can't get it right. As I've been doing this it seems like you can't put the locking mechanism inside of the structure that you need to be able to free and destroy, because that requires you unlock the the lock inside of it, which could allow another thread to proceed if it was blocked in a lock request for that same structure. Which would mean that something undefined is guaranteed to happen - the lock was destroyed, and deallocated so either you get memory access errors, or you lock on undefined behavior..
Would someone mind looking at my code? I was able to put together a sandboxed example that demonstrates what I'm trying without a bunch of files.
http://pastebin.com/SJC86GDp
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
struct xatom {
short rc;
pthread_rwlock_t * rwlck;
};
typedef struct xatom xatom;
struct container {
xatom * atom;
};
typedef struct container container;
#define nr 1
#define nw 2
pthread_t readers[nr];
pthread_t writers[nw];
container * c;
void retain(container * cont);
void release(container ** cont);
short retain_count(container * cont);
void * rth(void * arg) {
short rc;
while(1) {
if(c == NULL) break;
rc = retain_count(c);
}
printf("rth exit!\n");
return NULL;
}
void * wth(void * arg) {
while(1) {
if(c == NULL) break;
release((container **)&c);
}
printf("wth exit!\n");
return NULL;
}
short retain_count(container * cont) {
short rc = 1;
pthread_rwlock_rdlock(cont->atom->rwlck);
printf("got rdlock in retain_count\n");
rc = cont->atom->rc;
pthread_rwlock_unlock(cont->atom->rwlck);
return rc;
}
void retain(container * cont) {
pthread_rwlock_wrlock(cont->atom->rwlck);
printf("got retain write lock\n");
cont->atom->rc++;
pthread_rwlock_unlock(cont->atom->rwlck);
}
void release(container ** cont) {
if(!cont || !(*cont)) return;
container * tmp = *cont;
pthread_rwlock_t ** lock = (pthread_rwlock_t **)&(*cont)->atom->rwlck;
pthread_rwlock_wrlock(*lock);
printf("got release write lock\n");
if(!tmp) {
printf("return 2\n");
pthread_rwlock_unlock(*lock);
if(*lock) {
printf("destroying lock 1\n");
pthread_rwlock_destroy(*lock);
*lock = NULL;
}
return;
}
tmp->atom->rc--;
if(tmp->atom->rc == 0) {
printf("deallocating!\n");
*cont = NULL;
pthread_rwlock_unlock(*lock);
if(pthread_rwlock_trywrlock(*lock) == 0) {
printf("destroying lock 2\n");
pthread_rwlock_destroy(*lock);
*lock = NULL;
}
free(tmp->atom->rwlck);
free(tmp->atom);
free(tmp);
} else {
pthread_rwlock_unlock(*lock);
}
}
container * new_container() {
container * cont = malloc(sizeof(container));
cont->atom = malloc(sizeof(xatom));
cont->atom->rwlck = malloc(sizeof(pthread_rwlock_t));
pthread_rwlock_init(cont->atom->rwlck,NULL);
cont->atom->rc = 1;
return cont;
}
int main(int argc, char ** argv) {
c = new_container();
int i = 0;
int l = 4;
for(i=0;i<l;i++) retain(c);
for(i=0;i<nr;i++) pthread_create(&readers[i],NULL,&rth,NULL);
for(i=0;i<nw;i++) pthread_create(&writers[i],NULL,&wth,NULL);
sleep(2);
for(i=0;i<nr;i++) pthread_join(readers[i],NULL);
for(i=0;i<nw;i++) pthread_join(writers[i],NULL);
return 0;
}
Thanks for any help!
Yes, you can't put the key inside the safe. Your approach with refcount (create object when requested and doesn't exist, delete on last release) is correct. But the lock must exist at least a moment before object is created and after it is destroyed - that is, while it is used. You can't delete it from inside of itself.
OTOH, you don't need countless locks, like one for each object you create. One lock that excludes obtaining and releasing of all objects will not create much performance loss at all. So just create the lock on init and destroy on program end. Otaining/releasing an object should take short enough that lock on variable A blocking access to unrelated variable B should almost never happen. If it happens - you can still introduce one lock per all rarely obtained variables and one per each frequently obtained one.
Also, there seems to be no point for rwlock, plain mutex suffices, and the create/destroy operations MUST exclude each other, not just parallel instances of themselves - so use pthread_create_mutex() family instead.

Resources