I've got to write a program that counts series of first 10 terms (sorry for my language, this is the first time that I'm talking about math in english) given by formula (x^i)/i!. So, basically it's trivial. BUT, there's some special requirements. Every single term got to be counted by seperated thread, each of them working concurrent. Then all of them got to save results to common variable named result. After that they have to be added by main thread, which will display final result. All of it using pthreads and mutexes.
That's where I have a problem. I was thinking about using table to store results, but I was told by teacher, that it's not correct solution, cause then I don't have to use mutexes. Any ideas what to do and how to synchronize it? I'm completely new to pthread and mutex.
Here's what I got till now. I'm still working on it, so it's not working at the moment, it's just a scheme of a program, where I want to add mutexes. I hope it's not all wrong. ;p
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <pthread.h>
int number = 0;
float result = 0;
pthread_mutex_t term_lock;
pthread_mutex_t main_lock;
int save = 0; //condition variable
int factorial(int x) {
if(x==0 || x==1)
return 1;
return factorial(x-1)*x;
}
void *term(void *value) {
int x = *(int *)value;
float w;
if(save == 0) {
pthread_mutex_lock(&term_lock);
w = pow(x, number)/factorial(number);
result = w;
printf("%d term of series with x: %d is: %f\n", number, x, w);
number++;
save = 1;
pthread_mutex_unlock(&term_lock);
}
return NULL;
}
int main(void) {
int x, i, err = 0;
float final = 0;
pthread_t threads[10];
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
printf("Get X: \n");
scanf("%d", &x);
for(i=0; i<10; i++)
{
err = pthread_create(&threads[i], &attr, (void *)term, &x);
if(err) {
printf("Error creating threads.\n");
exit(-1);
}
}
i = 0;
while (number <= 10) {
//printf("While Result: %f, final %f\n", result, final); - shows that it's infinite loop
if(save) {
pthread_mutex_lock(&main_lock);
final = final + result;
save = 0;
pthread_mutex_unlock(&main_lock);
printf("If Result: %f, final %f\n", result, final); //final == last result
}
}
return 0;
}
EDIT: If it's not clear - I need help with solution how to store results of all threads in common variable and synchronizing it.
EDIT2: Possible solution - global variable result shared by all threads. Returned to main thread, it would be added to some local variable, so then I could just overwrite it's value with result from another thread. Of course it will require some synchronization, so another thread won't overwrite it before I add it in main thread. What do you think?
EDIT3: I've updated code with what I have right now. Output is giving me me values of 8-9 terms (printf in term), then program is still working, showing nothing. Commented printf showed me, that while loop is infinite. Also local variable final has just last value of result. What am I doing wrong?
It's rather contrived that the main thread should be the one to add the terms, but the individual threads must all write their results to the same variable. I would ordinarily expect each thread to add its own term to the result (which does require mutex), or possibly to put its result in an array (as you suggested), or to add it to a shared queue (which would require mutex), or even to write it to a pipe. Nevertheless, it can be done your teacher's way.
One of the key problems to solve is that you have to distinctly different operations that you need to synchronize:
The various computational threads' writes to the shared result variable
The main thread's reads of the result variable
You cannot use just a single synchronization construct because you cannot that way distinguish between the computational threads and the main thread. One way to approach this would be to synchronize the computational threads' writes via a mutex, as required, and to synchronize those vs. the main thread's reads via semaphores or condition variables. You could also do it with one or more additional mutexes, but not cleanly.
Additional notes:
the result variable in which your threads deposit their terms must be a global. Threads do not have access to the local variables of the function from which they are launched.
the signature of your term() function is incorrect for a thread start function. The argument must be of type void *.
thread start functions are no different from other functions in that their local variables are accessible only for the duration of the function execution. In particular, returning a pointer to a local variable cannot do anything useful, as any attempt to later dereference such a pointer produces undefined behavior.
I'm not going to write your homework for you, but here's an approach that can work:
The main thread initializes a mutex and two semaphores, the latter with initial values zero.
The main thread launches all the computational threads. Although it's ugly, you can feed them their numeric arguments by casting those to void *, and then casting them back in the term() function (since its argument should be a void *).
The main thread then loops. At each iteration, it
waits for semaphore 1 (sem_wait())
adds the value of the global result variable to a running total
posts to semaphore 2 (sem_post())
if as many iterations have been performed as there are threads, breaks from the loop
Meanwhile, each computational thread does this:
Computes the value of the appropriate term
locks the mutex
stores the term value in the global result variable
posts to semaphore 1
waits for semaphore 2
unlocks the mutex
Update:
To use condition variables for this job, it is essential to identify which shared state is being protected by those condition variables, as one must always protect against waking spurriously from a wait on a condition variable.
In this case, it seems natural that the shared state in question would involve the global result variable in which the computational threads return their results. There are really two general, mutually exclusive states of that variable:
Ready to receive a value from a computational thread, and
Ready for the main thread to read.
The computational threads need to wait for the first state, and the main thread needs to wait (repeatedly) for the second. Since there are two different conditions that threads will need to wait on, you need two condition variables. Here's an alternative approach using these ideas:
The main thread initializes a mutex and two condition variables, and sets result to -1.
The main thread launches all the computational threads. Although it's ugly, you can feed them their numeric arguments by casting those to void *, and then casting them back in the term() function (since its argument should be a void *).
The main thread locks the mutex
The main thread then loops. At each iteration, it
tests whether result is non-negative. If so, it
adds the value of result variable to a running total
if as many terms have been added as there are threads, breaks from the loop
sets result to -1.
signals condition variable 1
waits on condition variable 2
Having broken from the loop, the main thread unlocks the mutex
Meanwhile, each computational thread does this:
Computes its term
Locks the mutex
Loops:
Checks the value of result. If it is less than zero then breaks from the loop
waits on condition variable 1
Having broken from the loop, sets result to the computed term
signals condition variable 2
unlocks the mutex
The number is shared between all the threads, so you will need to protect that with a mutex (which is probably what your teacher is wanting to see)
pthread_mutex_t number_mutex;
pthread_mutex_t result_mutex;
int number = 0;
int result = 0;
void *term(int x) {
float w;
// Critical zone, make sure only one thread updates `number`
pthread_mutex_lock(&number_mutex);
int mynumber = number++;
pthread_mutex_unlock(&number_mutex);
// end of critical zone
w = pow(x, mynumber)/factorial(mynumber);
printf("%d term of series with x: %d is: %f\n", mynumber, x, w);
// Critical zone, make sure only one thread updates `result`
pthread_mutex_lock(&result_mutex);
result += w;
pthread_mutex_unlock(&result_mutex);
// end of critical zone
return (void *)0;
}
You should also remove the DETACHED state and do a thread-join at the end of your main program before printing out the result
Here is my solution to your problem:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <pthread.h>
int number=0;
float result[10];
pthread_mutex_t lock;
int factorial(int x) {
if(x==0 || x==1)
return 1;
return factorial(x-1)*x;
}
void *term(void *value) {
int x = *(int *)value;
float w;
pthread_mutex_lock(&lock);
w = pow(x, number)/factorial(number);
printf("%d term of series with x: %d is: %f\n", number, x, w);
result[number] = w;
number++;
pthread_mutex_unlock(&lock);
return NULL;
}
int main(void) {
int x, i, err;
pthread_t threads[10];
printf("Get X: \n");
scanf("%d", &x);
for(i=0; i<=9; i++)
{
err = pthread_create(&threads[i], NULL, term, &x);
if(err) {
printf("Error creating threads.\n");
exit(-1);
}
}
for(i = 0; i < 10; i++)
{
pthread_join(threads[i], NULL);
}
i = 0;
for(i=0; i<=9; i++)
{
printf("%f\n", result[i]);
}
return 0;
}
This code creates a global mutex pthread_mutex_t lock that (in this case) makes sure that same code is not executed by anyone at the same time: basically when one thread executes pthread_mutex_lock(&lock), it forbids any other thread from executing that part of the code until the "original" thread executes pthread_mutex_unlock(&lock).
The other important part is pthread_join: what this does is force the main thread to wait for the execution of every other thread created; this way, float result[10] is written before actually being worked on in the main thread (in this case, the last print instruction).
Other than that, I fixed a couple of bugs in your code that other users pointed out.
If result is to be a single variable, then one solution is to use an array of 20 mutexes: aMutex[20];. Main locks all 20 mutexes then starts the pthreads. Each pthread[i] computes a local term, waits for aMutex[i], stores it's value into result, then unlocks aMutex[10+i]. In main() for(i = 0; i < 20; i++){ unlock aMutex[i] to allow pthread[i] to store its value into result, then wait for aMutex[10+i] to know that result is updated, then add result to a sum. }
Related
I am using two threads to update a global variable. In order to achieve mutual exclusion I am using the test_and_set function. But this code is going to deadlock at some random point during execiution.
Please help.
#include <stdio.h>
#include <pthread.h>
#include <stdatomic.h>
int a = 0;
atomic_int lock = 0;
int test_and_set(int *lock)
{
int l = *lock;
*lock = 1;
return l;
}
void *func(void * param)
{
for(int i = 0; i < 100000; i++)
{
while(test_and_set(&lock));
a++;
lock = 0;
}
}
int main()
{
pthread_t p1, p2;
pthread_create(&p1, NULL, func);
pthread_create(&p2, NULL, func);
pthread_join(p1, NULL);
pthread_join(p2, NULL);
printf("%d, %d\n", a, b);
return 0;
}
Yes, it will deadlock sooner or later.
This is how:
Thread A will "get the lock" and be ready to execute a++ and lock = 0;
Thread B takes over and executes l = *lock;. Since A has the lock l is now 1.
Thread A takes over and executes both a++ and lock = 0;. This sets the lock to zero.
Thread B takes over and executes *lock = 1; and returns l that has the value 1.
Now it's a deadlock. The value of lock is 1. Thread A return 1 from test_and_set and will stay in the while. Thread B will start reading the lock but will always get a 1. There is no thread that will set the lock to zero. Bang...
Likewise you can create a situation where both threads gets "the lock" at the same time.
Long story short: You can't implement a lock like that.
From https://en.wikipedia.org/wiki/Test-and-set:
In computer science, the test-and-set instruction is an instruction used to write 1 (set) to a memory location and return its old value as a single atomic (i.e., non-interruptible) operation.
You don't have "a single atomic operation". Therefore thread A above can execute step 3 between thread B executing step 2 and step 4.
If your processor has a test-and-set instruction, you can try with assembler code.
Note: In the "real" world, task switches do not happen aligned with C statements. A single C statement it typically realized as a number of machine instructions. A task switch can happen at any machine instruction. Still the principle is the same.
I haven't yet figured where your deadlock is occurring but this is not the proper way to implement a lock of data shared by threads. Suppose that lock is 0 and that both threads call test_and_set simultaneously. They will both see that lock is 0 and will both set it to 1. That 0 gets returned and so both threads think that they have the lock.
A mutex (short for "mutual exclusion") accomplishes what you're attempting here by ensuring that the fore-mentioned asynchronous mess can't happen.
When you call pthread_mutex_lock on a mutex, that function will not return until the calling thread has acquired the lock. That means that any other thread calling pthread_mutex_lock will block until the first thread releases the mutex by calling pthread_mutex_unlock.
Here's how you should use the mutex.
int a = 0;
pthread_mutex_t lock;
void *func(void * param)
{
for(int i = 0; i < 100000; i++)
{
pthread_mutex_lock(&lock):
a++;
pthread_mutex_unlock(&lock);
}
}
int main() {
pthread_t p1, p2;
pthread_mutex_init(&lock,NULL);
...
}
This question already has an answer here:
Pthread_create() incorrect start routine parameter passing
(1 answer)
Closed 3 years ago.
I tried to build a program which should create threads and assign a Print function to each one of them, while the main process should use printf function directly.
Firstly, I made it without any synchronization means and expected to get a randomized output.
Later I tried to add a mutex to the Print function which was assigned to the threads and expected to get a chronological output but it seems like the mutex had no effect about the output.
Should I use a mutex on the printf function in the main process as well?
Thanks in advance
My code:
#include <stdio.h>
#include <pthread.h>
#include <errno.h>
pthread_t threadID[20];
pthread_mutex_t lock;
void* Print(void* _num);
int main(void)
{
int num = 20, indx = 0, k = 0;
if (pthread_mutex_init(&lock, NULL))
{
perror("err pthread_mutex_init\n");
return errno;
}
for (; indx < num; ++indx)
{
if (pthread_create(&threadID[indx], NULL, Print, &indx))
{
perror("err pthread_create\n");
return errno;
}
}
for (; k < num; ++k)
{
printf("%d from main\n", k);
}
indx = 0;
for (; indx < num; ++indx)
{
if (pthread_join(threadID[indx], NULL))
{
perror("err pthread_join\n");
return errno;
}
}
pthread_mutex_destroy(&lock);
return 0;
}
void* Print(void* _indx)
{
pthread_mutex_lock(&lock);
printf("%d from thread\n", *(int*)_indx);
pthread_mutex_unlock(&lock);
return NULL;
}
All questions of program bugs notwithstanding, pthreads mutexes provide only mutual exclusion, not any guarantee of scheduling order. This is typical of mutex implementations. Similarly, pthread_create() only creates and starts threads; it does not make any guarantee about scheduling order, such as would justify an assumption that the threads reach the pthread_mutex_lock() call in the same order that they were created.
Overall, if you want to order thread activities based on some characteristic of the threads, then you have to manage that yourself. You need to maintain a sense of which thread's turn it is, and provide a mechanism sufficient to make a thread notice when it's turn arrives. In some circumstances, with some care, you can do this by using semaphores instead of mutexes. The more general solution, however, is to use a condition variable together with your mutex, and some shared variable that serves as to indicate who's turn it currently is.
The code passes the address of the same local variable to all threads. Meanwhile, this variable gets updated by the main thread.
Instead pass it by value cast to void*.
Fix:
pthread_create(&threadID[indx], NULL, Print, (void*)indx)
// ...
printf("%d from thread\n", (int)_indx);
Now, since there is no data shared between the threads, you can remove that mutex.
All the threads created in the for loop have different value of indx. Because of the operating system scheduler, you can never be sure which thread will run. Therefore, the values printed are in random order depending on the randomness of the scheduler. The second for-loop running in the parent thread will run immediately after creating the child threads. Again, the scheduler decides the order of what thread should run next.
Every OS should have an interrupt (at least the major operating systems have). When running the for-loop in the parent thread, an interrupt might happen and leaves the scheduler to make a decision of which thread to run. Therefore, the numbers being printed in the parent for-loop are printed randomly, because all threads run "concurrently".
Joining a thread means waiting for a thread. If you want to make sure you print all numbers in the parent for loop in chronological order, without letting child thread interrupt it, then relocate the for-loop section to be after the thread joining.
I'm writing a program that performs gaussian elimination given an A and B matrix. I first grab divisor and multipliers, create pthreads which execute in gauss function which perform their operations on a single 'column'. Then I call main which generates new divisor and multipliers and passes back for another round of operations by the same threads. Using condition pthread vars to accomplish this.
The code hangs until I create a breakpoint which it then proceeds and finishes. Not sure what's holding it up. Could use some help.
#include <stdio.h>
#include <pthread.h>
//Need one mutex variable and two condition variables (one c var for
//communicating between threads, and one c var for communicating with main).
pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
pthread_cond_t condM = PTHREAD_COND_INITIALIZER;
float arr[3][4] = {{2,-3,1, -22},{7,9,-3, 14},{6,7,2,91}};
float mults[3];
float divisor;
int num_items = 3;
void* gauss(void *mine)
{
int thread_count=0;
int x = *((int *)mine);
for(int i=0;i<num_items;i++)
{
/*do something*/
arr[i][x] = arr[i][x] / divisor;
for(int k=0;k<num_items;k++){
if(k!=i)
arr[k][x] -= mults[k] * arr[i][x];
}
/*lock || wait || signal*/
pthread_mutex_lock(&mut);
thread_count++;
if(thread_count < num_items)
pthread_cond_wait(&cond,&mut);
else
{
pthread_cond_signal(&condM);
pthread_cond_wait(&cond,&mut);
thread_count = 0;
}
pthread_mutex_unlock(&mut);
}
return NULL;
}
int main(int argc, const char * argv[]) {
int i, j;
pthread_t threadr[num_items+1]; /*thread id array */
int is[num_items+1];
printf("Test");
// /*input num items*/
// printf("input the number of items ");
// scanf("%d",&num_items);
//
// /*input A array*/
// printf("input A array\n");
// for(i=0;i<num_items;i++)
// for(j=0;j<num_items;j++)
// scanf("%f",&arr[i][j]);
//
// /*input B array*/
// printf("input B array\n");
// for(i=0;i<num_items;i++)
// scanf("%f",&arr[i][num_items]);
/*grab first divisor & multipliers*/
divisor = arr[0][0];
for(i=0;i<num_items;i++)
{
mults[i] = arr[i][0];
}
for(i=0;i<num_items+1;i++)
{
is[i]=i;
if(pthread_create(&threadr[i],NULL,gauss,(void *)&is[i]) != 0)
perror("Pthread_create fails");
}
for(i=1;i<num_items;i++)
{
pthread_mutex_lock(&mut);
pthread_cond_wait(&condM,&mut);
divisor = arr[i][i];
for(j=0;j<num_items;j++)
{
mults[j] = 1;
if(j != i)
mults[j] = arr[j][i];
}
pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mut);
}
printf("The X values are:\n");
for(i=0;i<num_items; i++) {
printf("%0.3f \n", arr[i][num_items]);
}
/*wait for all threads*/
for(i=0;i<num_items+1; i++)
if (pthread_join(threadr[i],NULL) != 0)
perror("Pthread_join fails");
return 0;
}
You have a race condition (at least one), and your code does not use its condition variables correctly. You can probably fix the former by fixing the latter. Also, I suspect that you intend gauss()'s local variable thread_count to be shared, but variables with no linkage are not shared.
First, the race condition. Consider the main thread: it starts the three other threads, and then locks the mutex and waits for condition variable condM to be signaled. But suppose the threads all manage to signal condM before the main thread starts waiting? Condition variable operations are immediate -- any signals to condM that occur before main() is waiting on it are lost.
Now let's shift gears to talk about condition variables. As the Linux manual for pthread_cond_wait() puts it:
When using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed. Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.
In other words, condition variables are used to suspend thread operations pending a given condition becoming true. In abstract terms, that condition is always "it's ok for this thread to proceed", but that's realized in context-specific terms. Most importantly, the fact that a thread wakes from its wait never inherently communicates that the condition is true; it merely indicates that the newly-woken thread should check whether the condition is true. Generally, the thread should also check before waiting for the first time, as the condition may already be true.
In pseudocode, that looks like this:
Thread 1:
lock mutex;
loop
if is_ok_to_proceed then exit loop;
wait on condition variable;
end loop
// ... maybe do mutex-protected work ...
unlock mutex
Thread 2:
lock mutex
// ... maybe do mutex-protected work ...
is_ok_to_proceed = true;
signal condition variable;
unlock mutex
Generally speaking, there is also (mutex-protected) code somewhere else that make the CV predicate false, so that sometimes the threads indeed do execute their waits.
Now consider how that applies to the race condition in main(). How does main() know whether to wait on condM()? There needs to be a shared variable somewhere that answers that for it, and its wait must be conditioned on the value of that variable. Any thread that means to allow the main thread to proceed must both set the appropriate value for the variable and signal condM. The main thread itself should set the variable, too, as needed, to indicate that it is not ready, at that time, to proceed.
Of course, your other CV usage suffers from the same kind of problem, too.
Writing my basic programs on multi threading and I m coming across several difficulties.
In the program below if I give sleep at position 1 then value of shared data being printed is always 10 while keeping sleep at position 2 the value of shared data is always 0.
Why this kind of output is coming ?
How to decide at which place we should give sleep.
Does this mean that if we are placing a sleep inside the mutex then the other thread is not being executed at all thus the shared data being 0.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include<unistd.h>
pthread_mutex_t lock;
int shared_data = 0;
void * function(void *arg)
{
int i ;
for(i =0; i < 10; i++)
{
pthread_mutex_lock(&lock);
shared_data++;
pthread_mutex_unlock(&lock);
}
pthread_exit(NULL);
}
int main()
{
pthread_t thread;
void * exit_status;
int i;
pthread_mutex_init(&lock, NULL);
i = pthread_create(&thread, NULL, function, NULL);
for(i =0; i < 10; i++)
{
sleep(1); //POSITION 1
pthread_mutex_lock(&lock);
//sleep(1); //POSITION 2
printf("Shared data value is %d\n", shared_data);
pthread_mutex_unlock(&lock);
}
pthread_join(thread, &exit_status);
pthread_mutex_destroy(&lock);
}
When you sleep before you lock the mutex, then you're giving the other thread plenty of time to change the value of the shared variable. That's why you're seeing a value of "10" with the 'sleep' in position #1.
When you grab the mutex first, you're able to lock it fast enough that you can print out the value before the other thread has a chance to modify it. The other thread sits and blocks on the pthread_mutex_lock() call until your main thread has finished sleeping and unlocked it. At that point, the second thread finally gets to run and alter the value. That's why you're seeing a value of "0" with the 'sleep' at position #2.
This is a classic case of a race condition. On a different machine, the same code might not display "0" with the sleep call at position #2. It's entirely possible that the second thread has the opportunity to alter the value of the variable once or twice before your main thread locks the mutex. A mutex can ensure that two threads don't access the same variable at the same time, but it doesn't have any control over the order in which the two threads access it.
I had a full explanation here but ended up deleting it. This is a basic synchronization problem and you should be able to trace and identify it before tackling anything more complicated.
But I'll give you a hint: It's only the sleep() in position 1 that matters; the other one inside the lock is irrelevant as long as it doesn't change the code outside the lock.
I'm writing a program that simply demonstrates writing and reading from a bounded buffer as it outputs what value it expected and what value actually was read. When I define N as a low value, the program executes as expected. However, when I increase the value, I start to see unexpected results. From what I understand, I am creating data races by using two threads.
By looking at the output, I think I've narrowed it down to about three examples of data races as follows:
One thread writes to the buffer while another thread reads.
Two threads simultaneously write to the buffer.
Two threads simultaneously read from the buffer.
Below is my code. The formatting is really strange for the #include statements, so I left them out.
#define BUFFSIZE 10000
#define N 10000
int Buffer[BUFFSIZE];
int numOccupied = 0; //# items currently in the buffer
int firstOccupied = 0; //where first/next value or item is to be found or placed
//Adds a given value to the next position in the buffer
void buffadd(int value)
{
Buffer[firstOccupied + numOccupied++] = value;
}
int buffrem()
{
numOccupied--;
return(Buffer[firstOccupied++]);
}
void *tcode1(void *empty)
{
int i;
//write N values into the buffer
for(i=0; i<N; i++)
buffadd(i);
}
void *tcode2(void *empty)
{
int i, val;
//Read N values from the buffer, checking the value read with what is expected for testing
for(i=0; i<N; i++)
{
val = buffrem();
if(val != i)
printf("tcode2: removed %d, expected %d\n", val, i);
}
}
main()
{
pthread_t tcb1, tcb2;
pthread_create(&tcb1, NULL, tcode1, NULL);
pthread_create(&tcb2, NULL, tcode2, NULL);
pthread_join(tcb1, NULL);
pthread_join(tcb2, NULL);
}
So here are my questions.
Where (in my code) do these data races occur?
How do I fix them?
Use a mutex to synchronize access to your shared data structure. You will need the following:
pthread_mutex_t mutex;
pthread_mutex_init(&mutex, NULL);
pthread_mutex_lock(&mutex);
pthread_mutex_unlock(&mutex);
pthread_mutex_destroy(&mutex);
As a basic principle, you lock the mutex before reading/writing to the data structure shared between threads, and unlock it afterwards. In this case, the Buffer plus metadata numOccupied and firstOccupied are your shared data structure you need to protect. So in buffadd() and buffrem(), lock the mutex at the beginning, unlock it at the end. And in main(), initialize the mutex before starting the threads, and destroy it after joining.
You have races because both threads access the same global vars numOccupied and firstOccupied. You need to synchronize access to the variables with some form of locking. For example, you could use a semaphore or mutex to lock access to the shared global state while doing add / remove operations.