Passing threads a value from a for loop

I am attempting to create threads and pass each thread the value from a for loop. Here is the code segment:
pthread_t *threadIDs;
int i = 0;
if(impl == 1)
{
threadIDs = (pthread_t *)malloc(sizeof(pthread_t)*reduces);
for(;i < reduces; i++)
{
pthread_create(&threadIDs[i], NULL, reduce,&i);
}
}
It is not passing the correct values of the loop, which makes sense since every thread receives a pointer to the same i while the loop keeps changing it, so I have created a race condition. What is the simplest way to pass the correct value of i from my loop?
Another question: will each thread finish executing before the next one is created and called?

You've already dynamically created an array of thread IDs. Do the same for the values you want to pass in.
pthread_t *threadIDs;
int *values;
int i = 0;
if(impl == 1)
{
threadIDs = malloc(sizeof(pthread_t)*reduces);
values = malloc(sizeof(int)*reduces);
for(;i < reduces; i++)
{
values[i] = i;
pthread_create(&threadIDs[i], NULL, reduce, &values[i]);
}
}
Each thread will be working with a different array member, so there's no race condition.
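As for the second question: pthread_create() does not wait for the new thread to finish, so the threads run concurrently with the loop and have to be joined explicitly. A minimal sketch of the array approach with the joins added, assuming a hypothetical reduce_stub() in place of the real reduce() and a hypothetical wrapper run_reducers():

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for reduce(): reads its own slot and hands back
   index * index through the thread's return value. */
static void *reduce_stub(void *arg)
{
    int index = *(int *)arg;
    return (void *)(intptr_t)(index * index);
}

/* Starts `reduces` threads, each with a pointer to its own array element,
   then joins them all. Returns 0 on success, -1 on failure. */
int run_reducers(int reduces, int *results)
{
    pthread_t *threadIDs = malloc(sizeof *threadIDs * reduces);
    int *values = malloc(sizeof *values * reduces);
    int created = 0, status = 0;

    if (threadIDs == NULL || values == NULL) {
        free(threadIDs);
        free(values);
        return -1;
    }

    for (int i = 0; i < reduces; i++) {
        values[i] = i;  /* this slot is never written again */
        if (pthread_create(&threadIDs[i], NULL, reduce_stub, &values[i]) != 0) {
            status = -1;
            break;
        }
        created++;
    }

    /* Threads run concurrently; join is what waits for each one. */
    for (int i = 0; i < created; i++) {
        void *ret = NULL;
        if (pthread_join(threadIDs[i], &ret) != 0)
            status = -1;
        results[i] = (int)(intptr_t)ret;
    }

    free(values);   /* safe only after every thread has been joined */
    free(threadIDs);
    return status;
}
```

Calling run_reducers(8, results) from main() should leave results[i] equal to i * i. Note that values must stay allocated until the last join.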

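You can define a structure and store i in a member of a per-thread object.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
typedef struct Param_ {
int index;
}Param;
static void* thread(void* p) {
Param* param = p;
printf("index: %d\n", param->index);
free(param); /* each thread releases its own parameter block */
return NULL;
}
int main() {
int i = 0;
int reduces = 10;
pthread_t *threadIDs;
threadIDs = (pthread_t *)malloc(sizeof(pthread_t)*reduces);
for(; i < reduces; i++)
{
Param* p;
p = (Param*)malloc(sizeof(*p));
p->index = i;
pthread_create(&threadIDs[i], NULL, thread, p);
}
/* wait for every thread; returning from main would end the whole process */
for(i = 0; i < reduces; i++)
{
pthread_join(threadIDs[i], NULL);
}
free(threadIDs);
return 0;
}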

What is the simplest way to pass the correct value of i from my loop?
What is to be considered "simple" depends on the use case, so here is another approach to solve the issue you present:
#include <pthread.h>
pthread_mutex_t m_init;
pthread_cond_t c_init;
int init_done = 1;
void* thread_function(void * pv)
{
pthread_mutex_lock(&m_init);
size_t i = *((size_t*) pv);
init_done = 1;
pthread_cond_signal(&c_init);
pthread_mutex_unlock(&m_init);
...
}
#define THREADS_MAX (42)
int main(void)
{
pthread_t thread[THREADS_MAX];
pthread_mutex_init(&m_init, NULL);
pthread_cond_init(&c_init, NULL);
for(size_t i = 0; i < THREADS_MAX; ++i)
{
pthread_mutex_lock(&m_init);
init_done = 0;
pthread_create(&thread[i], NULL, thread_function, &i);
while (!init_done)
{
pthread_cond_wait(&c_init, &m_init);
}
pthread_mutex_unlock(&m_init);
}
...
}
(error checking omitted for the sake of legibility)
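For reference, here is a compact sketch of the same handshake with the elided parts filled in by placeholders. The seen[] array, the thread count of 8 and the wrapper name run_handshake() are assumptions made for illustration only; the point is that main does not advance i until the new thread has copied it.

```c
#include <pthread.h>
#include <stddef.h>

#define THREADS_MAX 8               /* hypothetical count for this sketch */

static pthread_mutex_t m_init = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t c_init = PTHREAD_COND_INITIALIZER;
static int init_done;
static int seen[THREADS_MAX];       /* placeholder "work": one slot per thread */

static void *thread_function(void *pv)
{
    pthread_mutex_lock(&m_init);
    size_t i = *(size_t *)pv;       /* copy while main is still waiting */
    init_done = 1;
    pthread_cond_signal(&c_init);
    pthread_mutex_unlock(&m_init);

    seen[i]++;                      /* each thread touches only its own slot */
    return NULL;
}

/* Returns 0 if all threads were started and joined, -1 otherwise. */
int run_handshake(void)
{
    pthread_t thread[THREADS_MAX];
    size_t started = 0;

    for (size_t i = 0; i < THREADS_MAX; ++i) {
        pthread_mutex_lock(&m_init);
        init_done = 0;
        if (pthread_create(&thread[i], NULL, thread_function, &i) != 0) {
            pthread_mutex_unlock(&m_init);
            break;
        }
        started++;
        /* pthread_cond_wait() releases m_init while blocked, so the new
           thread can take the lock, copy i, and signal back. */
        while (!init_done)
            pthread_cond_wait(&c_init, &m_init);
        pthread_mutex_unlock(&m_init);
    }

    for (size_t k = 0; k < started; ++k)
        pthread_join(thread[k], NULL);

    return started == THREADS_MAX ? 0 : -1;
}
```

The trade-off is that thread start-up is serialized: main blocks once per thread, which is usually fine for a handful of threads but is more machinery than the array approach.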

malloc() is returning the same address multiple times, even when I haven't used free()

EDIT: I did use free(), ignore the title.
The gist is that every time malloc() is called, the address 0x8403620
is returned, which I found out using Gdb.
tellers[i] = create_teller(0, i, NULL);
I first use malloc() on line 72 to create 3 teller structures. The first address returned, visible through Gdb, is 0x8403620. The second is
0x84033a0, the third 0x84034e0. Everything seems fine.
clients[i] = create_client(0, i, -1, -1);
Then I use malloc() on line 77 with the create_client() function to
create 100 clients. The first address, assigned to client[0], is ...
0x8403620. The same as tellers[0]. It gets worse. The next address
returned from malloc() is 0x8403620 again when i = 1, and so
on for i = 2, 3, ..., 99.
It isn't inherently the create_client() or the create_teller() functions, but
instead the malloc() function itself.
This is simply a very odd situation.
Now, I'd like to ask: Am I using malloc() wrong? Or is my version of malloc() bugged and should I somehow reinstall whatever it is? It's most likely my code since it works for creating the tellers, just not for the clients.
Here is the full code:
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <time.h>
#include <assert.h>
typedef struct teller teller_t;
typedef struct client client_t;
teller_t * create_teller (pthread_t thread_id, int id, client_t *assigned_client);
client_t * create_client (pthread_t thread_id, int id, int operation, int amount);
void * run_teller (void *arg);
void * run_client (void *arg);
/* types of operations */
#define DEPOSIT 0
#define WITHDRAW 1
#define NUM_TELLERS 3
#define NUM_CLIENTS 100
struct client {
pthread_t thread_id;
int id;
int operation;
int amount;
};
struct teller {
pthread_t thread_id;
int id;
bool available;
client_t *assigned_client;
};
client_t *clients[100];
teller_t *tellers[3];
/* only 2 tellers at a time can access */
sem_t safe;
/* only 1 teller at a time can access */
sem_t manager;
/* amount of tellers available, at most 3 */
sem_t line; /* rename to available? */
/* each teller waiting for a client to be assigned to them */
sem_t wait_for_client[3];
int
main (int argc, char **argv) {
(void) argc;
(void) argv;
srand(time(NULL));
/* This also tells us how many clients have been served */
int client_index = 0;
sem_init(&safe, 0, 2);
sem_init(&manager, 0, 1);
sem_init(&line, 0, 0);
for (int i = 0; i < 3; i++)
sem_init(&wait_for_client[i], 0, 0);
for (int i = 0; i < NUM_TELLERS; i++) {
tellers[i] = create_teller(0, i, NULL);
pthread_create(&tellers[i]->thread_id, NULL, run_teller, (void *) tellers[i]);
}
for (int i = 0; i < NUM_CLIENTS; i++) {
clients[i] = create_client(0, i, -1, -1);
pthread_create(&clients[i]->thread_id, NULL, run_client, (void *) clients[i]);
}
/* DEBUG
for (int i = 0; i < NUM_CLIENTS; i++) {
printf("client %d has id %d\n", i, clients[i]->id);
}
*/
// No threads should get past this point!!!
// ==------------------------------------==
// Should all of this below be handled by the clients instead of main?
while (1) {
if (client_index >= NUM_CLIENTS) {
// TODO:
// tell tellers that there are no more clients
// so they should close, then then close the bank.
break;
}
sem_wait(&line);
for (int i = 0; i < 3; i++) {
if (tellers[i]->available) {
int client_id = clients[client_index]->id;
//printf("client_index = %d\n", client_index); // DEBUG
tellers[i]->assigned_client = clients[client_index++];
tellers[i]->available = false;
printf(
"Client %d goes to Teller %d\n",
client_id,
tellers[i]->id
);
sem_post(&wait_for_client[i]);
break;
}
}
//sem_post(&line); // Is this needed?
}
return EXIT_SUCCESS;
}
teller_t *
create_teller (pthread_t thread_id, int id, client_t *assigned_client) {
teller_t *t = (teller_t *) malloc(sizeof(teller_t));
if (t == NULL) {
printf("ERROR: Unable to allocate teller_t.\n");
exit(EXIT_FAILURE);
}
t->thread_id = thread_id;
t->id = id;
t->available = true;
t->assigned_client = assigned_client;
return t;
}
/* TODO: Malloc returns the same address everytime, fix this */
client_t *
create_client (pthread_t thread_id, int id, int operation, int amount) {
client_t *c = malloc(sizeof(client_t));
if (c == NULL) {
printf("ERROR: Unable to allocate client_t.\n");
exit(EXIT_FAILURE);
}
c->thread_id = thread_id;
c->id = id;
c->operation = operation;
c->amount = amount;
return c;
}
void *
run_teller (void *arg) {
teller_t *t = (teller_t *) arg;
printf("Teller %d is available\n", t->id);
while (1) {
/* tell the line that a teller is available */
sem_post(&line);
/* pass when the line assignes a client to this teller */
sem_wait(&wait_for_client[t->id]);
assert(t->assigned_client != NULL);
if (t->assigned_client->operation == WITHDRAW) {
}
else {
}
}
free(arg);
pthread_cancel(t->thread_id);
return NULL;
}
void *
run_client (void *arg) {
client_t *c = (client_t *) arg;
c->operation = rand() & 1;
printf(
"Client %d waits in line to make a %s\n",
c->id,
((c->operation == DEPOSIT) ? "Deposit" : "Withdraw")
);
free(arg);
pthread_cancel(c->thread_id);
return NULL;
}

Then I use malloc() on line 77 with the create_client() function to create 100 clients.
Not exactly. You create one object, spawn a thread that manages that object (run_client()), and then repeat. But run_client() does essentially nothing except free() your client object! So malloc() is entirely right to return the same address again, as that memory is now free.
It just happens that your client threads are faster than your main thread. The problem is that you are freeing the objects from the secondary threads while leaving dangling pointers in the global pointer array. If you only use that array for debugging, nothing is actually wrong here, but if you want to use the client objects later on, you should not free them in the threads in the first place.
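One way to avoid the dangling pointers is to let main own the objects: the threads only use them, and main frees each one after joining its thread. A minimal sketch under that assumption, with a trimmed-down client_t, a placeholder run_client() body, and a hypothetical wrapper serve_clients():

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    pthread_t thread_id;
    int id;
    int operation;
} client_t;

static void *run_client(void *arg)
{
    client_t *c = arg;
    c->operation = c->id & 1;   /* placeholder work; note: no free() here */
    return NULL;
}

/* Creates n clients, joins them, and only then frees them.
   Returns the number of clients that completed, or -1 on failure. */
int serve_clients(int n)
{
    client_t **clients = calloc((size_t)n, sizeof *clients);
    int started = 0, finished = 0;

    if (clients == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        clients[i] = malloc(sizeof *clients[i]);
        if (clients[i] == NULL)
            break;
        clients[i]->id = i;
        clients[i]->operation = -1;
        if (pthread_create(&clients[i]->thread_id, NULL,
                           run_client, clients[i]) != 0) {
            free(clients[i]);
            clients[i] = NULL;
            break;
        }
        started++;
    }

    for (int i = 0; i < started; i++) {
        /* The object is still valid here because nobody freed it early. */
        if (pthread_join(clients[i]->thread_id, NULL) == 0 &&
            clients[i]->operation == (i & 1))
            finished++;
        free(clients[i]);       /* main owns the object, so main frees it */
    }

    free(clients);
    return started == n ? finished : -1;
}
```

Because all n objects are alive at the same time, malloc() has to hand out n distinct addresses. The original run_client() and run_teller() also read c->thread_id and t->thread_id after free(arg), which is a use-after-free in its own right.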

C Program exits without any output

This follows up on a previous question regarding multithreading issues. Now the issue is that the program exits without any output. The program reads its input from a text file given as a command-line argument. The file should contain only numbers separated by spaces, and if there's any other character the program should report an error, as done in the row_check function. Can anyone suggest why it would exit without any error?
#include<pthread.h>
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<ncurses.h>
const unsigned int NUM_OF_THREADS = 9;
typedef struct thread_data_s {
char *ptr;
int row_num;
} thread_data_t;
void report(const char *s,int w,int q);
void* row_check(void* data)
{
thread_data_t *my_data_ptr = data;
int j, flag;
flag=0x0000;
for(j = 0; j < 9; j++)
{
flag |= 1u << ( (my_data_ptr->ptr)[j] - 1 );
if (flag != 0x01FF){
report("row", my_data_ptr->row_num, j-1);
}
}
return NULL;
}
void report(const char *s,int w,int q)
{
printf("\nThe sudoku is INCORRECT");
printf("\nin %s. Row:%d,Column:%d",s,w+1,q+1);
getchar();
exit(0);
}
int main(int argc, char* argv[])
{
int i,j;
char arr1[9][9];
FILE *file = fopen(argv[1], "r");
if (file == 0)
{
fprintf(stderr, "failed");
exit(1);
}
int col=0,row=0;
int num;
while(fscanf(file, "%c ", &num) ==1) {
arr1[row][col] = num;
col++;
if(col ==9)
{
row++;
col = 0;
}
}
fclose(file);
int n;
thread_data_t data[NUM_OF_THREADS];
pthread_t tid;
pthread_attr_t attr;
for(n=0; n < NUM_OF_THREADS; n++)
{
data[n].ptr = &arr1[n][0];
data[n].row_num = n;
pthread_create(&tid, &attr, row_check, &data[n]);
}
for(n=0; n < NUM_OF_THREADS; n++)
{
pthread_join(tid, NULL);
}
return 0;
}

The following is one of the issues in the code, and it would explain why the application exits so soon.
The code below doesn't join all the threads it creates (so the application exits and terminates the threads before they have finished running):
thread_data_t data[NUM_OF_THREADS];
pthread_t tid;
pthread_attr_t attr;
for(n=0; n < NUM_OF_THREADS; n++)
{
data[n].ptr = &arr1[n][0];
data[n].row_num = n;
pthread_create(&tid, &attr, row_check, &data[n]);
}
for(n=0; n < NUM_OF_THREADS; n++)
{
pthread_join(tid, NULL);
}
As you can see, the code keeps the ID of only one thread (the value in tid is overwritten on every call to pthread_create()) and joins that one thread NUM_OF_THREADS times instead of joining all of them.
This might be better constructed as:
thread_data_t data[NUM_OF_THREADS];
pthread_t tid[NUM_OF_THREADS];
for(n=0; n < NUM_OF_THREADS; n++)
{
data[n].ptr = &arr1[n][0];
data[n].row_num = n;
pthread_create(tid + n, NULL, row_check, &data[n]);
}
for(n=0; n < NUM_OF_THREADS; n++)
{
pthread_join(tid[n], NULL);
}
This way the application will wait for all the threads to complete their tasks (and report any errors) before returning.
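Here is a self-contained sketch of that create-all-then-join-all pattern. It goes slightly beyond the fix above and rests on a few assumptions: the cells are stored as the characters '1' to '9', each thread records its verdict in a hypothetical ok field instead of calling exit() from inside the thread, and the bitmask is compared only after the whole row has been scanned. The wrapper name count_bad_rows() is made up for the example.

```c
#include <pthread.h>

#define NUM_OF_THREADS 9

typedef struct thread_data_s {
    const char *ptr;    /* first cell of the row */
    int row_num;
    int ok;             /* written by the thread, read by main after join */
} thread_data_t;

static void *row_check(void *data)
{
    thread_data_t *d = data;
    unsigned flag = 0;

    for (int j = 0; j < 9; j++) {
        char c = d->ptr[j];
        if (c >= '1' && c <= '9')
            flag |= 1u << (c - '1');    /* one bit per digit seen */
    }
    d->ok = (flag == 0x01FF);           /* all nine bits set: row is valid */
    return NULL;
}

/* Returns the number of invalid rows, or -1 if a thread failed to start. */
int count_bad_rows(char grid[9][9])
{
    thread_data_t data[NUM_OF_THREADS];
    pthread_t tid[NUM_OF_THREADS];
    int started = 0, bad = 0;

    for (int n = 0; n < NUM_OF_THREADS; n++) {
        data[n].ptr = &grid[n][0];
        data[n].row_num = n;
        data[n].ok = 0;
        if (pthread_create(&tid[n], NULL, row_check, &data[n]) != 0)
            break;
        started++;
    }

    /* Join every thread that was started, not just the last one. */
    for (int n = 0; n < started; n++) {
        pthread_join(tid[n], NULL);
        if (!data[n].ok)
            bad++;
    }

    return started == NUM_OF_THREADS ? bad : -1;
}
```

Reading data[n].ok after the matching pthread_join() is safe because the join orders the thread's writes before main's read.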

Can anyone suggest why it would exit without any error?
Yes.
The posted code takes no action when all 9 rows of the puzzle are correct; it just exits gracefully.
To further muddy the logic, the posted code only joins the last thread created, and when that thread exits, the program exits. That does not mean the other threads have exited.
One further serious detail: the call to pthread_create() passes the address of the attr variable, but that variable was never initialized, so it contains whatever trash was on the stack where it was declared. Since the code is not setting any specific attributes for the threads, I strongly suggest eliminating the variable and simply passing NULL as the second parameter to pthread_create().

Segmentation fault: core dumped during execution of multi-threaded program

I have realized that my code was too lengthy and rather hard to read.
Can you check over the way I pass in the arguments and construct them in the main body?
Essentially, provided that I have correct implementations of the "produce" and "consume" functions, I want to pass a shared circular queue, the semaphores, and the mutex to each produce/consume thread.
typedef struct circularQueue
{
int *items;
int *head;
int *tail;
int numProduced;
int numConsumed;
} circularQueue;
typedef struct threadArg
{
int id;
circularQueue *queue;
pthread_mutex_t *mutex;
sem_t *spaces;
sem_t *itemAvail;
int numItems;
int bufferSize;
int numProducer;
int numConsumer;
} threadArg;
pthread_t *producerThd;
pthread_t *consumerThd;
int main(int argc, char* argv[])
{
pthread_attr_t attr;
// Info to pass to thread arg
circularQueue *myQueue;
pthread_mutex_t useSharedMem;
sem_t spaces;
sem_t itemAvail;
int numItems;
int bufferSize;
int numProducer;
int numConsumer;
int i, j, k, l;
if(argc != 5)
{
printf("Enter in 4 arguments - N B P C\n");
return -1;
}
numItems = atoi(argv[1]);
bufferSize = atoi(argv[2]);
numProducer = atoi(argv[3]);
numConsumer = atoi(argv[4]);
if(numItems == 0 || bufferSize == 0 || numProducer == 0 || numConsumer == 0)
{
printf("Parameters should not be 0\n");
return -1;
}
// Initialize list of threads
producerThd = malloc(sizeof(pthread_t) * numProducer);
consumerThd = malloc(sizeof(pthread_t) * numConsumer);
// Initialize semaphores
sem_init(&spaces, 0, bufferSize);
sem_init(&itemAvail, 0, 0);
// Initialize mutex
pthread_mutex_init(&useSharedMem, NULL);
// Initialize thread attributes
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
// Initialize queue
myQueue = (circularQueue*)malloc(sizeof(circularQueue));
myQueue->items = (int*)malloc(sizeof(int)*bufferSize);
myQueue->head = myQueue->items;
myQueue->tail = myQueue->items;
myQueue->numProduced = 0;
myQueue->numConsumed = 0;
// thread arguments
for(i = 0; i < numProducer; i++)
{
// Initialize thread args
threadArg *args = (threadArg*)malloc(sizeof(threadArg));
args->queue = (circularQueue*)malloc(sizeof(circularQueue));
args->mutex = &useSharedMem;
args->spaces = &spaces;
args->itemAvail = &itemAvail;
args->numItems = numItems;
args->bufferSize = bufferSize;
args->numProducer = numProducer;
args->numConsumer = numConsumer;
args->id = i;
pthread_t thisThread = *(producerThd + i);
pthread_create(&thisThread, &attr, produce, args);
}
for(j = 0; j < numConsumer; j++)
{
// Initialize thread args
threadArg *args = (threadArg*)malloc(sizeof(threadArg));
args->queue = (circularQueue*)malloc(sizeof(circularQueue));
args->mutex = &useSharedMem;
args->spaces = &spaces;
args->itemAvail = &itemAvail;
args->numItems = numItems;
args->bufferSize = bufferSize;
args->numProducer = numProducer;
args->numConsumer = numConsumer;
args->id = j;
pthread_t thisThread = *(consumerThd + i);
pthread_create(&thisThread, &attr, consume, args);
}
for(k = 0; k < numProducer; k++)
{
pthread_join(*(producerThd+k), NULL);
}
printf("Finished waiting for producers\n");
for(l = 0; l < numConsumer; l++)
{
pthread_join(*(consumerThd+l), NULL);
}
printf("Finished waiting for consumers\n");
free(producerThd);
free(consumerThd);
free(myQueue->items);
free(myQueue);
sem_destroy(&spaces);
sem_destroy(&itemAvail);
fflush(stdout);
return 0;
}
Thank you

There are multiple sources of undefined behavior in your code. Either you are compiling without enabling compiler warnings or, what I consider worse, you are ignoring them.
You have the wrong printf() specifier in
printf("cid %d found this item %d as valid item %d\n", myArgs->id, thisItem, validItem);
because validItem is a double, so the last specifier should be %f.
Your thread functions never return a value, but you declare them to return void * which is the signature required for such functions.
You are freeing and dereferencing myQueue in the main() function, but you have not initialized it because that code is commented out.
Your code is also too hard to read because you have no consistent style and you mix declarations with statements, which make everything very confusing, e.g. determining the scope of a variable is very difficult.
Fixing the code will not only help others read it, but will also help you fix it and find issues quickly.
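Two more things in the main() shown in the question look likely to cause trouble, independent of the points above. First, pthread_create() is given the address of a local copy (thisThread), so the arrays that pthread_join() later reads are never filled in. Second, every thread receives a freshly malloc'ed, uninitialized circularQueue instead of a pointer to the one shared myQueue. A minimal sketch of the corrected argument passing, assuming a hypothetical sharedState struct in place of the queue and a trivial produce() body:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared queue plus its mutex. */
typedef struct sharedState {
    pthread_mutex_t mutex;
    int numProduced;
} sharedState;

typedef struct threadArg {
    int id;
    sharedState *shared;    /* the SAME object for every thread */
    int numItems;
} threadArg;

static void *produce(void *p)
{
    threadArg *arg = p;

    for (int n = 0; n < arg->numItems; n++) {
        pthread_mutex_lock(&arg->shared->mutex);
        arg->shared->numProduced++;
        pthread_mutex_unlock(&arg->shared->mutex);
    }
    free(arg);              /* the thread owns its argument block */
    return NULL;
}

/* Returns the total number of items produced, or -1 on failure. */
int run_producers(int numProducer, int numItems)
{
    sharedState shared;
    pthread_t *producerThd = malloc(sizeof *producerThd * numProducer);
    int started = 0, total;

    if (producerThd == NULL)
        return -1;
    pthread_mutex_init(&shared.mutex, NULL);
    shared.numProduced = 0;

    for (int i = 0; i < numProducer; i++) {
        threadArg *args = malloc(sizeof *args);
        if (args == NULL)
            break;
        args->id = i;
        args->shared = &shared;     /* share, do not allocate per thread */
        args->numItems = numItems;
        /* Store the ID directly in the array element that join reads. */
        if (pthread_create(&producerThd[i], NULL, produce, args) != 0) {
            free(args);
            break;
        }
        started++;
    }

    for (int k = 0; k < started; k++)
        pthread_join(producerThd[k], NULL);

    total = shared.numProduced;
    pthread_mutex_destroy(&shared.mutex);
    free(producerThd);
    return started == numProducer ? total : -1;
}
```

The consumer loop in the question also indexes consumerThd with i instead of j, which reads one element past the producer count once the first loop has finished.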

Multithreaded program not producing desired output

I am writing code that creates 10 threads and runs the threads with even thread IDs first, then all the threads with odd thread IDs. I'm using the POSIX threads library. Here is the code I wrote:
#include "stdlib.h"
#include "pthread.h"
#include "stdio.h"
#define TRUE 1
#define FALSE 0
int EVEN_DONE = FALSE;
int evenThreads, oddThreads = 0;
int currentThread = 0;
//the mutex for thread synchronization
static pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;
//the condition variable;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
void * printEven(unsigned long id)
{
pthread_mutex_lock(&mymutex);
evenThreads++;
printf("TID: %lu, Hello from even\n", id);
// this condition checks whether even threads have finished executing
if(evenThreads + oddThreads >= 10) {
EVEN_DONE = TRUE;
pthread_cond_broadcast(&cond);
}
pthread_mutex_unlock(&mymutex);
return NULL;
}
void * printOdd(unsigned long id)
{
pthread_mutex_lock(&mymutex);
while (!EVEN_DONE) {
oddThreads++;
pthread_cond_wait(&cond, &mymutex);
printf("TID: %lu, Hello from odd\n", id);
}
pthread_mutex_unlock(&mymutex);
return NULL;
}
void * threadFunc(void *arg)
{
unsigned long id = (unsigned long)pthread_self();
if (id % 2 == 0)
{
printEven(id);
}
else
{
printOdd(id);
}
return NULL;
}
int main()
{
pthread_t* threads;
int num_threads = 10;
int i, j;
threads = malloc(num_threads * sizeof(threads));
for ( i = 0; i < 10; i++) {
pthread_create(&threads[i], NULL, threadFunc, NULL);
}
for ( j = 0; j < 10; j++) {
pthread_join(threads[j], NULL);
}
printf("Finished executing all threads\n");
return 0;
}
However, when I run the code it doesn't produce the desired output. The output I'm getting is this:
Apparently, it seems that all the thread IDs are even numbers. However, I do think there is a problem with my code. What am I doing wrong? How can I achieve the desired output?
(Note: I'm at beginner level when it comes to POSIX threads and multithreading in general)
Thanks in advance.

There is no guarantee in POSIX that the pthread_t type returned by pthread_self() is a numeric type that can be cast to an unsigned long - it is allowed to be a structure type, for example.
If you want to write your code in a POSIX-conforming way, you will need to allocate numeric thread IDs yourself. For example, you could have:
unsigned long allocate_id(void)
{
static unsigned long next_id = 0;
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
unsigned long id;
pthread_mutex_lock(&id_lock);
id = next_id++;
pthread_mutex_unlock(&id_lock);
return id;
}
Then in your threads use:
unsigned long id = allocate_id();
Controlling the allocation of IDs yourself also allows you to control the sequence - for example in this case you can ensure that IDs are sequentially allocated so that you will have both odd and even IDs.
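To see the effect, here is a small sketch that wires allocate_id() into ten threads and counts how many even and odd IDs come out. The ids[] array and the wrapper run_id_demo() are hypothetical names added for the demonstration.

```c
#include <pthread.h>

#define NUM_THREADS 10

static unsigned long allocate_id(void)
{
    static unsigned long next_id = 0;
    static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
    unsigned long id;

    pthread_mutex_lock(&id_lock);
    id = next_id++;
    pthread_mutex_unlock(&id_lock);
    return id;
}

static unsigned long ids[NUM_THREADS];  /* one result slot per thread */

static void *threadFunc(void *arg)
{
    unsigned long *slot = arg;
    *slot = allocate_id();              /* sequential, not pthread_self() */
    return NULL;
}

/* Runs the threads and reports how many even and odd IDs were handed out.
   Returns 0 on success, -1 if a thread could not be created. */
int run_id_demo(int *evens, int *odds)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, threadFunc, &ids[i]) != 0)
            break;
        started++;
    }
    for (int j = 0; j < started; j++)
        pthread_join(threads[j], NULL);

    *evens = *odds = 0;
    for (int j = 0; j < started; j++) {
        if (ids[j] % 2 == 0)
            (*evens)++;
        else
            (*odds)++;
    }
    return started == NUM_THREADS ? 0 : -1;
}
```

On the first call the IDs 0 through 9 are each handed out exactly once, so the split is five even and five odd regardless of which thread gets which number.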

How to synchronize manager/worker pthreads without a join?

I'm familiar with multithreading and I've developed many multithreaded programs in Java and Objective-C successfully. But I couldn't achieve the following in C using pthreads without using a join from the main thread:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define NUM_OF_THREADS 2
struct thread_data {
int start;
int end;
int *arr;
};
void print(int *ints, int n);
void *processArray(void *args);
int main(int argc, const char * argv[])
{
int numOfInts = 10;
int *ints = malloc(numOfInts * sizeof(int));
for (int i = 0; i < numOfInts; i++) {
ints[i] = i;
}
print(ints, numOfInts); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
pthread_t threads[NUM_OF_THREADS];
struct thread_data thread_data[NUM_OF_THREADS];
// these vars are used to calculate the index ranges for each thread
int remainingWork = numOfInts, amountOfWork;
int startRange, endRange = -1;
for (int i = 0; i < NUM_OF_THREADS; i++) {
amountOfWork = remainingWork / (NUM_OF_THREADS - i);
startRange = endRange + 1;
endRange = startRange + amountOfWork - 1;
thread_data[i].arr = ints;
thread_data[i].start = startRange;
thread_data[i].end = endRange;
pthread_create(&threads[i], NULL, processArray, (void *)&thread_data[i]);
remainingWork -= amountOfWork;
}
// 1. Signal to the threads to start working
// 2. Wait for them to finish
print(ints, numOfInts); // should print [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
free(ints);
return 0;
}
void *processArray(void *args)
{
struct thread_data *data = (struct thread_data *)args;
int *arr = data->arr;
int start = data->start;
int end = data->end;
// 1. Wait for a signal to start from the main thread
for (int i = start; i <= end; i++) {
arr[i] = arr[i] + 1;
}
// 2. Signal to the main thread that you're done
pthread_exit(NULL);
}
void print(int *ints, int n)
{
printf("[");
for (int i = 0; i < n; i++) {
printf("%d", ints[i]);
if (i+1 != n)
printf(", ");
}
printf("]\n");
}
I would like to achieve the following in the above code:
In main():
Signal to the threads to start working.
Wait for the background threads to finish.
In processArray():
Wait for a signal to start from the main thread
Signal to the main thread that you're done
I don't want to use a join in the main thread because in the real application the main thread will create the threads once and then signal the background threads to work many times, and I can't let the main thread proceed until all the background threads have finished working. In the processArray function, I will put an infinite loop as follows:
void *processArray(void *args)
{
struct thread_data *data = (struct thread_data *)args;
while (1)
{
// 1. Wait for a signal to start from the main thread
int *arr = data->arr;
int start = data->start;
int end = data->end;
// Process
for (int i = start; i <= end; i++) {
arr[i] = arr[i] + 1;
}
// 2. Signal to the main thread that you're done
}
pthread_exit(NULL);
}
Note that I'm new to C and the posix API, so excuse me if I'm missing something obvious. But I really tried many things, starting from using a mutex, and an array of semaphores, and a mixture of both, but without success. I think a condition variable may help, but I couldn't understand how it could be used.
Thanks for your time.
Problem Solved:
Thank you guys so much! I was finally able to get this to work safely and without using a join by following your tips. Although the solution is somewhat ugly, it gets the job done and the performance gain is worth it (as you'll see below). For anyone interested, this is a simulation of the real application I'm working on, in which the main thread continuously hands work to the background threads:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define NUM_OF_THREADS 5
struct thread_data {
int id;
int start;
int end;
int *arr;
};
pthread_mutex_t currentlyIdleMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t currentlyIdleCond = PTHREAD_COND_INITIALIZER;
int currentlyIdle;
pthread_mutex_t workReadyMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t workReadyCond = PTHREAD_COND_INITIALIZER;
int workReady;
pthread_cond_t currentlyWorkingCond = PTHREAD_COND_INITIALIZER;
pthread_mutex_t currentlyWorkingMutex= PTHREAD_MUTEX_INITIALIZER;
int currentlyWorking;
pthread_mutex_t canFinishMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t canFinishCond = PTHREAD_COND_INITIALIZER;
int canFinish;
void print(int *ints, int n);
void *processArray(void *args);
int validateResult(int *ints, int num, int start);
int main(int argc, const char * argv[])
{
int numOfInts = 10;
int *ints = malloc(numOfInts * sizeof(int));
for (int i = 0; i < numOfInts; i++) {
ints[i] = i;
}
// print(ints, numOfInts);
pthread_t threads[NUM_OF_THREADS];
struct thread_data thread_data[NUM_OF_THREADS];
workReady = 0;
canFinish = 0;
currentlyIdle = 0;
currentlyWorking = 0;
// these vars are used to calculate the index ranges for each thread
int remainingWork = numOfInts, amountOfWork;
int startRange, endRange = -1;
// Create the threads and give each one its data struct.
for (int i = 0; i < NUM_OF_THREADS; i++) {
amountOfWork = remainingWork / (NUM_OF_THREADS - i);
startRange = endRange + 1;
endRange = startRange + amountOfWork - 1;
thread_data[i].id = i;
thread_data[i].arr = ints;
thread_data[i].start = startRange;
thread_data[i].end = endRange;
pthread_create(&threads[i], NULL, processArray, (void *)&thread_data[i]);
remainingWork -= amountOfWork;
}
int loops = 1111111;
int expectedStartingValue = ints[0] + loops; // used to validate the results
// The elements in ints[] should be incremented by 1 in each loop
while (loops-- != 0) {
// Make sure all of them are ready
pthread_mutex_lock(&currentlyIdleMutex);
while (currentlyIdle != NUM_OF_THREADS) {
pthread_cond_wait(&currentlyIdleCond, &currentlyIdleMutex);
}
pthread_mutex_unlock(&currentlyIdleMutex);
// All threads are now blocked; it's safe to not lock the mutex.
// Prevent them from finishing before authorized.
canFinish = 0;
// Reset the number of currentlyWorking threads
currentlyWorking = NUM_OF_THREADS;
// Signal to the threads to start
pthread_mutex_lock(&workReadyMutex);
workReady = 1;
pthread_cond_broadcast(&workReadyCond );
pthread_mutex_unlock(&workReadyMutex);
// Wait for them to finish
pthread_mutex_lock(&currentlyWorkingMutex);
while (currentlyWorking != 0) {
pthread_cond_wait(&currentlyWorkingCond, &currentlyWorkingMutex);
}
pthread_mutex_unlock(&currentlyWorkingMutex);
// The threads are now waiting for permission to finish
// Prevent them from starting again
workReady = 0;
currentlyIdle = 0;
// Allow them to finish
pthread_mutex_lock(&canFinishMutex);
canFinish = 1;
pthread_cond_broadcast(&canFinishCond);
pthread_mutex_unlock(&canFinishMutex);
}
// print(ints, numOfInts);
if (validateResult(ints, numOfInts, expectedStartingValue)) {
printf("Result correct.\n");
}
else {
printf("Result invalid.\n");
}
// clean up
for (int i = 0; i < NUM_OF_THREADS; i++) {
pthread_cancel(threads[i]);
}
free(ints);
return 0;
}
void *processArray(void *args)
{
struct thread_data *data = (struct thread_data *)args;
int *arr = data->arr;
int start = data->start;
int end = data->end;
while (1) {
// Set yourself as idle and signal to the main thread, when all threads are idle main will start
pthread_mutex_lock(&currentlyIdleMutex);
currentlyIdle++;
pthread_cond_signal(&currentlyIdleCond);
pthread_mutex_unlock(&currentlyIdleMutex);
// wait for work from main
pthread_mutex_lock(&workReadyMutex);
while (!workReady) {
pthread_cond_wait(&workReadyCond , &workReadyMutex);
}
pthread_mutex_unlock(&workReadyMutex);
// Do the work
for (int i = start; i <= end; i++) {
arr[i] = arr[i] + 1;
}
// mark yourself as finished and signal to main
pthread_mutex_lock(&currentlyWorkingMutex);
currentlyWorking--;
pthread_cond_signal(&currentlyWorkingCond);
pthread_mutex_unlock(&currentlyWorkingMutex);
// Wait for permission to finish
pthread_mutex_lock(&canFinishMutex);
while (!canFinish) {
pthread_cond_wait(&canFinishCond , &canFinishMutex);
}
pthread_mutex_unlock(&canFinishMutex);
}
pthread_exit(NULL);
}
int validateResult(int *ints, int n, int start)
{
int tmp = start;
for (int i = 0; i < n; i++, tmp++) {
if (ints[i] != tmp) {
return 0;
}
}
return 1;
}
void print(int *ints, int n)
{
printf("[");
for (int i = 0; i < n; i++) {
printf("%d", ints[i]);
if (i+1 != n)
printf(", ");
}
printf("]\n");
}
I'm not sure though if pthread_cancel is enough for clean up! As for the barrier, it would've been a great help if it weren't limited to some OSs, as mentioned by @Jeremy.
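For comparison, the same start/finish handshake can probably be done with one mutex, two condition variables and a round counter. This is a different technique from the four-flag version above, not a drop-in equivalent, and the names roundNo, pending, shuttingDown and run_rounds() are made up for this sketch. It also replaces pthread_cancel with a shutdown flag followed by a single join at the very end (not one per round).

```c
#include <pthread.h>

#define NUM_OF_THREADS 4
#define NUM_OF_INTS 10

struct thread_data {
    int start;
    int end;
    int *arr;
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t workCond = PTHREAD_COND_INITIALIZER; /* main -> workers */
static pthread_cond_t doneCond = PTHREAD_COND_INITIALIZER; /* workers -> main */
static unsigned long roundNo;   /* bumped by main to start a round */
static int pending;             /* workers still busy in this round */
static int shuttingDown;        /* tells idle workers to return */

static void *processArray(void *args)
{
    struct thread_data *data = args;
    unsigned long seen = 0;     /* last round this worker handled */

    for (;;) {
        /* 1. Wait for a new round (or shutdown) from the main thread. */
        pthread_mutex_lock(&lock);
        while (roundNo == seen && !shuttingDown)
            pthread_cond_wait(&workCond, &lock);
        if (shuttingDown) {
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        seen = roundNo;
        pthread_mutex_unlock(&lock);

        for (int i = data->start; i <= data->end; i++)
            data->arr[i] += 1;

        /* 2. Report completion; the last worker to finish wakes main. */
        pthread_mutex_lock(&lock);
        if (--pending == 0)
            pthread_cond_signal(&doneCond);
        pthread_mutex_unlock(&lock);
    }
}

/* Runs `loops` rounds over ints[0..NUM_OF_INTS-1]; returns 0 on success. */
int run_rounds(int *ints, int loops)
{
    pthread_t threads[NUM_OF_THREADS];
    struct thread_data td[NUM_OF_THREADS];
    int remaining = NUM_OF_INTS, end = -1, started = 0;

    roundNo = 0; pending = 0; shuttingDown = 0; /* no workers exist yet */

    for (int i = 0; i < NUM_OF_THREADS; i++) {
        int amount = remaining / (NUM_OF_THREADS - i);
        td[i].arr = ints;
        td[i].start = end + 1;
        td[i].end = end = td[i].start + amount - 1;
        remaining -= amount;
        if (pthread_create(&threads[i], NULL, processArray, &td[i]) != 0)
            break;
        started++;
    }

    /* Only run rounds if every worker exists, so no range is skipped. */
    for (int r = 0; started == NUM_OF_THREADS && r < loops; r++) {
        pthread_mutex_lock(&lock);
        pending = NUM_OF_THREADS;
        roundNo++;
        pthread_cond_broadcast(&workCond);
        while (pending != 0)
            pthread_cond_wait(&doneCond, &lock);
        pthread_mutex_unlock(&lock);
    }

    /* Clean shutdown: wake all idle workers, then join each one once. */
    pthread_mutex_lock(&lock);
    shuttingDown = 1;
    pthread_cond_broadcast(&workCond);
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == NUM_OF_THREADS ? 0 : -1;
}
```

Because every shared variable is read and written under the same mutex, there is no window in which main updates a flag without holding a lock. Each worker remembers the last round it handled, so it cannot run the same round twice or miss one.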
Benchmarks:
I wanted to make sure that these many conditions aren't actually slowing down the algorithm, so I've setup this benchmark to compare the two solutions:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>
#define NUM_OF_THREADS 5
struct thread_data {
int start;
int end;
int *arr;
};
pthread_mutex_t currentlyIdleMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t currentlyIdleCond = PTHREAD_COND_INITIALIZER;
int currentlyIdle;
pthread_mutex_t workReadyMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t workReadyCond = PTHREAD_COND_INITIALIZER;
int workReady;
pthread_cond_t currentlyWorkingCond = PTHREAD_COND_INITIALIZER;
pthread_mutex_t currentlyWorkingMutex= PTHREAD_MUTEX_INITIALIZER;
int currentlyWorking;
pthread_mutex_t canFinishMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t canFinishCond = PTHREAD_COND_INITIALIZER;
int canFinish;
void *processArrayMutex(void *args);
void *processArrayJoin(void *args);
double doItWithMutex(pthread_t *threads, struct thread_data *data, int loops);
double doItWithJoin(pthread_t *threads, struct thread_data *data, int loops);
int main(int argc, const char * argv[])
{
int numOfInts = 10;
int *join_ints = malloc(numOfInts * sizeof(int));
int *mutex_ints = malloc(numOfInts * sizeof(int));
for (int i = 0; i < numOfInts; i++) {
join_ints[i] = i;
mutex_ints[i] = i;
}
pthread_t join_threads[NUM_OF_THREADS];
pthread_t mutex_threads[NUM_OF_THREADS];
struct thread_data join_thread_data[NUM_OF_THREADS];
struct thread_data mutex_thread_data[NUM_OF_THREADS];
workReady = 0;
canFinish = 0;
currentlyIdle = 0;
currentlyWorking = 0;
int remainingWork = numOfInts, amountOfWork;
int startRange, endRange = -1;
for (int i = 0; i < NUM_OF_THREADS; i++) {
amountOfWork = remainingWork / (NUM_OF_THREADS - i);
startRange = endRange + 1;
endRange = startRange + amountOfWork - 1;
join_thread_data[i].arr = join_ints;
join_thread_data[i].start = startRange;
join_thread_data[i].end = endRange;
mutex_thread_data[i].arr = mutex_ints;
mutex_thread_data[i].start = startRange;
mutex_thread_data[i].end = endRange;
pthread_create(&mutex_threads[i], NULL, processArrayMutex, (void *)&mutex_thread_data[i]);
remainingWork -= amountOfWork;
}
int numOfBenchmarkTests = 100;
int numberOfLoopsPerTest= 1000;
double join_sum = 0.0, mutex_sum = 0.0;
for (int i = 0; i < numOfBenchmarkTests; i++)
{
double joinTime = doItWithJoin(join_threads, join_thread_data, numberOfLoopsPerTest);
double mutexTime= doItWithMutex(mutex_threads, mutex_thread_data, numberOfLoopsPerTest);
join_sum += joinTime;
mutex_sum+= mutexTime;
}
double join_avg = join_sum / numOfBenchmarkTests;
double mutex_avg= mutex_sum / numOfBenchmarkTests;
printf("Join average : %f\n", join_avg);
printf("Mutex average: %f\n", mutex_avg);
double diff = join_avg - mutex_avg;
if (diff > 0.0)
printf("Mutex is %.0f%% faster.\n", 100 * diff / join_avg);
else if (diff < 0.0)
printf("Join is %.0f%% faster.\n", 100 * -diff / mutex_avg);
else
printf("Both have the same performance.");
free(join_ints);
free(mutex_ints);
return 0;
}
// From https://stackoverflow.com/a/2349941/408286
double get_time()
{
struct timeval t;
struct timezone tzp;
gettimeofday(&t, &tzp);
return t.tv_sec + t.tv_usec*1e-6;
}
double doItWithMutex(pthread_t *threads, struct thread_data *data, int num_loops)
{
double start = get_time();
int loops = num_loops;
while (loops-- != 0) {
// Make sure all of them are ready
pthread_mutex_lock(&currentlyIdleMutex);
while (currentlyIdle != NUM_OF_THREADS) {
pthread_cond_wait(&currentlyIdleCond, &currentlyIdleMutex);
}
pthread_mutex_unlock(&currentlyIdleMutex);
// All threads are now blocked; it's safe to not lock the mutex.
// Prevent them from finishing before authorized.
canFinish = 0;
// Reset the number of currentlyWorking threads
currentlyWorking = NUM_OF_THREADS;
// Signal to the threads to start
pthread_mutex_lock(&workReadyMutex);
workReady = 1;
pthread_cond_broadcast(&workReadyCond );
pthread_mutex_unlock(&workReadyMutex);
// Wait for them to finish
pthread_mutex_lock(&currentlyWorkingMutex);
while (currentlyWorking != 0) {
pthread_cond_wait(&currentlyWorkingCond, &currentlyWorkingMutex);
}
pthread_mutex_unlock(&currentlyWorkingMutex);
// The threads are now waiting for permission to finish
// Prevent them from starting again
workReady = 0;
currentlyIdle = 0;
// Allow them to finish
pthread_mutex_lock(&canFinishMutex);
canFinish = 1;
pthread_cond_broadcast(&canFinishCond);
pthread_mutex_unlock(&canFinishMutex);
}
return get_time() - start;
}
double doItWithJoin(pthread_t *threads, struct thread_data *data, int num_loops)
{
double start = get_time();
int loops = num_loops;
while (loops-- != 0) {
// create them
for (int i = 0; i < NUM_OF_THREADS; i++) {
pthread_create(&threads[i], NULL, processArrayJoin, (void *)&data[i]);
}
// wait
for (int i = 0; i < NUM_OF_THREADS; i++) {
pthread_join(threads[i], NULL);
}
}
return get_time() - start;
}
void *processArrayMutex(void *args)
{
struct thread_data *data = (struct thread_data *)args;
int *arr = data->arr;
int start = data->start;
int end = data->end;
while (1) {
// Set yourself as idle and signal to the main thread, when all threads are idle main will start
pthread_mutex_lock(&currentlyIdleMutex);
currentlyIdle++;
pthread_cond_signal(&currentlyIdleCond);
pthread_mutex_unlock(&currentlyIdleMutex);
// wait for work from main
pthread_mutex_lock(&workReadyMutex);
while (!workReady) {
pthread_cond_wait(&workReadyCond , &workReadyMutex);
}
pthread_mutex_unlock(&workReadyMutex);
// Do the work
for (int i = start; i <= end; i++) {
arr[i] = arr[i] + 1;
}
// mark yourself as finished and signal to main
pthread_mutex_lock(&currentlyWorkingMutex);
currentlyWorking--;
pthread_cond_signal(&currentlyWorkingCond);
pthread_mutex_unlock(&currentlyWorkingMutex);
// Wait for permission to finish
pthread_mutex_lock(&canFinishMutex);
while (!canFinish) {
pthread_cond_wait(&canFinishCond , &canFinishMutex);
}
pthread_mutex_unlock(&canFinishMutex);
}
pthread_exit(NULL);
}
void *processArrayJoin(void *args)
{
struct thread_data *data = (struct thread_data *)args;
int *arr = data->arr;
int start = data->start;
int end = data->end;
// Do the work
for (int i = start; i <= end; i++) {
arr[i] = arr[i] + 1;
}
pthread_exit(NULL);
}
And the output is:
Join average : 0.153074
Mutex average: 0.071588
Mutex is 53% faster.
Thank you again. I really appreciate your help!

There are several synchronization mechanisms you can use (condition variables, for example). I think the simplest would be to use a pthread_barrier to synchronize the start of the threads.
Assuming that you want all of the threads to 'sync up' on each loop iteration, you can just reuse the barrier. If you need something more flexible, a condition variable might be more appropriate.
When you decide it's time for the thread to wrap up (you haven't indicated how the threads will know to break out of the infinite loop - a simple shared variable might be used for that; the shared variable could be an atomic type or protected with a mutex), the main() thread should use pthread_join() to wait for all the threads to complete.
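A minimal sketch of the reusable-barrier idea, assuming a platform that provides pthread_barrier_t (Linux does). Every name in it (barrier_worker, run_barrier_rounds, the stop flag) is hypothetical, and the "work" is just a counter increment. The main thread takes part in the barrier, so the barrier count is the number of workers plus one, and the stop flag is only written between barrier waits, which is what lets it be a plain int here.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes pthread_barrier_* under strict -std= modes */
#include <pthread.h>
#include <stdlib.h>

/* All names here are hypothetical; none come from the question's code. */
struct barrier_worker {
    pthread_barrier_t *barrier; /* shared by main and every worker */
    const int *stop;            /* written by main; read only after a wait returns */
    int *slot;                  /* this worker's private counter */
};

static void *barrier_worker_main(void *arg)
{
    struct barrier_worker *w = arg;
    for (;;) {
        pthread_barrier_wait(w->barrier);   /* wait for main to start a round */
        if (*w->stop)
            break;
        ++*w->slot;                         /* stand-in for the real work */
        pthread_barrier_wait(w->barrier);   /* report that the round is done */
    }
    return NULL;
}

/* Runs `rounds` lock-step rounds on `num_threads` workers. Returns the total
 * number of work steps performed, or -1 if setup fails. */
int run_barrier_rounds(int num_threads, int rounds)
{
    if (num_threads < 1)
        return -1;

    pthread_barrier_t barrier;
    int stop = 0, total = 0;
    pthread_t *threads = malloc(num_threads * sizeof *threads);
    struct barrier_worker *workers = malloc(num_threads * sizeof *workers);
    int *slots = calloc(num_threads, sizeof *slots);

    /* num_threads workers plus the main thread take part in every wait. */
    if (!threads || !workers || !slots ||
        pthread_barrier_init(&barrier, NULL, (unsigned)num_threads + 1) != 0) {
        free(threads); free(workers); free(slots);
        return -1;
    }
    for (int i = 0; i < num_threads; i++) {
        workers[i] = (struct barrier_worker){ &barrier, &stop, &slots[i] };
        if (pthread_create(&threads[i], NULL, barrier_worker_main, &workers[i]) != 0)
            abort();   /* simplification: a partial team would deadlock the barrier */
    }
    for (int r = 0; r < rounds; r++) {
        pthread_barrier_wait(&barrier);     /* release the workers */
        pthread_barrier_wait(&barrier);     /* block until all have finished */
    }
    stop = 1;                               /* no worker reads this before its next wait returns */
    pthread_barrier_wait(&barrier);         /* wake the workers so they see the flag */
    for (int i = 0; i < num_threads; i++) {
        pthread_join(threads[i], NULL);
        total += slots[i];
    }
    pthread_barrier_destroy(&barrier);
    free(threads); free(workers); free(slots);
    return total;
}
```

Calling run_barrier_rounds(4, 3) runs three rounds on four workers, so each worker increments its counter three times before the final join.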

You need to use a different synchronization technique than join, that's clear.
Unfortunately you have a lot of options. One is a "synchronization barrier", which basically is a thing where each thread that reaches it blocks until they've all reached it (you specify the number of threads in advance). Look at pthread_barrier.
Another is to use a condition-variable/mutex pair (pthread_cond_*). When each thread finishes it takes the mutex, increments a count, signals the condvar. The main thread waits on the condvar until the count reaches the value it expects. The code looks like this:
// thread has finished
mutex_lock
++global_count
// optional optimization: only execute the next line when global_count >= N
cond_signal
mutex_unlock
// main is waiting for N threads to finish
mutex_lock
while (global_count < N) {
cond_wait
}
mutex_unlock
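The pseudocode above could be turned into real pthread calls roughly as follows. This is a sketch under assumptions: the struct completion type and the function names are hypothetical, the work itself is left as a comment, and the workers exit after reporting in, whereas in a pool they would loop back for the next job.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names; `finished` plays the role of global_count above. */
struct completion {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int finished;
};

static void *counting_worker(void *arg)
{
    struct completion *c = arg;
    /* ... the real work would happen here ... */
    pthread_mutex_lock(&c->mutex);
    ++c->finished;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->mutex);
    return NULL;
}

/* Starts up to n workers, blocks until every started worker has reported in,
 * and returns how many reported (-1 if n < 1 or allocation fails). */
int run_counted_workers(int n)
{
    struct completion c;
    pthread_t *threads;
    int started = 0, result;

    if (n < 1 || (threads = malloc(n * sizeof *threads)) == NULL)
        return -1;
    pthread_mutex_init(&c.mutex, NULL);
    pthread_cond_init(&c.cond, NULL);
    c.finished = 0;

    for (int i = 0; i < n; i++)
        if (pthread_create(&threads[started], NULL, counting_worker, &c) == 0)
            started++;

    /* Main side: sleep on the condvar until the count reaches the target.
     * The while loop guards against spurious wakeups. */
    pthread_mutex_lock(&c.mutex);
    while (c.finished < started)
        pthread_cond_wait(&c.cond, &c.mutex);
    result = c.finished;
    pthread_mutex_unlock(&c.mutex);

    /* Joining here only releases thread resources. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.mutex);
    free(threads);
    return result;
}
```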
Another is to use a semaphore per thread -- when the thread finishes it posts its own semaphore, and the main thread waits on each semaphore in turn instead of joining each thread in turn.
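The semaphore-per-thread variant might look like the sketch below. It assumes unnamed POSIX semaphores (sem_init) are available, which holds on Linux but not on OS X, where sem_init is not supported. SEM_WORKERS and the function names are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define SEM_WORKERS 4   /* hypothetical fixed team size for this sketch */

static void *sem_worker(void *arg)
{
    sem_t *done = arg;
    /* ... the real work would happen here ... */
    sem_post(done);     /* announce completion on this worker's own semaphore */
    return NULL;
}

/* Returns the number of workers whose completion was observed,
 * or -1 if a semaphore could not be initialized. */
int run_semaphore_workers(void)
{
    sem_t done[SEM_WORKERS];
    pthread_t threads[SEM_WORKERS];
    int created[SEM_WORKERS];
    int observed = 0;

    for (int i = 0; i < SEM_WORKERS; i++) {
        /* pshared = 0: used by threads of this process; initial value 0 */
        if (sem_init(&done[i], 0, 0) != 0) {
            while (i-- > 0)
                sem_destroy(&done[i]);
            return -1;
        }
    }
    for (int i = 0; i < SEM_WORKERS; i++)
        created[i] = pthread_create(&threads[i], NULL, sem_worker, &done[i]) == 0;

    /* Wait on each worker's semaphore in turn, instead of joining in turn. */
    for (int i = 0; i < SEM_WORKERS; i++) {
        int rc;
        if (!created[i])
            continue;
        do {
            rc = sem_wait(&done[i]);
        } while (rc != 0 && errno == EINTR);   /* retry if a signal interrupted the wait */
        if (rc == 0)
            observed++;
        pthread_join(threads[i], NULL);        /* only to release thread resources */
    }
    for (int i = 0; i < SEM_WORKERS; i++)
        sem_destroy(&done[i]);
    return observed;
}
```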
You also need synchronization to re-start the threads for the next job -- this could be a second synchronization object of the same type as the first, with details changed for the fact that you have 1 poster and N waiters rather than the other way around. Or you could (with care) re-use the same object for both purposes.
If you've tried these things and your code didn't work, maybe ask a new specific question about the code you tried. All of them are adequate to the task.

You are working at the wrong level of abstraction. This problem has been solved already. You are reimplementing a work queue + thread pool.
OpenMP seems like a good fit for your problem. It converts #pragma annotations into threaded code. I believe it would let you express what you're trying to do pretty directly.
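As a rough illustration of the OpenMP route, assuming a compiler with OpenMP support enabled (for example GCC or Clang with -fopenmp): the function name increment_all is hypothetical, and the loop body mirrors the arr[i] = arr[i] + 1 work from the question. The parallel for construct ends with an implicit barrier, so the function returns only after every iteration has run.

```c
/* Adds 1 to each of the n elements of arr. With OpenMP enabled, the
 * iterations are divided among a team of threads; without it the pragma
 * is ignored and the loop simply runs serially. */
void increment_all(int *arr, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        arr[i] = arr[i] + 1;
}
```

The runtime creates and reuses the worker threads, so none of the start, finish, and restart signalling has to be written by hand.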
Using libdispatch, what you're trying to do would be expressed as a dispatch_apply targeting a concurrent queue. This implicitly waits for all child tasks to complete. Under OS X, it's implemented using a non-portable pthread workqueue interface; under FreeBSD, I believe it manages a group of pthreads directly.
If portability concerns are driving you to use raw pthreads, don't use pthread barriers. Barriers are an extension over and above basic POSIX threads; OS X, for example, does not support them. For more, see the POSIX specification.
Blocking the main thread till all child threads have completed can be done using a count protected by a condition variable or, even more simply, using a pipe and a blocking read where the number of bytes to read matches the number of threads. Each thread writes one byte on work completion, then sleeps till it gets new work from the main thread. The main thread unblocks once each thread has written its "I'm done!" byte.
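The pipe idea could be sketched like this. The function names are hypothetical and the workers exit after writing their byte rather than sleeping until new work arrives. One detail worth handling: read() may return fewer bytes than requested, so the sketch reads in a loop until one byte per started worker has arrived.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

static void *pipe_worker(void *arg)
{
    int write_fd = *(const int *)arg;
    const char done = 1;
    /* ... the real work would happen here ... */
    while (write(write_fd, &done, 1) < 0 && errno == EINTR)
        ;   /* retry if a signal interrupted the write */
    return NULL;
}

/* Starts up to n workers and blocks until each started worker has written
 * its "I'm done!" byte. Returns the number of bytes received, or -1 if
 * setup fails. */
int run_pipe_workers(int n)
{
    int fds[2];            /* fds[0] = read end, fds[1] = write end */
    pthread_t *threads;
    int started = 0, received = 0;
    char buf[64];

    if (n < 1 || (threads = malloc(n * sizeof *threads)) == NULL)
        return -1;
    if (pipe(fds) != 0) {
        free(threads);
        return -1;
    }
    for (int i = 0; i < n; i++)
        if (pthread_create(&threads[started], NULL, pipe_worker, &fds[1]) == 0)
            started++;

    /* Keep reading until one byte per started worker has arrived. */
    while (received < started) {
        size_t want = (size_t)(started - received);
        if (want > sizeof buf)
            want = sizeof buf;
        ssize_t got = read(fds[0], buf, want);
        if (got > 0)
            received += (int)got;
        else if (got == 0 || errno != EINTR)
            break;         /* EOF or a real error: stop waiting */
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);   /* only to release thread resources */
    close(fds[0]);
    close(fds[1]);
    free(threads);
    return received;
}
```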
Passing work to the child threads can be done using a mutex protecting the work-descriptor and a condition to signal new work. You could use a single array of work descriptors that all threads draw from. On signal, each one tries to grab the mutex. On grabbing the mutex, it would dequeue some work, signal anew if the queue is nonempty, and then process its work, after which it would signal completion to the master thread.
You could reuse this "work queue" to unblock the main thread by enqueueing the results, with the main thread waiting till the result queue length matches the number of threads; the pipe approach is just using a blocking read to do this count for you.

To tell all the threads to start working, it can be as simple as a global integer variable which is initialized to zero, and the threads simply wait until it's non-zero. This way you don't need the while (1) loop in the thread function.
For waiting until they are all done, pthread_join is simplest, as it blocks until the thread it's joining is done. It's also needed to release the resources the system keeps for a finished thread (otherwise, for example, the thread's return value stays stored for the remainder of the program). As you have an array of all the pthread_t values for the threads, just loop over them one by one. As that part of your program doesn't do anything else and has to wait until all threads are done, waiting for them in order is okay.
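A sketch of the start flag plus join loop described in this answer. One substitution is made and should be read as an assumption of the sketch: reading a plain int in a wait loop while another thread writes it is a data race in C11, so the flag here is an atomic_int from <stdatomic.h>. The names start_flag, units_done, and run_flag_workers are hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

static atomic_int start_flag;   /* 0 = hold, non-zero = go */
static atomic_int units_done;   /* counts completed work steps */

static void *flag_worker(void *arg)
{
    (void)arg;
    while (atomic_load(&start_flag) == 0)
        sched_yield();                  /* give up the CPU while waiting */
    atomic_fetch_add(&units_done, 1);   /* stand-in for the real work */
    return NULL;
}

/* Starts up to n workers, releases them all at once, joins them in order,
 * and returns the number of work steps done (-1 if n < 1 or allocation fails). */
int run_flag_workers(int n)
{
    pthread_t *threads;
    int started = 0;

    if (n < 1 || (threads = malloc(n * sizeof *threads)) == NULL)
        return -1;
    atomic_store(&start_flag, 0);
    atomic_store(&units_done, 0);

    for (int i = 0; i < n; i++)
        if (pthread_create(&threads[started], NULL, flag_worker, NULL) == 0)
            started++;

    atomic_store(&start_flag, 1);       /* tell every worker to start */

    /* Joining in array order is fine: main has nothing else to do. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    free(threads);
    return atomic_load(&units_done);
}
```

Note that this creates and joins the threads for each batch of work, which is the pattern the benchmark above measures as doItWithJoin.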