I am trying to write a program in C that calculates the series:
for(i=0; i <= n; i++){
(2*i+1)/factorial(2*i);
}
n is the number of elements, determined by the user as an argument.
The user also determines the number of threads that are going to calculate the series.
I divide the series into subseries, each of which should be calculated by a single thread. The problem is that my threads seem to share memory: some series members are calculated several times and others are not calculated at all. Do you know why? Please help!
Here is the problematic part of the code:
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <gmp.h>
#include <math.h>
#include <pthread.h>
/* a struct to pass function arguments to the thread */
struct intervals_struct {
int **intervals;
mpf_t *result;
int thread_index;
};
/* calculate the sum of the elements of the subseries;
doesn't work properly for more than one thread */
void* sum_subinterval(void *args) {
/* Initialize the local variables here */
struct intervals_struct *p = (struct intervals_struct*)args;
for(i=(*p).intervals[(*p).thread_index][0]; i<=(*p).intervals[(*p).thread_index][1]; i++){
/* Do something with the local variables here */
}
mpf_set((*p).result[(*p).thread_index],sum);
/* Free resources used by the local variables here */
}
/* calculate the sum of all subseries */
int main(int argc, char * argv[]){
int p, t, i;
p = atoi(argv[1]);
assert( p >= 0);
t = atoi(argv[2]);
assert( t >= 0);
int **intervals_arr;
intervals_arr = (int**)malloc(t * sizeof(int *));
for(i = 0; i < t; i++) {
intervals_arr[i] = (int *)malloc(2 * sizeof(int));
}
/* Calculate intervals and store them in intervals_arr here */
mpf_t *subinterval_sum;
subinterval_sum = (mpf_t*)malloc(t * sizeof(mpf_t));
for(i=0; i < t; i++) {
mpf_init(subinterval_sum[i]);
}
pthread_t *tid;
tid = (pthread_t *)malloc(t * sizeof(pthread_t));
for(i = 0; i < t; i++) {
struct intervals_struct args = {intervals_arr, subinterval_sum, i};
pthread_create(&(tid[i]), NULL, sum_subinterval, (void*)&args);
}
for(i = 0; i < t; i++) {
pthread_join(tid[i], NULL);
}
/* Sum the elements of the result array and free resources used in main here */
}
The problem is probably here:
for(i = 0; i < t; i++) {
struct intervals_struct args = {intervals_arr, subinterval_sum, i};
pthread_create(&(tid[i]), NULL, sum_subinterval, (void*)&args);
}
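You are passing the address of args to your new thread, but args is declared inside the loop body, so its lifetime ends at the end of that iteration, right after the pthread_create call and possibly before the new thread has read it. The compiler can and will reuse the stack space occupied by args between different loop iterations, so several threads may end up seeing the same thread_index.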
Try allocating an array on the heap with malloc instead.
Edit: What I meant by that last sentence is something like this:
struct intervals_struct * args = (struct intervals_struct *) calloc(t, sizeof(struct intervals_struct));
for(i = 0; i < t; i++) {
args[i].intervals = intervals_arr;
args[i].result = subinterval_sum;
args[i].thread_index = i;
pthread_create(&(tid[i]), NULL, sum_subinterval, (void*)&args[i]);
}
// at the end of main(), or at least after every thread has been joined
free(args);
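For reference, here is a minimal self-contained sketch of the same idea. The names slot_args, record_index and run_workers are made up for illustration and are not from the question; the worker only records its own index so the effect is easy to check. Each thread gets a pointer to its own element of a heap array that stays alive until all threads are joined:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread argument: each thread owns exactly one element. */
struct slot_args {
    int thread_index;   /* which slot this thread writes to */
    int *results;       /* shared output array, one entry per thread */
};

/* Worker: stores (index + 1) in its own slot so the caller can verify it. */
static void *record_index(void *arg)
{
    struct slot_args *a = arg;
    a->results[a->thread_index] = a->thread_index + 1;
    return NULL;
}

/* Starts t threads. Returns 0 if every slot was written by "its" thread,
 * -1 on allocation or thread-creation failure, 1 if a slot is wrong. */
int run_workers(int t)
{
    pthread_t *tid = malloc(t * sizeof *tid);
    struct slot_args *args = calloc(t, sizeof *args);
    int *results = calloc(t, sizeof *results);
    int status = 0, started = 0;

    if (!tid || !args || !results) {
        status = -1;
    } else {
        for (int i = 0; i < t; i++) {
            args[i].thread_index = i;      /* private copy of i for thread i */
            args[i].results = results;
            if (pthread_create(&tid[i], NULL, record_index, &args[i]) != 0) {
                status = -1;
                break;
            }
            started++;
        }
        for (int i = 0; i < started; i++)
            pthread_join(tid[i], NULL);
        for (int i = 0; status == 0 && i < t; i++)
            if (results[i] != i + 1)
                status = 1;
    }
    /* Only now, after every join, is it safe to release the arguments. */
    free(tid);
    free(args);
    free(results);
    return status;
}
```

Calling run_workers(4) from main should return 0, because no two threads ever look at the same args element.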
I created this program to understand multithreading. I have tested it with a single thread and it works. Basically you enter 3 numbers: the first is the initial number, the second is how many iterations to run, and the last is the number of threads. The program stores the first two numbers in a struct that has start, iteration and result. The algorithm then multiplies the first number by 2 as many times as given by the second number.
Example: 1 3 2.
The plain (non-pthread) version works, but once I introduce pthread I get a segmentation fault (core dumped). I've spent hours trying to identify what is causing it, but no luck.
//The program will do: 1 * 2 = 2, 2 * 2 = 4, 4 * 2 = 8
//The results will be stored in the struct's result member, which is a pointer.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
struct Params
{
int start;
int iteration;
int *result;
};
void *double_number(void *vFirststruct)
{
struct Params *Firststruct = (struct Params *)vFirststruct;
int iter = 0;
Firststruct->result = (int *)malloc(sizeof(int) * Firststruct->iteration);
for (iter = 0; iter < Firststruct->iteration; iter++)
{
// printf("%d\n", Firststruct->start);
Firststruct->start = Firststruct->start * 2;
Firststruct->result[iter] = Firststruct->start;
}
return NULL;
}
void double_number_Single_Thread(struct Params *Firststruct)
{
int iter = 0;
Firststruct->result = (int *)malloc(sizeof(int) * Firststruct->iteration);
for (iter = 0; iter < Firststruct->iteration; iter++)
{
printf("%d\n", Firststruct->start);
Firststruct->start = Firststruct->start * 2;
Firststruct->result[iter] = Firststruct->start;
}
}
int main(int argc, char *argv[])
{
struct Params *Firststruct = (struct Params *)malloc(sizeof(struct Params));
Firststruct->start = atoi(argv[1]);
Firststruct->iteration = atoi(argv[2]);
int threads = atoi(argv[3]);
//For Single Thread
// double_number_Single_Thread(Firststruct); // <-- testing on single thread
// for (int i = 0; i < Firststruct->iteration; i++)
// {
// printf("%d %d\n", i, Firststruct->result[i]);
// }
//End for Single Thread
//Start of Single thread using pthread-Thread
pthread_t *t = (pthread_t *)malloc(threads * sizeof(pthread_t));
pthread_create(&t[0], NULL, &double_number, (void *)&Firststruct);
pthread_join(t[0], NULL);
//End for Single Thread
//Start of Multi thread
// for (int i = 0; i < threads; i++)
// {
// pthread_create(&t[i], NULL, &double_number, (void *)&Firststruct);
// }
// for (int i = 0; i < threads; i++)
// {
// pthread_join(t[i], NULL);
// }
free(Firststruct);
return 0;
}
The main problem you have (ignoring the fact that different threads will modify the same data) is your pthread_create call.
pthread_create(&t[0], NULL, &double_number, (void *) & Firststruct);
Should be
pthread_create(&t[0], NULL, &double_number, (void *) Firststruct);
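Firststruct is already a pointer to struct Params. With the extra &, you pass the address of the pointer variable itself (a struct Params **), and the thread then treats that memory as a struct Params, which causes the mess.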
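A small self-contained sketch of the corrected call, in case it helps. The struct is renamed params and the helper double_in_thread is hypothetical; it is only there so the result can be checked. Unlike the original, the worker keeps the running value in a local variable instead of overwriting start:

```c
#include <pthread.h>
#include <stdlib.h>

struct params {
    int start;
    int iteration;
    int *result;   /* allocated by the worker, freed by the caller */
};

static void *double_number(void *arg)
{
    struct params *p = arg;   /* arg already is a struct params * */
    int value = p->start;

    p->result = malloc(sizeof(int) * p->iteration);
    if (p->result == NULL)
        return NULL;
    for (int i = 0; i < p->iteration; i++) {
        value *= 2;
        p->result[i] = value;
    }
    return NULL;
}

/* Runs the worker in one thread. Returns the result array (caller frees)
 * or NULL on failure. */
int *double_in_thread(int start, int iteration)
{
    struct params *p = malloc(sizeof *p);
    pthread_t t;
    int *out = NULL;

    if (p == NULL)
        return NULL;
    p->start = start;
    p->iteration = iteration;
    p->result = NULL;
    /* Pass p itself, not &p: &p would be a struct params **. */
    if (pthread_create(&t, NULL, double_number, p) == 0) {
        pthread_join(t, NULL);
        out = p->result;
    }
    free(p);
    return out;
}
```

With the example input from the question (start 1, 3 iterations), the returned array holds 2, 4, 8.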
For a prime factorization project, I need to pass a struct and a number (from the command line) to a thread. The below code is what I have so far. The factorization works fine, the problem is that the index passed to the thread isn't being passed in order, so the results vary, often storing data in the same index in a subsequent thread. Anyone know how to guarantee which index the thread will access, or a better way of implementing this? Each thread has to store their data in a struct so that the main thread can print all the data at the end, once all threads have closed.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <pthread.h>
// Initialize Constants
#define MAX_ARGS 25
#define MAX_PRIMES 10
#define SMALLEST_ARG 2
// Define Struct
struct PrimeData {
int index;
int num_to_fact[MAX_ARGS];
int primes[MAX_ARGS][MAX_PRIMES];
};
// Declare Functions
void* factor (void*);
// Main
int main(int argc, char* argv[])
{
// Initialize Struct Variables
struct PrimeData data;
struct PrimeData* data_addr = &data;
data.index = 0;
for (int i = 0; i < MAX_ARGS; i++)
data.num_to_fact[i] = -1;
for (int i = 0; i < MAX_ARGS; i++) {
for (int j = 0; j < MAX_PRIMES; j++)
data.primes[i][j] = -1;
}
// Check for arguments
if (argc <= 1)
printf("Usage: ./p3 <number to factor>...\n");
else {
// Initialize Thread Handler list
pthread_t threads[argc - 1];
// Create a Thread per Argument
for (int i = 1; i < argc; i++) {
// Update shared structure
data.index = i - 1;
data.num_to_fact[i - 1] = atoi(argv[i]);
// Create thread
pthread_create(&threads[i - 1], NULL, factor, (void*)data_addr);
}
// Tell main to wait for threads to terminate
for (int i = 1; i < argc; i++)
pthread_join(threads[i - 1], NULL);
}
// Iterate through struct
for (int i = 0; i < MAX_ARGS; i++) {
if (data.num_to_fact[i] == -1)
break;
printf("%d: ", data.num_to_fact[i]);
for (int j = 0; j < MAX_PRIMES; j++) {
if (data.primes[i][j] == -1)
break;
printf("%d ", data.primes[i][j]);
}
printf("\n");
}
// Terminate
return 0;
}
// The factor() function
void* factor(void* data)
{
struct PrimeData* d = (struct PrimeData*)data;
int index = d->index;
int n = d->num_to_fact[index];
int counter = 0;
int i = 2;
while (n != 1) {
if (n % i == 0) {
while (n % i == 0) {
d->primes[index][counter] = i;
n = n / i;
counter++;
}
}
i++;
}
return NULL;
}
You have only one 'struct PrimeData data;', and every thread receives the address of that same struct, so passing it in the pthread_create call tells a thread nothing about which index is its own: main may already have overwritten data.index before the thread reads it. The messy way would be to make the PrimeData variable global so the threads have access to it, turn the index into an array: 'int index[MAX_ARGS];', load it with 0,1,2,3.. etc and then pass the address of the required element to each thread, eg '&data.index[i-1]'.
It might be clearer if you accepted that C arrays are indexed from zero and so get rid of a lot of those [i-1] things.
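A different way to get the same effect, sketched here under my own assumptions rather than as the approach described above: give every thread its own small record holding both its input and its output, indexed from zero. The names factor_task and factor_all are hypothetical. Because thread i only ever receives &tasks[i], there is no shared index to race on:

```c
#include <pthread.h>

#define MAX_ARGS 25
#define MAX_PRIMES 10

/* Hypothetical per-thread record: the input and the output of one thread. */
struct factor_task {
    int number;               /* number to factor */
    int primes[MAX_PRIMES];   /* factors found; -1 marks the end if fewer than MAX_PRIMES */
};

static void *factor(void *arg)
{
    struct factor_task *task = arg;
    int n = task->number, counter = 0;

    for (int i = 2; n > 1 && counter < MAX_PRIMES; i++) {
        while (n % i == 0 && counter < MAX_PRIMES) {
            task->primes[counter++] = i;
            n /= i;
        }
    }
    if (counter < MAX_PRIMES)
        task->primes[counter] = -1;
    return NULL;
}

/* Factors count numbers (count <= MAX_ARGS), one thread per number.
 * tasks must have room for count entries. Returns 0 on success, -1 otherwise. */
int factor_all(const int *numbers, int count, struct factor_task *tasks)
{
    pthread_t threads[MAX_ARGS];
    int started = 0, status = 0;

    if (count < 0 || count > MAX_ARGS)
        return -1;
    for (int i = 0; i < count; i++) {
        tasks[i].number = numbers[i];
        /* Thread i gets the address of its own record and nothing else. */
        if (pthread_create(&threads[i], NULL, factor, &tasks[i]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return status;
}
```

After factor_all returns, the main thread can walk tasks[0..count-1] in order and print each number with its factors, which is what the assignment asks for.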
I got it working using a mutex. Kind of odd, since we do not cover mutexes until the next chapter. I eventually stumbled upon an article explaining that when you use multithreading where each thread accesses a shared memory location (my struct in this case), you have to use a mutex to control the index:
// At the start of the program, before main
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int _index = -1;
// first three lines of the factor function
pthread_mutex_lock(&mutex1);
_index++;
pthread_mutex_unlock(&mutex1);
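One detail worth spelling out with the counter approach: the value should be copied into a local variable while the lock is still held, otherwise another thread may increment the shared counter before you use it. Here is a rough sketch of that; claim_index, worker and count_unique_claims are hypothetical names used only for this demonstration. Note that a counter gives every thread a unique index, but which thread gets which index still depends on scheduling:

```c
#include <pthread.h>

#define DEMO_THREADS 8

static pthread_mutex_t index_mutex = PTHREAD_MUTEX_INITIALIZER;
static int next_index = 0;   /* shared counter, only touched under index_mutex */

/* Returns a unique index; the value is copied while the lock is held. */
static int claim_index(void)
{
    pthread_mutex_lock(&index_mutex);
    int mine = next_index++;
    pthread_mutex_unlock(&index_mutex);
    return mine;
}

static void *worker(void *arg)
{
    int *claimed = arg;        /* tally with one counter per index */
    int mine = claim_index();

    claimed[mine]++;           /* indices are distinct, so elements are not shared */
    return NULL;
}

/* Starts DEMO_THREADS threads and returns how many indices were claimed
 * exactly once. */
int count_unique_claims(void)
{
    pthread_t tid[DEMO_THREADS];
    int claimed[DEMO_THREADS] = {0};
    int started = 0, unique = 0;

    next_index = 0;            /* no threads are running yet, so this is safe */
    for (int i = 0; i < DEMO_THREADS; i++) {
        if (pthread_create(&tid[i], NULL, worker, claimed) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < DEMO_THREADS; i++)
        if (claimed[i] == 1)
            unique++;
    return unique;
}
```

If all threads start, count_unique_claims() returns DEMO_THREADS.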
// Define Struct
struct PrimeData {
int num_to_fact[MAX_ARGS];
int primes[MAX_ARGS][MAX_PRIMES];
};
typedef struct Wrapper {
int index;
struct PrimeData *data;
} Wrapper;
...
int main(int argc, char *argv[])
{
// ...
// Define wrappers
Wrapper wrappers[argc-1];
for (int i = 1; i < argc; i++)
{
wrappers[i-1].index = i - 1; // zero-based slot in num_to_fact / primes
wrappers[i-1].data = &data;
//...
pthread_create(&threads[i - 1], NULL, factor, wrappers + i - 1);
}
// ...
}
void *factor(void *wrapper)
{
Wrapper *w = (Wrapper *) wrapper;
struct PrimeData* d = w->data;
int index = w->index;
// ...
}
I want to read two tables A and B as input from the user, compute their inner product (a1b1+a2b2+……+anbn), save it in local_sum and then add it to a total_sum variable. I am using the code below, but I get a segmentation fault. For some reason tables A and B can't be passed to the function mul. Any help would be great, thank you!
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#define N 2
int p;
int A[N],B[N];
int local_sum;
void *mul(void *arg)
{
int lines, start, end, i, j;
int id = *(int*)arg;
lines = N / p;
start = id * lines;
end = start + lines;
for (i = start; i < end; i++)
local_sum = A[i] * B[i] + local_sum;
return NULL;
}
int main (int argc, char *argv[])
{
int i;
pthread_t *tid;
if (argc != 2)
{
printf("Provide number of threads.\n");
exit(1);
}
p = atoi(argv[1]);
tid = (pthread_t *)malloc(p * sizeof(pthread_t));
if (tid == NULL)
{
printf("Could not allocate memory.\n");
exit(1);
}
printf("Give Table A\n");
for (int i = 0; i < N; i++)
{
scanf("%d", &A[i]);
}
printf("Give Table B\n");
for (int i = 0; i < N; i++)
{
scanf("%d", &B[i]);
}
for (i = 0; i < p; i++)
{
int *a;
a = malloc(sizeof(int));
*a = 0;
pthread_create(&tid[i], NULL, mul, a);
}
for (i = 0; i < p; i++)
pthread_join(tid[i], NULL);
printf("%d", local_sum);
return 0;
}
Let's see:
You want to have p threads, working on the vectors A and B.
You must be aware that threads share the same memory and may be interrupted at any time.
You've got p threads, all trying to write to one shared variable local_sum. This leads to unpredictable results since one thread overwrites the value another thread has written there before.
You can bypass this problem by ensuring exclusive access of one single thread to this variable by using a mutex or the like, or you could have one variable per thread, have each thread produce an intermediate result and after joining all threads, collapse all your intermediate results into the final one.
To do this, your main should look something like (assuming your compiler supports a recent C standard):
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#define N 2
/* these are variables shared amongst all threads */
int p;
int A[N], B[N];
/* array with one slot per thread to receive the partial result of each thread */
int* partial_sum;
/* prototype of thread function, just to be independent of the place mul will be placed in the source file... */
void *mul(void *arg);
int main (int argc, char** argv)
{
pthread_t* tid;
p = atoi(argv[1]);
const size_t n_by_p = N/p;
if(n_by_p * p != N)
{
fprintf(stderr, "Number of threads must be an integral factor of N\n");
exit(EXIT_FAILURE) ;
}
tid = calloc(p, sizeof(pthread_t));
partial_sum = calloc(p, sizeof(int)) ;
printf("Give Table A\n");
for(size_t i = 0; i < N; ++i)
{
scanf("%d",&A[i]);
}
printf("Give Table B\n");
for(size_t i = 0; i < N; ++i)
{
scanf("%d",&B[i]);
}
for (size_t i =0; i < p; ++i)
{
/* clumsy way to pass a thread its slot number, but works as a starter... */
int *a;
a = malloc(sizeof(int));
*a = i;
pthread_create(&tid[i], 0, mul, a);
}
for (size_t i = 0; i < p; ++i)
{
pthread_join(tid[i], 0);
}
free(tid);
tid = 0;
int total_sum = 0;
for (size_t i = 0; i < p; ++i)
{
total_sum += partial_sum[i] ;
}
free(partial_sum);
partial_sum = 0;
printf("%d",total_sum);
return EXIT_SUCCESS;
}
Your thread function mul should now write only to its own partial_sum slot:
void *mul(void *arg)
{
int slot_num = *(int*)arg;
free(arg);
arg = 0;
const size_t lines = N/p;
const size_t start = slot_num * lines;
const size_t end = start + lines;
partial_sum[slot_num] = 0;
for(size_t i = start; i < end; ++i)
{
partial_sum[slot_num] += A[i]*B[i];
}
return 0;
}
Beware: this code works correctly only if N is an integral multiple of p.
If this condition is not met, N/p is truncated and not all elements of the vectors are processed.
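For completeness, one common way to handle the remainder is to hand the first N % p threads one extra element each. This is only a sketch; chunk_bounds is a hypothetical helper that is not part of the code above:

```c
#include <stddef.h>

/* Computes the half-open range [*start, *end) for thread id (0-based) when
 * n elements are split among p threads. The first n % p threads get one
 * extra element, so every element is covered exactly once. Requires p > 0. */
void chunk_bounds(size_t n, size_t p, size_t id, size_t *start, size_t *end)
{
    size_t base = n / p;    /* minimum number of elements per thread */
    size_t extra = n % p;   /* leftover elements to spread over the first threads */

    *start = id * base + (id < extra ? id : extra);
    *end = *start + base + (id < extra ? 1 : 0);
}
```

For n = 10 and p = 3 this yields the ranges [0, 4), [4, 7) and [7, 10). In mul you would call it with the slot number instead of computing start and end from N/p.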
However, fixing these cases is not the core of this question IMHO.
I spared all kinds of error-checking, which you should add, should this code become part of some operational setup...
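The other option mentioned earlier, a mutex around one shared sum, could look roughly like the sketch below. The names dot_args, dot_worker and dot_product are my own, and the vectors are passed as parameters instead of globals so the example stands alone. Each thread accumulates into a local variable and takes the lock only once, to add its partial result:

```c
#include <pthread.h>
#include <stdlib.h>

struct dot_args {
    const int *a, *b;        /* input vectors */
    int start, end;          /* this thread handles indices [start, end) */
    long *total;             /* shared accumulator */
    pthread_mutex_t *lock;   /* protects *total */
};

static void *dot_worker(void *arg)
{
    struct dot_args *d = arg;
    long local = 0;

    for (int i = d->start; i < d->end; i++)
        local += (long)d->a[i] * d->b[i];
    pthread_mutex_lock(d->lock);     /* one short critical section per thread */
    *d->total += local;
    pthread_mutex_unlock(d->lock);
    return NULL;
}

/* Stores the inner product of a and b (length n) in *out, using p threads.
 * Returns 0 on success, -1 if p < 1 or a thread could not be started. */
int dot_product(const int *a, const int *b, int n, int p, long *out)
{
    pthread_mutex_t lock;
    long total = 0;
    int started = 0;

    if (p < 1)
        return -1;
    pthread_t *tid = malloc(p * sizeof *tid);
    struct dot_args *args = malloc(p * sizeof *args);
    if (tid == NULL || args == NULL) {
        free(tid);
        free(args);
        return -1;
    }
    pthread_mutex_init(&lock, NULL);
    for (int i = 0; i < p; i++) {
        /* n * i / p splits the range evenly even when p does not divide n. */
        args[i] = (struct dot_args){ a, b, n * i / p, n * (i + 1) / p, &total, &lock };
        if (pthread_create(&tid[i], NULL, dot_worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&lock);
    free(tid);
    free(args);
    if (started != p)
        return -1;
    *out = total;
    return 0;
}
```

For A = {1, 2, 3, 4} and B = {5, 6, 7, 8} the result is 70 regardless of the number of threads.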
if (tid=NULL)
-->
if (tid==NULL)
and
for (i=start;i<end;i++)
I suppose we need
for (i=0;i<end-start;i++)
I'm pretty new to threads and would like some insight. I'm trying to get the percentage each thread has completed for its calculation. Each thread will report its percentage to a different element of the same array. I have this working with pthread_join immediately after pthread_create and a separate thread for reading all the values of the array and printing the percentage but when I have all threads running after each other without waiting for the previous one to finish I get some weird behavior. This is how I'm accessing the shared (global) array.
//global
int *currentProgress;
//main
currentProgress = malloc(sizeof(int)*threads);
for(i=0; i<threads; i++)
currentProgress[i] = 0;
//child threads
currentProgress[myId] = (int)percent; //myId is unique
//progress thread
for(i=0; i<threads; i++)
progressTotal += currentProgress[i];
progressTotal /= threads;
printf("Percent: %d", progressTotal);
This is essentially the code I think is not being used correctly for multi-threads. When I print out the state of the shared array, I notice that as soon as another thread starts accessing the array (different element though), the previous element immediately goes to some random number... -2147483648 and when the latter element finishes the prior element continues like normal. Should I be using semaphores for this? I thought I could access different elements of an array at the same time and I thought reading them wasn't an issue.
This is the entire code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <stdint.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
#define STDIN 0
int counter = 0;
uint64_t *factors;
void *getFactors(void *arg);
void *deleteThreads(void *arg);
void *displayProgressThread(void *arg);
int *currentProgress;
struct data
{
uint64_t num;
uint64_t incrS;
uint64_t incrF;
int threads;
int member;
} *args;
int main(int argc, char *argv[])
{
if(argc < 3) {printf("not enough arguments"); exit(1);}
int i;
int threads = atoi(argv[2]);
pthread_t thread_id[threads];
pthread_t dThread;
currentProgress = malloc(sizeof(int)*threads);
for(i=0; i<threads; i++)
currentProgress[i] = 0;
args = (struct data*)malloc(sizeof(struct data));
args->num = atoll(argv[1]);
args->threads = threads;
uint64_t increment = (uint64_t)sqrt((uint64_t)args->num)/threads;
factors = (uint64_t*)malloc(sizeof(uint64_t)*increment*threads);
pthread_create(&dThread, NULL, displayProgressThread, (void*)args);
//for the id of each thread
args->member = 0;
for(i=0; i<threads; i++)
{
args->incrS = (i)*increment +1;
args->incrF = (i+1)*increment +1;
pthread_create(&thread_id[i], NULL, getFactors, (void*)args);
usleep(5);
}
for(i=0; i<threads; i++)
{
pthread_join(thread_id[i], NULL);
}
sleep(1);
printf("done\n");
for (i=0; i+1<counter; i+=2)
printf("\n%llu : %llu", (unsigned long long)factors[i+1], (unsigned long long)factors[i]);
return 0;
}
void *getFactors(void *arg)
{
uint64_t count;
int myId;
int tempCounter = 0, i;
struct data *temp = (struct data *) arg;
uint64_t number = temp->num;
float total = temp->incrF - temp->incrS, percent;
myId = temp->member++;
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
for(count=temp->incrS; count<=temp->incrF; count++)
{
percent = (float)(count-temp->incrS)/total*100;
currentProgress[myId] = (int)percent;
if (number%count == 0)
{
factors[counter++] = count;
factors[counter++] = number/count;
}
usleep(1);
}
usleep(1);
pthread_exit(NULL);
}
void *displayProgressThread(void *arg)
{
struct data *temp = (struct data *) arg;
int toDelete = 0;
while(1)
{
int i;
int progressTotal = 0;
char *percent = malloc(sizeof(char)*20);
for(i=0; i<toDelete; i++)
printf("\b \b");
for(i=0; i<temp->threads; i++){
progressTotal += currentProgress[i];
}
progressTotal /= temp->threads;
printf("|");
for(i=0; i<50; i++)
if(i<progressTotal/2)
printf("#");
else
printf("_");
printf("| ");
sprintf(percent, "Percent: %d", progressTotal);
printf("%s", percent);
toDelete = 53 + strlen(percent);
usleep(1000);
fflush(stdout);
if(progressTotal >= 100)
pthread_exit(NULL);
}
}
Some pieces of code that the threads access are not synchronized, which causes this problem.
The first place that needs synchronization is:
myId = temp->member++;
More importantly, the main thread is doing:
args->incrS = (i)*increment +1;
args->incrF = (i+1)*increment +1;
while at the same time in the threads:
for(count=temp->incrS; count<= temp->incrF; count++)
{
percent = (float)(count-temp->incrS)/total*100;
currentProgress[myId] = (int)percent;
if (number%count == 0)
{
factors[counter++] = count;
factors[counter++] = number/count;
}
usleep(1);
}
The unsynchronized accesses mentioned above affect the calculation of the percent value, which produces the odd numbers you are seeing. You have to synchronize all of these places to get the behavior you expect.
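One possible way to restructure this, as a sketch under my own assumptions and not a drop-in patch: give each thread a private range_args struct that main never rewrites, and protect factors and counter with a mutex. The names shared_out, range_args and find_factors are hypothetical, the upper limit is passed in by the caller instead of being computed with sqrt, and the progress reporting is left out to keep the example short:

```c
#include <pthread.h>
#include <stdint.h>

#define MAX_THREADS 16

struct shared_out {
    pthread_mutex_t lock;   /* protects factors[] and counter */
    uint64_t *factors;
    int capacity;
    int counter;
};

struct range_args {
    uint64_t num, first, last;   /* this thread tests divisors first..last inclusive */
    struct shared_out *out;
};

static void *get_factors(void *arg)
{
    struct range_args *r = arg;   /* private to this thread */

    for (uint64_t c = r->first; c <= r->last; c++) {
        if (r->num % c != 0)
            continue;
        pthread_mutex_lock(&r->out->lock);
        if (r->out->counter + 2 <= r->out->capacity) {
            r->out->factors[r->out->counter++] = c;
            r->out->factors[r->out->counter++] = r->num / c;
        }
        pthread_mutex_unlock(&r->out->lock);
    }
    return NULL;
}

/* Tests the divisors 1..limit of num using `threads` threads.
 * Returns the number of entries stored in factors, or -1 on failure. */
int find_factors(uint64_t num, uint64_t limit, int threads,
                 uint64_t *factors, int capacity)
{
    pthread_t tid[MAX_THREADS];
    struct range_args args[MAX_THREADS];
    struct shared_out out = { .factors = factors, .capacity = capacity, .counter = 0 };
    int started = 0;

    if (threads < 1 || threads > MAX_THREADS)
        return -1;
    pthread_mutex_init(&out.lock, NULL);
    for (int i = 0; i < threads; i++) {
        args[i].num = num;
        args[i].first = limit * i / threads + 1;
        args[i].last = limit * (i + 1) / threads;
        args[i].out = &out;
        if (pthread_create(&tid[i], NULL, get_factors, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&out.lock);
    return started == threads ? out.counter : -1;
}
```

For num = 36 and limit = 6 this stores five divisor pairs (ten entries). The order of the pairs depends on scheduling, but no entry is lost or overwritten.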
Hello,
I have created a multithreaded application for multiplying two matrices using pthreads, but to my surprise the multithreaded program takes much more time than I expected.
I don't know where the problem is in my code. The code snippet is given below:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include "cv.h"
#include "cxcore.h"
CvMat * matA; /* first matrix */
CvMat * matB; /* second matrix */
CvMat * matRes; /* result matrix */
int size_x_a; /* this variable will be used for the first dimension */
int size_y_a; /* this variable will be used for the second dimension */
int size_x_b,size_y_b;
int size_x_res;
int size_y_res;
struct v {
int i; /* row */
int j; /* column */
};
/* thread functions, defined after main */
void *multiplication(void *param);
void *initializeA(void *arg);
void *initializeB(void *arg);
void *printThreadID(void *threadid)
{
/*long id = (long) threadid;
//printf("Thread ID: %ld\n", id);
arrZ[id] = arrX[id] + arrY[id];
pthread_exit(NULL);*/
return 0;
}
int main()
{
/* assigning the values of sizes */
size_x_a = 200;
size_y_a = 200;
size_x_b = 200;
size_y_b = 200;
/* resultant matrix dimensions */
size_x_res = size_x_a;
size_y_res = size_y_b;
matA = cvCreateMat(size_x_a,size_y_a,CV_64FC1);
matB = cvCreateMat(size_x_b,size_y_b,CV_64FC1);
matRes = cvCreateMat(size_x_res,size_y_res,CV_64FC1);
pthread_t thread1;
pthread_t thread2;
pthread_t multThread[200][200];
int res1;
int res2;
int mulRes;
/*******************************************************************************/
/*Creating a thread*/
res1 = pthread_create(&thread1,NULL,initializeA,(void*)matA);
if(res1!=0)
{
perror("thread creation of thread1 failed");
exit(EXIT_FAILURE);
}
/*Creating a thread*/
res2 = pthread_create(&thread2,NULL,initializeB,(void*)matB);
if(res2!=0)
{
perror("thread creation of thread2 failed");
exit(EXIT_FAILURE);
}
pthread_join(thread1,NULL);
pthread_join(thread2,NULL);
/*Multiplication of matrices*/
for(int i=0;i<size_x_a;i++)
{
for(int j=0;j<size_y_b;j++)
{
struct v * data = (struct v*)malloc(sizeof(struct v));
data->i = i;
data->j = j;
mulRes = pthread_create(&multThread[i][j],NULL,multiplication, (void*)data);
}
}
for(int i=0;i<size_x_a;i++)
{
for(int j=0;j<size_y_b;j++)
{
pthread_join(multThread[i][j],NULL);
}
}
for(int i =0;i<size_x_a;i++)
{
for(int j = 0;j<size_y_a;j++)
{
printf("%f ",cvmGet(matA,i,j));
}
}
return 0;
}
void * multiplication(void * param)
{
struct v * data = (struct v *)param;
double sum =0;
for(int k=0;k<size_x_a;k++)
sum += cvmGet(matA,data->i,k) * cvmGet(matB,k,data->j);
cvmSet(matRes,data->i,data->j,sum);
pthread_exit(0);
return 0;
}
void * initializeA(void * arg)
{
CvMat * matA = (CvMat*)arg;
//matA = (CvMat*)malloc(size_x_a * sizeof(CvMat *));
/* initializing values */
for (int i = 0; i < size_x_a; i++)
{
for (int j = 0; j < size_y_a; j++)
{
cvmSet(matA,i,j,size_y_a + j); /* just some unique number for each element */
}
}
return 0;
}
void * initializeB(void * arg)
{
CvMat* matB = (CvMat*)arg;
//matB = (CvMat*)malloc(size_x_b * sizeof(CvMat *));
/* initializing values */
for (int i = 0; i < size_x_b; i++)
{
for (int j = 0; j < size_y_b; j++)
{
cvmSet(matB,i,j,size_y_b + j); /* just some unique number for each element */
}
}
return 0;
}
void * initializeRes(void * arg)
{
CvMat * res = (CvMat*)arg;
//res = (CvMat*)malloc(size_x_res * sizeof(CvMat *));
/* for matrix matRes, allocate storage for an array of ints */
for (int i = 0; i < size_x_res; i++)
{
for (int j = 0; j < size_y_res; j++)
{
cvmSet(matRes,i,j,0);
}
}
return 0;
}
I am doing this multithreading for the first time.
Kindly help me with this,any suggestion or correction will be very helpful.
Thanks in advance.
You're creating a lot of threads (one per result element, 40,000 in total), which involves a lot of context switching. If each thread does pure computation and never waits on anything (networking, sockets, etc.), there is no reason for the threaded version to be faster than the non-threaded one, unless you are on a multi-CPU/multi-core machine, in which case you should create one thread per core. With this sort of processing, more threads than cores will just slow it down.
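As a rough illustration of "one thread per core", here is a sketch that splits the result matrix into bands of rows, one band per thread. It uses plain row-major double arrays instead of CvMat so it stands alone, the names band, multiply_band and multiply are hypothetical, and the number of workers is left to the caller (pass your core count):

```c
#include <pthread.h>
#include <stdlib.h>

struct band {
    const double *a, *b;         /* n x n inputs, row-major */
    double *res;                 /* n x n output, row-major */
    int n, row_begin, row_end;   /* this thread computes rows [row_begin, row_end) */
};

static void *multiply_band(void *arg)
{
    const struct band *w = arg;

    for (int i = w->row_begin; i < w->row_end; i++)
        for (int j = 0; j < w->n; j++) {
            double sum = 0.0;
            for (int k = 0; k < w->n; k++)
                sum += w->a[i * w->n + k] * w->b[k * w->n + j];
            w->res[i * w->n + j] = sum;   /* rows are disjoint, so no locking is needed */
        }
    return NULL;
}

/* res = a * b for n x n matrices using `workers` threads.
 * Returns 0 on success, -1 on failure. */
int multiply(const double *a, const double *b, double *res, int n, int workers)
{
    if (n < 1 || workers < 1)
        return -1;
    if (workers > n)
        workers = n;              /* never more threads than rows */
    pthread_t *tid = malloc(workers * sizeof *tid);
    struct band *bands = malloc(workers * sizeof *bands);
    int started = 0;

    if (tid != NULL && bands != NULL) {
        for (int t = 0; t < workers; t++) {
            bands[t] = (struct band){ a, b, res, n,
                                      n * t / workers, n * (t + 1) / workers };
            if (pthread_create(&tid[t], NULL, multiply_band, &bands[t]) != 0)
                break;
            started++;
        }
        for (int t = 0; t < started; t++)
            pthread_join(tid[t], NULL);
    }
    int ok = (tid != NULL && bands != NULL && started == workers);
    free(tid);
    free(bands);
    return ok ? 0 : -1;
}
```

For a 200 x 200 matrix and 4 workers this creates 4 threads of 50 rows each instead of 40,000 threads.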
What you could do is divide the work-set into tasks that can be enqueued, and have worker threads (one/CPU core) that will pull the tasks off of a common worker queue. This is a standard producer/consumer problem.
Here is some generic info about the producer/consumer problem.
It's been a long time since I've done matrix multiplication, so bear with me :) It appears that you could divide the following into separate tasks:
/*Multiplication of matrices*/
for(int i=0;i<size_x_a;i++)
{
for(int j=0;j<size_y_b;j++)
{
struct v * data = (struct v*)malloc(sizeof(struct v));
data->i = i;
data->j = j;
/* Instead of creating a thread, create a task and put it on the queue
* mulRes = pthread_create(&multThread[i][j],NULL,multiplication, (void*)data);
*/
/* I'm not going to implement the queue here, since there are several available.
* But remember that the queue access MUST be mutex protected. */
enqueue_task(data);
}
}
Beforehand, you need to have created what is called the thread pool (the worker threads, one per CPU core), whose worker function tries to pull tasks off the queue and execute them. This can be done with pthread condition variables: the threads block on the condition variable while the queue is empty, and once the queue is populated the condition variable is signalled, releasing the threads so they can start working.
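To make that concrete, here is one possible minimal queue plus pool, as a sketch only. It assumes a fixed-size ring buffer and tasks stored by value rather than malloc'ed, so this enqueue_task takes the queue as an extra parameter. All names (task, queue, pool_ctx, pool_worker, run_pool) are hypothetical, and the per-cell work is a placeholder for the real dot product:

```c
#include <pthread.h>

#define QUEUE_CAP 64
#define MAX_WORKERS 16

struct task { int i, j; };              /* one result cell, like struct v */

struct queue {
    struct task items[QUEUE_CAP];       /* ring buffer */
    int head, count, closed;            /* closed: no more tasks will arrive */
    pthread_mutex_t lock;               /* protects every field above */
    pthread_cond_t not_empty, not_full;
};

struct pool_ctx { struct queue *q; int n; int *cells; };

static void enqueue_task(struct queue *q, struct task t)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)       /* producer waits while the queue is full */
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAP] = t;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Returns 1 and fills *t, or 0 once the queue is closed and drained. */
static int dequeue_task(struct queue *q, struct task *t)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed) /* worker blocks while the queue is empty */
        pthread_cond_wait(&q->not_empty, &q->lock);
    int got = q->count > 0;
    if (got) {
        *t = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}

static void *pool_worker(void *arg)
{
    struct pool_ctx *ctx = arg;
    struct task t;

    while (dequeue_task(ctx->q, &t))    /* placeholder for the real per-cell work */
        ctx->cells[t.i * ctx->n + t.j] = t.i * t.j;
    return NULL;
}

/* Fills an n x n table using `workers` threads; returns 0 on success, -1 otherwise. */
int run_pool(int n, int workers, int *cells)
{
    pthread_t tid[MAX_WORKERS];
    struct queue q = { .head = 0, .count = 0, .closed = 0 };
    struct pool_ctx ctx = { &q, n, cells };
    int started = 0;

    if (workers < 1 || workers > MAX_WORKERS)
        return -1;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);
    for (int w = 0; w < workers; w++) {
        if (pthread_create(&tid[w], NULL, pool_worker, &ctx) != 0)
            break;
        started++;
    }
    if (started > 0)                    /* without consumers the producer could block forever */
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                enqueue_task(&q, (struct task){ i, j });
    pthread_mutex_lock(&q.lock);
    q.closed = 1;                       /* wake all workers so they can exit */
    pthread_cond_broadcast(&q.not_empty);
    pthread_mutex_unlock(&q.lock);
    for (int w = 0; w < started; w++)
        pthread_join(tid[w], NULL);
    pthread_mutex_destroy(&q.lock);
    pthread_cond_destroy(&q.not_empty);
    pthread_cond_destroy(&q.not_full);
    return started == workers ? 0 : -1;
}
```

Keep in mind that one task per matrix element is very fine-grained: every task costs a lock and unlock on the queue, so in practice a task per row (or block of rows) is likely a better unit of work.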
If this is not a logical division of work, and you can't find one, then perhaps this problem is not suitable for multi-threading.