C multithread performance issue

I am writing a multi-threaded program to traverse an n x n matrix, where the elements in the main diagonal are processed in a parallel manner, as shown in the code below:
int main(int argc, char * argv[] )
{
/* VARIABLES INITIALIZATION HERE */
gettimeofday(&start_t, NULL); //start timing
for (int slice = 0; slice < 2 * n - 1; ++slice)
{
z = slice < n ? 0 : slice - n + 1;
int L = 0;
pthread_t threads[slice-z-z+1];
struct thread_data td[slice-z-z+1];
for (int j=z; j<=slice-z; ++j)
{
td[L].index= L;
printf("create:%d\n", L );
pthread_create(&threads[L],NULL,mult_thread,(void *)&td[L]);
L++;
}
for (int j=0; j<L; j++)
{
pthread_join(threads[j],NULL);
}
}
gettimeofday(&end_t, NULL);
printf("Total time taken by CPU: %ld \n", ( (end_t.tv_sec - start_t.tv_sec)*1000000 + end_t.tv_usec - start_t.tv_usec));
return (0);
}
void *mult_thread(void *t)
{
struct thread_data *my_data= (struct thread_data*) t;
/* SOME ADDITIONAL CODE LINES HERE */
printf("ThreadFunction:%d\n", (*my_data).index );
return (NULL);
}
The problem is that this multithreaded implementation performs much worse than the serial (naive) implementation.
Are there any adjustments that could improve the performance of the multithreaded version?

A thread pool may improve this.
Define a new struct type as follows:
typedef struct {
struct thread_data * data;
int status; // 0: ready
// 1: adding data
// 2: data handling, 3: done
int next_free;
} thread_node;
init :
size_t thread_size = 8;
thread_node * nodes = (thread_node *)malloc(thread_size * sizeof(thread_node));
for(int i = 0 ; i < thread_size - 1 ; i++ ) {
nodes[i].next_free = i + 1;
nodes[i].status = 0 ;
}
nodes[thread_size - 1].next_free = -1;
int current_free_node = 0 ;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; // a mutex must be initialized before its first lock
get thread :
int alloc() {
pthread_mutex_lock(&mutex);
int rt = current_free_node;
if(current_free_node != -1) {
current_free_node = nodes[current_free_node].next_free;
nodes[rt].status = 1;
}
pthread_mutex_unlock(&mutex);
return rt;
}
return thread :
void back(int idx) {
pthread_mutex_lock(&mutex);
nodes[idx].next_free = current_free_node;
current_free_node = idx;
nodes[idx].status = 0;
pthread_mutex_unlock(&mutex);
}
Create the threads first, then use alloc() to get an idle thread and update its data pointer.
Don't use join to determine a thread's status.
Turn mult_thread into a loop; when a job finishes, just set its status to 3.
On each iteration of that loop you can hand the thread more work.
I hope this helps.
------------ UPDATED Apr. 23, 2015 -------------------
Here is an example.
Compile and run with the command:
$ g++ thread_pool.cc -o tp -pthread --std=c++
yu:thread_pool yu$ g++ tp.cc -o tp -pthread --std=c++11 && ./tp
1227135.147 1227176.546 1227217.944 1227259.340...
time cost 1 : 1068.339091 ms
1227135.147 1227176.546 1227217.944 1227259.340...
time cost 2 : 548.221607 ms
You can also remove the timer, in which case it also compiles as standard C99.
Currently the thread count is limited to 2. You can adjust the thread_size parameter, then recompile and run again. More threads may give you some further advantage (on my PC, changing the thread size to 4 finishes the task in 280 ms), but too many threads will not help much if you don't have enough CPU threads.

Related

How to solve the dining philosophers problem with only mutexes?

I wrote this program to solve the dining philosophers problem using Dijkstra's algorithm. Notice that I'm using an array of booleans (data->locked) instead of an array of binary semaphores.
I'm not sure if this solution is valid (hence the SO question).
Will access to the data->locked array in both the test and take_forks functions cause data races? If so, is it even possible to solve this problem using Dijkstra's algorithm with only mutexes?
I'm only allowed to use mutexes, no semaphores, no condition variables (it's an assignment).
Example of usage:
./a.out 4 1000 1000
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdbool.h>
#define NOT_HUNGRY 1
#define HUNGRY 2
#define EATING 3
#define RIGHT ((i + 1) % data->n)
#define LEFT ((i + data->n - 1) % data->n)
typedef struct s_data
{
int n;
int t_sleep;
int t_eat;
int *state;
bool *locked;
pthread_mutex_t *state_mutex;
} t_data;
typedef struct s_arg
{
t_data *data;
int i;
} t_arg;
int ft_min(int a, int b)
{
if (a < b)
return (a);
return (b);
}
int ft_max(int a, int b)
{
if (a > b)
return (a);
return (b);
}
// if the LEFT and RIGHT threads are not eating
// and thread number i is hungry, change its state to EATING
// and signal to the while loop in `take_forks` to stop blocking.
// if a thread has a state of HUNGRY then it's guaranteed
// to be out of the critical section of `take_forks`.
void test(int i, t_data *data)
{
if (
data->state[i] == HUNGRY
&& data->state[LEFT] != EATING
&& data->state[RIGHT] != EATING
)
{
data->state[i] = EATING;
data->locked[i] = false;
}
}
// set the state of the thread number i to HUNGRY
// and block until the LEFT and RIGHT threads are not EATING
// in which case they will call `test` from `put_forks`
// which will result in breaking the while loop
void take_forks(int i, t_data *data)
{
pthread_mutex_lock(data->state_mutex);
data->locked[i] = true;
data->state[i] = HUNGRY;
test(i, data);
pthread_mutex_unlock(data->state_mutex);
while (data->locked[i]);
}
// set the state of the thread number i to NOT_HUNGRY
// then signal to the LEFT and RIGHT threads
// so they can start eating when their neighbors are not eating
void put_forks(int i, t_data *data)
{
pthread_mutex_lock(data->state_mutex);
data->state[i] = NOT_HUNGRY;
test(LEFT, data);
test(RIGHT, data);
pthread_mutex_unlock(data->state_mutex);
}
void *philosopher(void *_arg)
{
t_arg *arg = _arg;
while (true)
{
printf("%d is thinking\n", arg->i);
take_forks(arg->i, arg->data);
printf("%d is eating\n", arg->i);
usleep(arg->data->t_eat * 1000);
put_forks(arg->i, arg->data);
printf("%d is sleeping\n", arg->i);
usleep(arg->data->t_sleep * 1000);
}
return (NULL);
}
void data_init(t_data *data, pthread_mutex_t *state_mutex, char **argv)
{
int i = 0;
data->n = atoi(argv[1]);
data->t_eat = atoi(argv[2]);
data->t_sleep = atoi(argv[3]);
pthread_mutex_init(state_mutex, NULL);
data->state_mutex = state_mutex;
data->state = malloc(data->n * sizeof(int));
data->locked = malloc(data->n * sizeof(bool));
while (i < data->n)
{
data->state[i] = NOT_HUNGRY;
data->locked[i] = true;
i++;
}
}
int main(int argc, char **argv)
{
pthread_mutex_t state_mutex;
t_data data;
t_arg *args;
pthread_t *threads;
int i;
if (argc != 4)
{
fputs("Error\nInvalid argument count\n", stderr);
return (1);
}
data_init(&data, &state_mutex, argv);
args = malloc(data.n * sizeof(t_arg));
i = 0;
while (i < data.n)
{
args[i].data = &data;
args[i].i = i;
i++;
}
threads = malloc(data.n * sizeof(pthread_t));
i = 0;
while (i < data.n)
{
pthread_create(threads + i, NULL, philosopher, args + i);
i++;
}
i = 0;
while (i < data.n)
pthread_join(threads[i++], NULL);
}
Your spin loop while (data->locked[i]); is a data race: you don't hold the lock while reading data->locked[i], so another thread could take the lock and write to that same variable while you are reading it. In fact, you rely on that happening. But this is undefined behavior.
Immediate practical consequences are that the compiler can delete the test (since in the absence of a data race, data->locked[i] could not change between iterations), or delete the loop altogether (since it's now an infinite loop, and nontrivial infinite loops are UB). Of course other undesired outcomes are also possible.
So you have to hold the mutex while testing the flag. If it's false, you should then hold the mutex until you set it true and do your other work; otherwise there is a race where another thread could get it first. If it's true, then drop the mutex, wait a little while, take it again, and retry.
(How long is a "little while", and what work you choose to do in between, are probably things you should test. Depending on what kind of fairness algorithms your pthread implementation uses, you might run into situations where take_forks succeeds in retaking the lock even if put_forks is also waiting to lock it.)
Of course, in a "real" program, you wouldn't do it this way in the first place; you'd use a condition variable.

How to control pthreads with multiple mutexes and conditions?

In the code below I wrote a program to perform add/remove operations on an int array using multithreading. The condition is that multiple threads cannot make operations on the same cell, but parallel operations can be made on different cells.
I thought in order to implement such conditions I'd need to use multiple mutexes and condition variables, to be exact, as many as there're cells in the array. The initial value of all cells of my array is 10 and threads increment/decrement this value by 3.
The code below seems to work (the cell values of the array after all threads finished working is as expected) but I don't understand a few things:
I first spawn adder threads which sleep for a second. In addition each thread has printf statement which is triggered if a thread waits. Remove threads don't sleep so I expect remove threads to invoke their printf statements because they must wait a second at least before adder threads finish their work. But remover threads never call printf.
My second concern: as I mentioned I first spawn adder threads so I expect the cells value go from 10 to 13. Then if remover thread acquires lock the value can go from 13 to 10 OR if adder thread acquires the lock then the cell value will go from 13 to 16. But I don't see the behavior in printf statements inside threads. For example one of the printf sequences I had: add thread id and cell id 1: cell value 10->13, then remove thread id and cell id 1: cell value 10->7 then add thread id and cell id 1: cell value 10->13. This doesn't make sense. I made sure that the threads all point to the same array.
Bottom line I'd like to know whether my solution is correct and if yes why is the behavior I described occurring. If my solution is incorrect I'd appreciate example of correct solution or at least general direction.
This is the code (all the logic is in AdderThread, RemoveThread):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#define ARR_LEN 5
#define THREADS_NUM 5
#define INIT_VAL 10
#define ADD_VAL 3
#define REMOVE_VAL 3
#define ADDER_LOOPS 2
typedef struct helper_t {
int threadId;
int * arr;
int * stateArr; //0 if free, 1 if busy
} helper_t;
enum STATE {FREE, BUSY};
enum ERRORS {MUTEX, COND, CREATE, JOIN, LOCK, UNLOCK, WAIT, BROADCAST};
pthread_mutex_t mutexArr[THREADS_NUM];
pthread_cond_t condArr[THREADS_NUM];
void errorHandler(int errorId) {
switch (errorId) {
case MUTEX:
printf("mutex error\n");
break;
case COND:
printf("cond error\n");
break;
case CREATE:
printf("create error\n");
break;
case JOIN:
printf("join error\n");
break;
case LOCK:
printf("lock error\n");
break;
case UNLOCK:
printf("unlock error\n");
break;
case WAIT:
printf("wait error\n");
break;
case BROADCAST:
printf("broadcast error\n");
break;
default:
printf("default switch\n");
break;
}
}
void mallocError() {
printf("malloc error\nExiting app\n");
exit(EXIT_FAILURE);
}
void initMutexesAndConds(pthread_mutex_t * mutexArr, pthread_cond_t * condArr) {
int i;
for(i = 0; i < THREADS_NUM; i++) {
pthread_mutex_init(&mutexArr[i], NULL);
pthread_cond_init(&condArr[i], NULL);
}
}
helper_t * initStructs(int * arr, int * stateArr) {
int i;
helper_t * helpers = (helper_t *) malloc(sizeof(helper_t) * THREADS_NUM);
if(!helpers) {
mallocError();
} else {
for(i = 0; i < THREADS_NUM; i++) {
helpers[i].threadId = i;
helpers[i].arr = arr;
helpers[i].stateArr = stateArr;
}
}
return helpers;
}
void printArr(int * arr, int len) {
int i;
for(i = 0; i < len; i++) {
printf("%d, ", arr[i]);
}
printf("\n");
}
void * AdderThread(void * arg) {
int i;
helper_t * h = (helper_t *) arg;
int id = h->threadId;
for(i = 0; i < ADDER_LOOPS; i++) {
pthread_mutex_t * mutex = &mutexArr[id];
pthread_cond_t * cond = &condArr[id];
if(pthread_mutex_lock(mutex)) {
errorHandler(LOCK);
}
while(h->stateArr[id] == BUSY) {
printf("adder id %d waiting...\n", id);
if(pthread_cond_wait(cond, mutex)) {
errorHandler(WAIT);
}
}
h->stateArr[id] = BUSY;
sleep(1);
h->arr[id] = h->arr[id] + ADD_VAL;
printf("add thread id and cell id %d: cell value %d->%d\n", id, h->arr[id]-ADD_VAL, h->arr[id]);
h->stateArr[id] = FREE;
if(pthread_cond_broadcast(cond)) {
errorHandler(BROADCAST);
}
if(pthread_mutex_unlock(mutex)) {
errorHandler(UNLOCK);
}
}
pthread_exit(NULL);
}
void * RemoveThread(void * arg) {
helper_t * h = (helper_t *) arg;
int id = h->threadId;
pthread_mutex_t * mutex = &mutexArr[id];
pthread_cond_t * cond = &condArr[id];
if(pthread_mutex_lock(mutex)) {
errorHandler(LOCK);
}
while(h->stateArr[id] == BUSY) {
printf("remover id %d waiting...\n", id);
if(pthread_cond_wait(cond, mutex)) {
errorHandler(WAIT);
}
}
h->stateArr[id] = BUSY;
h->arr[id] = h->arr[id] - REMOVE_VAL;
printf("remove thread id and cell id %d: cell value %d->%d\n", id, h->arr[id], h->arr[id]-ADD_VAL);
h->stateArr[id] = FREE;
if(pthread_cond_broadcast(cond)) {
errorHandler(BROADCAST);
}
if(pthread_mutex_unlock(mutex)) {
errorHandler(UNLOCK);
}
pthread_exit(NULL);
}
int main() {
int i;
helper_t * adderHelpers;
helper_t * removeHelpers;
pthread_t adders[THREADS_NUM];
pthread_t removers[THREADS_NUM];
int * arr = (int *) malloc(sizeof(int) * ARR_LEN);
int * stateArr = (int *) malloc(sizeof(int) * ARR_LEN);
if(!arr || !stateArr) {
mallocError();
}
for(i = 0; i < ARR_LEN; i++) {
arr[i] = INIT_VAL;
stateArr[i] = FREE;
}
initMutexesAndConds(mutexArr, condArr);
adderHelpers = initStructs(arr, stateArr);
removeHelpers = initStructs(arr, stateArr);
for(i = 0; i < THREADS_NUM; i++) {
pthread_create(&adders[i], NULL, AdderThread, &adderHelpers[i]);
pthread_create(&removers[i], NULL, RemoveThread, &removeHelpers[i]);
}
for(i = 0; i < THREADS_NUM; i++) {
pthread_join(adders[i], NULL);
pthread_join(removers[i], NULL);
}
printf("the results are:\n");
printArr(arr, THREADS_NUM);
printf("DONE.\n");
return 0;
}
1) This code sequence in AdderThread:
h->stateArr[id] = BUSY;
sleep(1);
h->arr[id] = h->arr[id] + ADD_VAL;
printf("add thread id and cell id %d: cell value %d->%d\n", id, h->arr[id]-ADD_VAL, h->arr[id]);
h->stateArr[id] = FREE;
is executed with the mutex locked; thus RemoveThread would never get a chance to see the state as anything but FREE.
2) There is no guarantee that mutex ownership alternates (afaik), but at the very least, to properly co-ordinate threads you should never rely upon such an implementation detail. It is the difference between working and “happens to work”, which usually leads to “used to work”....
If you put the sleep() between the mutex unlock and mutex lock, you might have a better case, but as it is, it just unlocks it then locks it again, so the system is well within its rights to just let it continue executing.
[ I ran out of space in comments ... ]:
Yes, the condition variables are doing nothing for you here. The idea of a condition variable is to be able to be notified when a significant event, such as a state change, has occurred on some shared object.
For example, a reservoir might have a single condition variable for the water level. Multiplexed onto that might be many conditions: level < 1m; level > 5m; level > 10m. To keep the systems independent (thus working), the bit that updates the level might just:
pthread_mutex_lock(&levellock);
level = x;
pthread_cond_broadcast(&newlevel);
pthread_mutex_unlock(&levellock);
The actors implementing the conditions would do something like:
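pthread_mutex_lock(&levellock);
while (1) {
if (my_condition_holds(level)) { // hypothetical predicate, e.g. level > 5
pthread_mutex_unlock(&levellock);
alert_the_media(); // hypothetical handler, called without holding the lock
pthread_mutex_lock(&levellock);
}
pthread_cond_wait(&newlevel, &levellock);
}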
Thus I can add many “condition monitors” without breaking the level setting code, or the overall system. Many is finite, but by releasing the mutex while I alert the media, I avoid having my water monitoring system rely on the alarm handling.
If you are familiar with “publish/subscribe”, you might find this familiar. This is fundamentally the same model, just the PS hides a pile of details.

How to make thread safe program?

On a 64-bit architecture pc, the next program should return the result 1.350948.
But it is not thread safe and every time I run it gives (obviously) a different result.
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <pthread.h>
const unsigned int ndiv = 1000;
double res = 0;
struct xval{
double x;
};
// Integrate exp(x^2 + y^2) over the unit circle on the
// first quadrant.
void* sum_function(void*);
void* sum_function(void* args){
unsigned int j;
double y = 0;
double localres = 0;
double x = ((struct xval*)args)->x;
for(j = 0; (x*x)+(y*y) < 1; y = (++j)*(1/(double)ndiv)){
localres += exp((x*x)+(y*y));
}
// Global variable:
res += (localres/(double)(ndiv*ndiv));
// This is not thread safe!
// mutex? futex? lock? semaphore? other?
}
int main(void){
unsigned int i;
double x = 0;
pthread_t thr[ndiv];
struct xval* xvarray;
if((xvarray = calloc(ndiv, sizeof(struct xval))) == NULL){
exit(EXIT_FAILURE);
}
for(i = 0; x < 1; x = (++i)*(1/(double)ndiv)){
xvarray[i].x = x;
pthread_create(&thr[i], NULL, &sum_function, &xvarray[i]);
// Should check return value.
}
for(i = 0; i < ndiv; i++){
pthread_join(thr[i], NULL);
// If
// pthread_join(thr[i], &retval);
// res += *((double*)retval) <-?
// there would be no problem.
}
printf("The integral of exp(x^2 + y^2) over the unit circle on\n\
the first quadrant is: %f\n", res);
return 0;
}
How can it be thread safe?
NOTE: I know that 1000 threads is not a good way to solve this problem, but I really really want to know how to write thread-safe c programs.
Compile the above program with
gcc ./integral0.c -lpthread -lm -o integral
pthread_mutex_lock(&my_mutex);
// code to make thread safe
pthread_mutex_unlock(&my_mutex);
Declare my_mutex as a global variable with a static initializer, like pthread_mutex_t my_mutex = PTHREAD_MUTEX_INITIALIZER;, or initialize it in code using pthread_mutex_t my_mutex; pthread_mutex_init(&my_mutex, NULL);. Also don't forget to #include <pthread.h> and to link your program with -lpthread when compiling.
The question (in a comment in the code):
// mutex? futex? lock? semaphore? other?
Answer: mutex.
See pthread_mutex_init, pthread_mutex_lock, and pthread_mutex_unlock.

Thread pool - handle a case when there are more tasks than threads

I've just started with multithreaded programming and, as an exercise, am trying to implement a simple thread pool using pthreads.
I have tried to use a condition variable to signal worker threads that there are jobs waiting in the queue, but for a reason I can't figure out, the mechanism is not working.
Below are the relevant code snippets:
typedef struct thread_pool_task
{
void (*computeFunc)(void *);
void *param;
} ThreadPoolTask;
typedef enum thread_pool_state
{
RUNNING = 0,
SOFT_SHUTDOWN = 1,
HARD_SHUTDOWN = 2
} ThreadPoolState;
typedef struct thread_pool
{
ThreadPoolState poolState;
unsigned int poolSize;
unsigned int queueSize;
OSQueue* poolQueue;
pthread_t* threads;
pthread_mutex_t q_mtx;
pthread_cond_t q_cnd;
} ThreadPool;
static void* threadPoolThread(void* threadPool){
ThreadPool* pool = (ThreadPool*)(threadPool);
for(;;)
{
/* Lock must be taken to wait on conditional variable */
pthread_mutex_lock(&(pool->q_mtx));
/* Wait on condition variable, check for spurious wakeups.
When returning from pthread_cond_wait(), we own the lock. */
while( (pool->queueSize == 0) && (pool->poolState == RUNNING) )
{
pthread_cond_wait(&(pool->q_cnd), &(pool->q_mtx));
}
printf("Queue size: %d\n", pool->queueSize);
/* --- */
if (pool->poolState != RUNNING){
break;
}
/* Grab our task */
ThreadPoolTask* task = osDequeue(pool->poolQueue);
pool->queueSize--;
/* Unlock */
pthread_mutex_unlock(&(pool->q_mtx));
/* Get to work */
(*(task->computeFunc))(task->param);
free(task);
}
pthread_mutex_unlock(&(pool->q_mtx));
pthread_exit(NULL);
return(NULL);
}
ThreadPool* tpCreate(int numOfThreads)
{
ThreadPool* threadPool = malloc(sizeof(ThreadPool));
if(threadPool == NULL) return NULL;
/* Initialize */
threadPool->poolState = RUNNING;
threadPool->poolSize = numOfThreads;
threadPool->queueSize = 0;
/* Allocate OSQueue and threads */
threadPool->poolQueue = osCreateQueue();
if (threadPool->poolQueue == NULL)
{
}
threadPool->threads = malloc(sizeof(pthread_t) * numOfThreads);
if (threadPool->threads == NULL)
{
}
/* Initialize mutex and conditional variable */
pthread_mutex_init(&(threadPool->q_mtx), NULL);
pthread_cond_init(&(threadPool->q_cnd), NULL);
/* Start worker threads */
for(int i = 0; i < threadPool->poolSize; i++)
{
pthread_create(&(threadPool->threads[i]), NULL, threadPoolThread, threadPool);
}
return threadPool;
}
int tpInsertTask(ThreadPool* threadPool, void (*computeFunc) (void *), void* param)
{
if(threadPool == NULL || computeFunc == NULL) {
return -1;
}
/* Check state and create ThreadPoolTask */
if (threadPool->poolState != RUNNING) return -1;
ThreadPoolTask* newTask = malloc(sizeof(ThreadPoolTask));
if (newTask == NULL) return -1;
newTask->computeFunc = computeFunc;
newTask->param = param;
/* Add task to queue */
pthread_mutex_lock(&(threadPool->q_mtx));
osEnqueue(threadPool->poolQueue, newTask);
threadPool->queueSize++;
pthread_cond_signal(&(threadPool->q_cnd));
pthread_mutex_unlock(&threadPool->q_mtx);
return 0;
}
The problem is that when I create a pool with 1 thread and add a lot of jobs to it, it does not execute all the jobs.
[EDIT:]
I have tried running the following code to test basic functionality:
void hello (void* a)
{
int i = *((int*)a);
printf("hello: %d\n", i);
}
void test_thread_pool_sanity()
{
int i;
ThreadPool* tp = tpCreate(1);
for(i=0; i<10; ++i)
{
tpInsertTask(tp,hello,(void*)(&i));
}
}
I expected output like the following:
hello: 0
hello: 1
hello: 2
hello: 3
hello: 4
hello: 5
hello: 6
hello: 7
hello: 8
hello: 9
Instead, I sometimes get the following output:
Queue size: 9 //printf added for debugging within threadPoolThread
hello: 9
Queue size: 9 //printf added for debugging within threadPoolThread
hello: 0
And sometimes I don't get any output at all.
What is the thing I'm missing?
When you call tpInsertTask(tp,hello,(void*)(&i)); you are passing the address of i, which is on the stack. There are multiple problems with this:
Every task gets the same address. I am guessing the hello function takes that address and prints *param, and all of those pointers refer to the same location on the stack.
Since i is on the stack, once test_thread_pool_sanity returns its last value is lost and the memory will be overwritten by other code, so the value read is undefined.
Depending on when the worker thread works through the tasks versus when your main test thread schedules them, you will get different results.
You need to save the parameter's value as part of the task in order to guarantee that it is unique per task.
EDIT: You should also check the return code of pthread_create to see if it is failing.

How do we extend the program from 4 to 8 threads in C [closed]

I have started learning C at university, and now I'm stuck on POSIX threads. I have a program with single-thread, 2-thread and 4-thread versions as examples from a lecture. I need help extending this program from 4 to 8/16/32 threads; will that make a difference in performance or not?
Thank you in advance.
Here is the code for the 4-thread program:
/****************************************************************************
This program finds groups of three numbers that when multiplied together
equal 98931313. Compile with:
cc -o factorise4 factorise4.c -lrt -pthread
Kevan Buckley, University of Wolverhampton, October 2012
*****************************************************************************/
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
#include <errno.h>
#include <sys/stat.h>
#include <string.h>
#include <time.h>
#include <pthread.h>
#include <math.h>
#define goal 98931313
typedef struct arguments {
int start;
int n;
} arguments_t;
void factorise(int n) {
pthread_t t1, t2, t3, t4;
//1st pthread
arguments_t t1_arguments;
t1_arguments.start = 0;
t1_arguments.n = n;
//2nd pthread
arguments_t t2_arguments;
t2_arguments.start = 250;
t2_arguments.n = n;
//3rd pthread
arguments_t t3_arguments;
t3_arguments.start = 500;
t3_arguments.n = n;
//4th pthread
arguments_t t4_arguments;
t4_arguments.start = 750;
t4_arguments.n = n;
void *find_factors();
//creating threads
pthread_create(&t1, NULL, find_factors, &t1_arguments);
pthread_create(&t2, NULL, find_factors, &t2_arguments);
pthread_create(&t3, NULL, find_factors, &t3_arguments);
pthread_create(&t4, NULL, find_factors, &t4_arguments);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
pthread_join(t3, NULL);
pthread_join(t4, NULL);
}
// Use three nested loops, one per factor, and test every combination against 98931313.
void *find_factors(arguments_t *args){
int a, b, c;
for(a=args->start;a<args->start+250;a++){
for(b=0;b<1000;b++){
for(c=0;c<1000;c++){
if(a*b*c == args->n){
printf("solution is %d, %d, %d\n", a, b, c);// Printing out the answer
}
}
}
}
}
// Calculate the difference between two times.
long long int time_difference(struct timespec *start, struct timespec *finish, long long int *difference) {
long long int ds = finish->tv_sec - start->tv_sec;
long long int dn = finish->tv_nsec - start->tv_nsec;
if(dn < 0 ) {
ds--;
dn += 1000000000;
}
*difference = ds * 1000000000 + dn;
return !(*difference > 0);
}
//Prints elapsed time
int main() {
struct timespec start, finish;
long long int time_elapsed;
clock_gettime(CLOCK_MONOTONIC, &start);
factorise(goal); //This is our goal = 98931313
clock_gettime(CLOCK_MONOTONIC, &finish);
time_difference(&start, &finish, &time_elapsed);
printf("Time elapsed was %lldns or %0.9lfs\n", time_elapsed, (time_elapsed/1.0e9));
return 0;
}
I'll give you a hint:
If you call a function twice manually, you can put its results into two separate variables:
int y0 = f(0);
int y1 = f(1);
You can also put them into one array:
int y[2];
y[0] = f(0);
y[1] = f(1);
Or into a memory area on the heap (obtained via malloc()):
int * y = malloc(2 * sizeof(*y));
y[0] = f(0);
y[1] = f(1);
In the latter two cases, you can replace the two function calls with
for (i = 0; i < 2; i++) {
y[i] = f(i);
}
Another hint:
If you change the number of threads, you will also have to change your parameter set.
And another hint:
Thread creation, in your case, can be put into a function:
void facthread_create(pthread_t * thread, int start, int n)
{
arguments_t arguments;
arguments.start = start;
arguments.n = n;
void *find_factors();
//creating thread
pthread_create(thread, NULL, find_factors, &arguments);
}
But there is a caveat: we have a race condition here. As soon as the thread has been created, facthread_create can return, and the stack space occupied by arguments is freed, possibly before the thread has read it. So we use an improved version here, in which the two sides cooperate:
We add a field to arguments_t:
typedef struct arguments {
char used;
int start;
int n;
} arguments_t;
We set used to 0:
void facthread_create(pthread_t * thread, int start, int n)
{
arguments_t arguments;
arguments.start = start;
arguments.n = n;
arguments.used = 0;
void *find_factors();
//creating thread
pthread_create(thread, NULL, find_factors, &arguments);
while (!arguments.used); // wait until thread has "really" started
}
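Set used to 1 once the data has been safely copied. (Strictly speaking, used should be an atomic or mutex-protected flag, because the unsynchronized busy-wait above is a data race.)
void *find_factors(arguments_t *args){
arguments_t my_args = *args; // struct assignment is valid C and copies every member
args->used = 1; // Inform the caller that it is safe to continue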
int a, b, c;
for(a=my_args.start;a<my_args.start+250;a++){
...
You should get the number of threads from a command-line parameter (maybe -t for threads). Then, instead of calling factorise from main, have a for loop that creates each thread with parameters calculated from the loop index. Something like:
for (int i = 0; i < threads; i++) {
arguments[i].start = 250 * i;
arguments[i].n = n;
pthread_create(...); // pass &arguments[i] as the last parameter
}
Note that you should allocate the argument structs (one per thread) before the for loop, so that each thread gets its own struct.
Let me know if you need more help.
Here is some more help:
0) Get the number of threads and the skip (in your case 250) from the command line.
1) Create a control struct which contains the args for the thread, the thread id, etc.
2) Using the args, allocate the control structs and fill them in.
3) Do a for loop to spawn off the threads.
4) Do another for loop to wait for the threads to complete.
For some extra complexity, you could introduce a global variable which any thread could set to signal the other threads that the work is done and they should exit. But don't do this until you get the simple case correct.
If you post some updated code, I will help you some more.

Resources