Accelerate a C program using pthreads


I'm new here, and I'm also relatively new to programming in general. I've written a program in C and I need to accelerate it using pthreads. I tried to do so using OpenMP, but I don't know how to debug it. I also need to find out whether the program is faster with pthreads and measure the times, but I don't know how to write this in my code. Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <pthread.h>
#define NTHREADS 2
#define FYLLO(komvos) ((komvos) * 2 + 1)
long factorial(long);
void heap_function (int [], int, int );
void make_heap(long [], int );
void pop_heap(long [], int );
struct thread_data
{
long int n;
long int k;
long *b;
};
main()
{
long int n,k,c,fact=1;
long *a,*b,*d,p[k];
int i,j,rc;
int q[]={2,3,4,5,6,7,8,9,12,13,14,15,16};
pthread_t thread[NTHREADS];
struct thread_data threada;
for(i=0;i<NTHREADS;i++)
{
threada.n=n;
threada.k=k;
threada.b=b;
pthread_create (&thread[i], NULL, (void *)&threada);
}
for (i=0; i<NTHREADS; i++)
rc = pthread_join (thread[i], NULL);
for(i=0;i<13;i++)
{
k=pow(2,q[i])-1;
if(a=(long*)malloc(i*sizeof(long))==NULL);
{
printf("Den yparxei diathesimi mnimi gia desmeusi\n");
exit(1);
}
a[i]=k;
for(a[0];a[13];a[i]++)
{
n=(pow(2,q[i]))*k;
if(d=(long*)malloc((i*i)*sizeof(long))==NULL);
{
printf("Den yparxei diathesimi mnimi gia desmeusi\n");
exit(1);
}
d[i]=n;
}
c=(factorial(n))/((factorial(k))*(factorial(n-k)));
}
if(b=(long*)malloc(((i*i)+i)*sizeof(long))==NULL)
{
printf("Den yparxei diathesimi mnimi gia desmeusi\n");
exit(1);
}
for(i=0;i<13;i++)
{
b[i]=a[i];
}
for(i=13;i<182;i++) /* For i=13 we have i^2=169 and i^2+i=182 */
{
b[i]=d[i];
}
long heap[sizeof(b)];
make_heap( heap, sizeof(b) );
printf("To heap einai:\n");
for ( i = sizeof(b); i >=0; i-- )
{
printf( "%d ", heap[0] );
pop_heap( heap, i );
}
for(i=(n-k);i<=n;i++)
for(j=0;j<k;j++)
{
p[j]=heap[i];
printf("Ta %d mikrotera stoixeia eina ta %ld\n",k,p[j]);
}
free((void*)b);
getch();
}
long factorial(long n)
{
int a;
long result=1;
for( a=1;a<=n;a++ )
result=result*a;
return(result);
}
void heap_function( int a[], int i, int n )
{
while ( FYLLO( i ) < n ) /* Place the elements into the heap as subtrees */
{
int fyllo = FYLLO( i );
if ( fyllo + 1 < n && a[fyllo] < a[fyllo + 1] ) /* Pick the larger of the node's two children */
++fyllo;
if ( a[i] < a[fyllo] ) /* Move the larger node toward the root */
{
int k = a[i];
a[i] = a[fyllo];
a[fyllo] = k;
}
++i; /* Continue with the next node */
}
}
void make_heap( long a[], int n ) /* make_heap puts the elements we were given
into the heap and orders them */
{
int i = n / 2;
while ( i-- > 0 )
heap_function( a, i, n );
}
void pop_heap( long heap[], int n ) /* pop_heap extracts the elements
from the heap, from the largest to the smallest */
{
long k = heap[0];
heap[0] = heap[n];
heap[n] = k;
heap_function( heap, 0, n ); /* After the first element has been output, call heap_function
to re-order the elements that remain in the heap */
}
Sorry for my messed-up post; I'm new here and still getting used to the site.

Adding threads may not accelerate your program; it lets you organize your work into execution units which can appear to run in parallel (and on multi-core systems, generally can run in parallel). If you're not on a multi-core system you can still gain an advantage if one or more of your threads must block waiting for slow input, because the other thread(s) can continue to run; this may or may not give you a faster runtime, depending on your actual program.
Debugging threads is generally more difficult than debugging a single thread, and how to do it comes down to the tools you have available. If your debugger is not able to make the job easier for you, I would recommend you first make your program run serially -- still break it up using a threaded model, but let the code for each run in the primary thread and let it run till completion, if your model permits this. Many threaded applications cannot be written like that because threads depend on each other during runtime, but it just depends on what you're doing exactly.
Now to your specific situation: you're diving into the deep end when you don't know how to swim yet. I would suggest you first learn to use threads without the complexity of why you need them; otherwise you're making the problem more complicated than it needs to be. http://cs.gmu.edu/~white/CS571/Examples/Pthread/create.c has a simple example to get started with. Take particular notice of the parameters of the pthread_create() call and compare them to what you've done; your code is missing the 3rd parameter, the function to run as a thread. You appear to have no such function at all, and instead you seem to believe that the code following the call to pthread_create() is what runs in parallel. That is how fork() works, but pthread_create() is very different.
That should be enough to get you started. http://cs.gmu.edu/~white/CS571/Examples/pthread_examples.html has additional examples, and a google of "pthread tutorial" would probably be helpful.

Related

How to solve the dining philosophers problem with only mutexes?

I wrote this program to solve the dining philosophers problem using Dijkstra's algorithm. Note that I'm using an array of booleans (data->locked) instead of an array of binary semaphores.
I'm not sure if this solution is valid (hence the SO question).
Will access to the data->locked array in both the test and take_forks functions cause data races? If so, is it even possible to solve this problem using Dijkstra's algorithm with only mutexes?
I'm only allowed to use mutexes, no semaphores, no condition variables (it's an assignment).
Example of usage:
./a.out 4 1000 1000
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdbool.h>
#define NOT_HUNGRY 1
#define HUNGRY 2
#define EATING 3
#define RIGHT ((i + 1) % data->n)
#define LEFT ((i + data->n - 1) % data->n)
typedef struct s_data
{
int n;
int t_sleep;
int t_eat;
int *state;
bool *locked;
pthread_mutex_t *state_mutex;
} t_data;
typedef struct s_arg
{
t_data *data;
int i;
} t_arg;
int ft_min(int a, int b)
{
if (a < b)
return (a);
return (b);
}
int ft_max(int a, int b)
{
if (a > b)
return (a);
return (b);
}
// if the LEFT and RIGHT threads are not eating
// and thread number i is hungry, change its state to EATING
// and signal to the while loop in `take_forks` to stop blocking.
// if a thread has a state of HUNGRY then it's guaranteed
// to be out of the critical section of `take_forks`.
void test(int i, t_data *data)
{
if (
data->state[i] == HUNGRY
&& data->state[LEFT] != EATING
&& data->state[RIGHT] != EATING
)
{
data->state[i] = EATING;
data->locked[i] = false;
}
}
// set the state of the thread number i to HUNGRY
// and block until the LEFT and RIGHT threads are not EATING
// in which case they will call `test` from `put_forks`
// which will result in breaking the while loop
void take_forks(int i, t_data *data)
{
pthread_mutex_lock(data->state_mutex);
data->locked[i] = true;
data->state[i] = HUNGRY;
test(i, data);
pthread_mutex_unlock(data->state_mutex);
while (data->locked[i]);
}
// set the state of the thread number i to NOT_HUNGRY
// then signal to the LEFT and RIGHT threads
// so they can start eating when their neighbors are not eating
void put_forks(int i, t_data *data)
{
pthread_mutex_lock(data->state_mutex);
data->state[i] = NOT_HUNGRY;
test(LEFT, data);
test(RIGHT, data);
pthread_mutex_unlock(data->state_mutex);
}
void *philosopher(void *_arg)
{
t_arg *arg = _arg;
while (true)
{
printf("%d is thinking\n", arg->i);
take_forks(arg->i, arg->data);
printf("%d is eating\n", arg->i);
usleep(arg->data->t_eat * 1000);
put_forks(arg->i, arg->data);
printf("%d is sleeping\n", arg->i);
usleep(arg->data->t_sleep * 1000);
}
return (NULL);
}
void data_init(t_data *data, pthread_mutex_t *state_mutex, char **argv)
{
int i = 0;
data->n = atoi(argv[1]);
data->t_eat = atoi(argv[2]);
data->t_sleep = atoi(argv[3]);
pthread_mutex_init(state_mutex, NULL);
data->state_mutex = state_mutex;
data->state = malloc(data->n * sizeof(int));
data->locked = malloc(data->n * sizeof(bool));
while (i < data->n)
{
data->state[i] = NOT_HUNGRY;
data->locked[i] = true;
i++;
}
}
int main(int argc, char **argv)
{
pthread_mutex_t state_mutex;
t_data data;
t_arg *args;
pthread_t *threads;
int i;
if (argc != 4)
{
fputs("Error\nInvalid argument count\n", stderr);
return (1);
}
data_init(&data, &state_mutex, argv);
args = malloc(data.n * sizeof(t_arg));
i = 0;
while (i < data.n)
{
args[i].data = &data;
args[i].i = i;
i++;
}
threads = malloc(data.n * sizeof(pthread_t));
i = 0;
while (i < data.n)
{
pthread_create(threads + i, NULL, philosopher, args + i);
i++;
}
i = 0;
while (i < data.n)
pthread_join(threads[i++], NULL);
}

Your spin loop while (data->locked[i]); is a data race: you don't hold the lock while reading data->locked[i], so another thread could take the lock and write to that same variable while you are reading it. In fact, you rely on that happening. But this is undefined behavior.
Immediate practical consequences are that the compiler can delete the test (since in the absence of a data race, data->locked[i] could not change between iterations), or delete the loop altogether (since it's now an infinite loop, and nontrivial infinite loops are UB). Of course other undesired outcomes are also possible.
So you have to hold the mutex while testing the flag. If it's false, you should then hold the mutex until you set it true and do your other work; otherwise there is a race where another thread could get it first. If it's true, then drop the mutex, wait a little while, take it again, and retry.
(How long is a "little while", and what work you choose to do in between, are probably things you should test. Depending on what kind of fairness algorithms your pthread implementation uses, you might run into situations where take_forks succeeds in retaking the lock even if put_forks is also waiting to lock it.)
Of course, in a "real" program, you wouldn't do it this way in the first place; you'd use a condition variable.

C multithread performance issue

I am writing a multi-threaded program to traverse an n x n matrix, where the elements in the main diagonal are processed in a parallel manner, as shown in the code below:
int main(int argc, char * argv[] )
{
/* VARIABLES INITIALIZATION HERE */
gettimeofday(&start_t, NULL); //start timing
for (int slice = 0; slice < 2 * n - 1; ++slice)
{
z = slice < n ? 0 : slice - n + 1;
int L = 0;
pthread_t threads[slice-z-z+1];
struct thread_data td[slice-z-z+1];
for (int j=z; j<=slice-z; ++j)
{
td[L].index= L;
printf("create:%d\n", L );
pthread_create(&threads[L],NULL,mult_thread,(void *)&td[L]);
L++;
}
for (int j=0; j<L; j++)
{
pthread_join(threads[j],NULL);
}
}
gettimeofday(&end_t, NULL);
printf("Total time taken by CPU: %ld \n", ( (end_t.tv_sec - start_t.tv_sec)*1000000 + end_t.tv_usec - start_t.tv_usec));
return (0);
}
void *mult_thread(void *t)
{
struct thread_data *my_data= (struct thread_data*) t;
/* SOME ADDITIONAL CODE LINES HERE */
printf("ThreadFunction:%d\n", (*my_data).index );
return (NULL);
}
The problem is that this multithreaded implementation performs much worse than the serial (naive) implementation.
Are there any adjustments that could improve the performance of the multithreaded version?

A thread pool may improve this.
Define a new struct type as follows:
typedef struct {
struct thread_data * data;
int status; // 0: ready
// 1: adding data
// 2: data handling, 3: done
int next_free;
} thread_node;
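Initialization:
size_t thread_size = 8;
thread_node * nodes = (thread_node *)malloc(thread_size * sizeof(thread_node));
for(size_t i = 0 ; i < thread_size - 1 ; i++ ) {
nodes[i].next_free = i + 1;
nodes[i].status = 0 ;
}
nodes[thread_size - 1].next_free = -1; // end of the free list
nodes[thread_size - 1].status = 0 ; // the loop above does not reach the last node
int current_free_node = 0 ;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; // a mutex must be initialized before use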
Get a thread:
int alloc() {
pthread_mutex_lock(&mutex);
int rt = current_free_node;
if(current_free_node != -1) {
current_free_node = nodes[current_free_node].next_free;
nodes[rt].status = 1;
}
pthread_mutex_unlock(&mutex);
return rt;
}
Return a thread:
void back(int idx) {
pthread_mutex_lock(&mutex);
nodes[idx].next_free = current_free_node;
current_free_node = idx;
nodes[idx].status = 0;
pthread_mutex_unlock(&mutex);
}
Create the threads first, then call alloc() to try to get an idle thread and update its data pointer.
Don't use join to determine the status.
Change mult_thread into a loop, and once a job has finished, just set its status to 3.
On each iteration of the loop in the thread, you can give it more work.
I hope this helps.
------------ UPDATED Apr. 23, 2015 -------------------
Here is an example.
Compile and run it with the command:
$ g++ thread_pool.cc -o tp -pthread --std=c++
yu:thread_pool yu$ g++ tp.cc -o tp -pthread --std=c++11 && ./tp
1227135.147 1227176.546 1227217.944 1227259.340...
time cost 1 : 1068.339091 ms
1227135.147 1227176.546 1227217.944 1227259.340...
time cost 2 : 548.221607 ms
You can also remove the timer, after which it can be compiled as a standard C99 file.
At present the thread count is limited to 2. You can adjust the thread_size parameter, then recompile and run again. More threads may give you some further advantage (on my PC, changing the thread count to 4 finishes the task in 280 ms), but too many threads will not help much if you do not have enough CPU threads.

How do we extend the program from 4 to 8 threads in C [closed]

I have started learning C at university, and now I'm stuck on POSIX threads. I have a program from a lecture that comes in single-thread, 2-thread and 4-thread versions. I need your help to extend it from 4 to 8/16/32 threads, and to understand whether that will make a performance difference.
Thank you in advance.
Here is the code for the 4-thread program:
/****************************************************************************
This program finds groups of three numbers that when multiplied together
equal 98931313. Compile with:
cc -o factorise4 factorise4.c -lrt -pthread
Kevan Buckley, University of Wolverhampton, October 2012
*****************************************************************************/
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
#include <errno.h>
#include <sys/stat.h>
#include <string.h>
#include <time.h>
#include <pthread.h>
#include <math.h>
#define goal 98931313
typedef struct arguments {
int start;
int n;
} arguments_t;
void factorise(int n) {
pthread_t t1, t2, t3, t4;
//1st pthread
arguments_t t1_arguments;
t1_arguments.start = 0;
t1_arguments.n = n;
//2nd pthread
arguments_t t2_arguments;
t2_arguments.start = 250;
t2_arguments.n = n;
//3rd pthread
arguments_t t3_arguments;
t3_arguments.start = 500;
t3_arguments.n = n;
//4th pthread
arguments_t t4_arguments;
t4_arguments.start = 750;
t4_arguments.n = n;
void *find_factors();
//creating threads
pthread_create(&t1, NULL, find_factors, &t1_arguments);
pthread_create(&t2, NULL, find_factors, &t2_arguments);
pthread_create(&t3, NULL, find_factors, &t3_arguments);
pthread_create(&t4, NULL, find_factors, &t4_arguments);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
pthread_join(t3, NULL);
pthread_join(t4, NULL);
}
//Three nested loops, one per factor: every combination is tried, and each one whose product equals the target is printed.
void *find_factors(arguments_t *args){
int a, b, c;
for(a=args->start;a<args->start+250;a++){
for(b=0;b<1000;b++){
for(c=0;c<1000;c++){
if(a*b*c == args->n){
printf("solution is %d, %d, %d\n", a, b, c);// Printing out the answer
}
}
}
}
}
// Calculate the difference between two times.
long long int time_difference(struct timespec *start, struct timespec *finish, long long int *difference) {
long long int ds = finish->tv_sec - start->tv_sec;
long long int dn = finish->tv_nsec - start->tv_nsec;
if(dn < 0 ) {
ds--;
dn += 1000000000;
}
*difference = ds * 1000000000 + dn;
return !(*difference > 0);
}
//Prints elapsed time
int main() {
struct timespec start, finish;
long long int time_elapsed;
clock_gettime(CLOCK_MONOTONIC, &start);
factorise(goal); //This is our goal = 98931313
clock_gettime(CLOCK_MONOTONIC, &finish);
time_difference(&start, &finish, &time_elapsed);
printf("Time elaipsed was %lldns or %0.9lfs\n", time_elapsed, (time_elapsed/1.0e9));
return 0;
}

I'll give you a hint:
If you call a function twice manually, you can put its results into two separate variables:
int y0 = f(0);
int y1 = f(1);
You can also put them into one array:
int y[2];
y[0] = f(0);
y[1] = f(1);
Or into a memory area on heap (obtained via malloc()):
int * y = malloc(2 * sizeof(*y));
y[0] = f(0);
y[1] = f(1);
In the latter two cases, you can replace the two function calls with
for (i = 0; i < 2; i++) {
y[i] = f(i);
}
Another hint:
If you change the number of threads, you will also have to change your parameter set.
And another hint:
Thread creation, in your case, can be put into a function:
void facthread_create(pthread_t * thread, int start, int n)
{
arguments_t arguments;
arguments.start = start;
arguments.n = n;
void *find_factors();
//creating thread
pthread_create(thread, NULL, find_factors, &arguments);
}
But - there is a caveat: we have a race condition here. As soon as the thread starts, we can return and the stack space occupied by arguments is freed. So we use an improved version here which is useful for cooperation:
We add a field to arguments_t:
typedef struct arguments {
char used;
int start;
int n;
} arguments_t;
We set used to 0:
void facthread_create(pthread_t * thread, int start, int n)
{
arguments_t arguments;
arguments.start = start;
arguments.n = n;
arguments.used = 0;
void *find_factors();
//creating thread
pthread_create(thread, NULL, find_factors, &arguments);
while (!arguments.used); // wait until thread has "really" started
}
Set used to 1 once the data has been safely copied:
void *find_factors(arguments_t *args){
arguments_t my_args = *args; // struct assignment is valid C and copies every member
args->used = 1; // Inform the caller that it is safe to continue
int a, b, c;
for(a=my_args.start;a<my_args.start+250;a++){
...

You should get a command line parameter (maybe -t for threads). Then instead of calling factorise from main, have a for loop which does the thread create with the parameter which is calculated from the loop number. Something like:
for (int i = 0; i < threads; i++) {
arguments[i].start = 250 * i;
arguments[i].n = n;
pthread_create(...)
}
Note that you should allocate the argument structs before the for loop for clarity.
Let me know if you need more help.
Here is some more help:
0) get the number of threads and the skip (in your case 250) from the command line.
1) create a control struct which contains the args for the thread, the thread id, etc.
2) using the args, allocate the control struct and fill it in.
3) do a for loop to spawn off the threads.
4) do another for loop to wait for the threads to complete.
For some extra complexity, you could introduce a global variable which any thread could set to signal the other threads that the work is done and they should exit. But don't do this until you get the simple case correct.
If you post some updated code, I will help you some more.

Thread Programming... No output in terminal

I'm doing thread programming and trying to implement the Monte Carlo technique for calculating the value of Pi. The code compiles without errors, but when I execute it I get no output. Kindly correct me if there's any mistake.
Here's my code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>
#define frand() ((double) rand() / (RAND_MAX))
#define MAX_LEN 1353
const size_t N = 4;
float circlePoints=0;
void* point_counter(void *param){
float xcord;
float ycord;
while(MAX_LEN){
xcord=frand();
ycord=frand();
float cord = (xcord*xcord) + (ycord*ycord);
if(cord <= 1){
circlePoints++;}
}
}
int main()
{
printf("out");
size_t i;
pthread_t thread[N];
srand(time(NULL));
for( i=0;i <4;++i){
printf("in creating thread");
pthread_create( &thread[i], NULL, &point_counter, NULL);
}
for(i=0;i <4;++i){
printf("in joining thread");
pthread_join( thread[i], NULL );
}
for( i=0;i <4;++i){
printf("in last thread");
float pi = 4.0 * (float)circlePoints /MAX_LEN;
printf("pi is %2.4f: \n", pi);
}
return 0;
}

You're hitting an infinite loop here:
while(MAX_LEN){
Since MAX_LEN is and remains non-zero.
As to why you see no output before that, see Why does printf not flush after the call unless a newline is in the format string?

You have an infinite loop in your thread function:
while(MAX_LEN){
...
}
So none of the threads you create ever leaves that loop.
Also, circlePoints is modified by all the threads, which leads to a race condition and will likely render the value incorrect. You should use a mutex lock to avoid it.

while (any_nonzero_value_that_is_never_updated)
{
/* infinite loop: not good unless you intend it that way */
}

How to stop a specific thread in C process.h?

I have just learned the bare bones of creating a thread inside a program using process.h in C. Now my problem is how to stop a specific thread.
Here is my code:
#include <stdio.h>
#include <windows.h>
#include <process.h>
void mimicCounter( void * );
int main()
{
int i;
printf( "Now in the main() function.\n" );
_beginthread( mimicCounter, 0, (void*)12 );
for(i = 2; i <= 10; i++){
Sleep(500);
printf("%d\n",i);
}
system("PAUSE");
printf("\n");
}
void mimicCounter( void *arg )
{
int i;
printf( "The mimicCounter() function was passed %d\n", (INT_PTR)arg ) ;
for(i = 1; i <= 10; i++){
Sleep(500);
printf("%d\n",i);
}
}
I just want to stop the thread that I have created (the mimicCounter function) when it reaches i = 5, (yeah I know I set it to 10 but this is for ending a thread demo).
Thank you so much :)

The _endthread and _endthreadex functions terminate a thread created by _beginthread or _beginthreadex, respectively. You can call _endthread or _endthreadex explicitly to terminate a thread; however, _endthread or _endthreadex is called automatically when the thread returns from the routine passed as a parameter to _beginthread or _beginthreadex. Terminating a thread with a call to _endthread or _endthreadex helps to ensure proper recovery of resources allocated for the thread.
From http://msdn.microsoft.com/en-us/library/aa246804(v=vs.60).aspx
In your case, that means mimicCounter can stop itself when i reaches 5, either by returning from the function or by calling _endthread().
