Monte Carlo with threading - c

This is what I am trying to accomplish. Write a multithreaded program in C (or C++/C#) that creates 5 threads. Each thread should generate 1,000 random points and count the number of points that fall within the circle. The main thread should wait for the five threads to terminate one after another. Once a thread has terminated, the main thread updates the value of PI using the total number of points in the circle and the total number of points generated by the terminated thread. For example, the main thread waits for the first thread to terminate. When the first thread has terminated, the main thread incorporates the total number of points in the circle obtained by the first thread to update the value of PI. Next, the main thread waits for the second thread to terminate. When the second thread has terminated, the main thread updates the value of PI using the total number of points in the circle obtained by the second thread, and so on.
I keep getting an error saying that a non-void function doesn't return a value. Is there any way I can get around that, or what are some alternatives?
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
long incircle = 0;
long ppt; /* points per thread*/
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void *runner() {
    long incircle_thread = 0;
    unsigned int rand_state = rand();
    long i;
    for (i = 0; i < ppt; i++) {
        double x = rand_r(&rand_state) / ((double)RAND_MAX + 1) * 2.0 - 1.0;
        double y = rand_r(&rand_state) / ((double)RAND_MAX + 1) * 2.0 - 1.0;
        if (x * x + y * y < 1) {
            incircle_thread++;
        }
    }
    pthread_mutex_lock(&mutex);
    incircle += incircle_thread;
    pthread_mutex_unlock(&mutex);
}
int main(int argc, const char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "usage: ./pi <total points> <threads>\n");
        exit(1);
    }
    long totalpoints = atol(argv[1]);
    int thread_count = atoi(argv[2]);
    ppt = totalpoints / thread_count;
    time_t start = time(NULL);
    pthread_t *threads = malloc(thread_count * sizeof(pthread_t));
    pthread_attr_t attr;
    pthread_attr_init(&attr); /* attr must be initialized before it is used */
    int i;
    for (i = 0; i < thread_count; i++) {
        pthread_create(&threads[i], &attr, runner, (void *) NULL);
    }
    for (i = 0; i < thread_count; i++) {
        pthread_join(threads[i], NULL);
    }
    double points_per_thread = (double)ppt; /* was 0.0, which made the division below meaningless */
    printf("Pi: %f\n", (4. * (double)incircle) / ((double)points_per_thread * thread_count));
    printf("Time: %d sec\n", (unsigned int)(time(NULL) - start));
    free(threads);
    return 0;
}

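The return type of runner is void *, so every path through the function needs to return a value of that type.
In your case nothing is passed back through the return value, so it looks like you would just want to add this as the last statement of runner:
return NULL;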


how to compute sum of n/m Gregory-Leibniz terms in C language

Get two values, m and n, from the command-line arguments and convert them to integers. Then create m threads, where each thread computes the sum of n/m terms of the Gregory-Leibniz series:
pi = 4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - ...)
When a thread finishes its computation, it should print its partial sum and atomically add it to a shared global variable.
How can I check that all m computational threads have done their atomic additions?
Here is the source code I tried:
#include <stdlib.h>
pthread_barrier_t barrier;
int count;
long int term;
// int* int_arr;
double total;
void *thread_function(void *vargp)
int thread_rank = *(int *)vargp;
// printf("waiting for barrier... \n");
// printf("we passed the barrier... \n");
double sum = 0.0;
int n = count * term;
int start = n - term;
// printf("start %d & end %d \n\n", start, n);
for(int i = start; i < n; i++)
sum += pow(-1, i) / (2*i+1);
// v += 1 / i - 1 / (i + 2);
total += sum;
// int_arr[count] = sum;
printf("thr %d : %lf \n", thread_rank, sum);
return NULL;
int main(int argc,char *argv[])
if (argc <= 2) {
printf("missing arguments. please pass two num. in arguments\n");
int m = atoi(argv[1]); // get value of first argument
int n = atoi(argv[2]); // get value of second argument
// int_arr = (int*) calloc(m, sizeof(int));
count = 1;
term = n / m;
pthread_t thread_id[m];
int i, ret;
double pi;
/* Initialize the barrier. */
pthread_barrier_init(&barrier, NULL, m);
for(i = 0; i < m; i++)
ret = pthread_create(&thread_id[i], NULL , &thread_function, (void *)&i);
if (ret) {
printf("unable to create thread! \n");
for(i = 0; i < m; i++)
if(pthread_join(thread_id[i], NULL) != 0) {
perror("Failed to join thread");
pi = 4 * total;
printf("%lf ", pi);
return 0;
What I need:
Create m threads, where each thread computes the sum of n/m terms of the Gregory-Leibniz series.
The first thread computes the sum of terms 1 to n/m, the second thread computes the sum of terms (n/m + 1) to 2n/m, and so on.
When all the threads have finished their computation, print each partial sum and the value of pi.
I have tried a lot, but I can't get exactly what I want: the output value of pi is wrong.
For example, with m = 16 and n = 1024,
it sometimes returns 3.125969, and sometimes 12.503874, 15.629843, or 6.251937 as the value of pi.
Please help me.
Edited source code:
#include <inttypes.h>
#include <math.h>
#include <pthread.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
struct args {
uint64_t thread_id;
struct {
uint64_t start;
uint64_t end;
} range;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_barrier_t barrier;
long double total = 0;
uint64_t total_iterations = 0;
void *partial_sum(void *arg)
struct args *args = arg;
long double sum = 0;
printf("waiting for barrier in thread -> %" PRId64 "\n", args->thread_id);
// printf("we passed the barrier... \n");
for (uint64_t n = args->range.start; n < args->range.end; n++)
sum += pow(-1.0, n) / (1 + n * 2);
if (pthread_mutex_lock(&mutex)) {
total += sum;
total_iterations += args->range.end - args->range.start;
if (pthread_mutex_unlock(&mutex)) {
printf("thr %" PRId64 " : %.20Lf\n", args->thread_id, sum);
return NULL;
int main(int argc,char *argv[])
if (argc <= 2) {
fprintf(stderr, "usage: %s THREADS TERMS.\tPlease pass two num. in arguments\n", *argv);
int m = atoi(argv[1]); // get value of first argument & converted into int
int n = atoi(argv[2]); // get value of second argument & converted into int
if (!m || !n) {
fprintf(stderr, "Argument is zero.\n");
uint64_t threads = m;
uint64_t terms = n;
uint64_t range = terms / threads;
uint64_t excess = terms - range * threads;
pthread_t thread_id[threads];
struct args arguments[threads];
int ret;
/* Initialize the barrier. */
ret = pthread_barrier_init(&barrier, NULL, m);
if (ret) {
for (uint64_t i = 0; i < threads; i++) {
arguments[i].thread_id = i;
arguments[i].range.start = i * range;
arguments[i].range.end = arguments[i].range.start + range;
if (threads - 1 == i)
arguments[i].range.end += excess;
printf("In main: creating thread %ld\n", i);
ret = pthread_create(thread_id + i, NULL, partial_sum, arguments + i);
if (ret) {
for (uint64_t i = 0; i < threads; i++)
if (pthread_join(thread_id[i], NULL))
printf("Pi value is : %.10Lf\n", 4 * total);
printf("COMPLETE? (%s)\n", total_iterations == terms ? "YES" : "NO");
return 0;
In each thread, the count variable is expected to be of a steadily increasing value in this expression
int n = count * term;
being one larger than it was in the "previous" thread, but count is only increased later on in each thread.
Even if you were to "immediately" increase count, there is nothing that guards against two or more threads attempting to read from and write to the variable at the same time.
The same issue exists for total.
The unpredictability of these reads and writes will lead to indeterminate results.
When sharing resources between threads, you must take care to avoid these race conditions. The POSIX threads library does not contain any atomics for fundamental integral operations.
You should protect your critical data against a read/write race condition by using a lock to restrict access to a single thread at a time.
The POSIX threads library includes a pthread_mutex_t type for this purpose. See:
pthread_mutex_init / pthread_mutex_destroy
pthread_mutex_lock / pthread_mutex_unlock
Additionally, as pointed out by @Craig Estey, using (void *) &i as the argument to the thread functions introduces a race condition where the value of i may change before any given thread executes *(int *) vargp;.
The suggestion is to pass the value of i directly, storing it in the pointer argument itself. For this you should convert through intptr_t or uintptr_t, which are defined for exactly this purpose:
pthread_create(&thread_id[i], NULL, thread_function, (void *) (intptr_t) i);
int thread_rank = (int) (intptr_t) vargp;
How to check that all of the m computational threads have done the atomic additions?
Sum up the number of terms processed by each thread, and ensure it is equal to the expected number of terms. This can also naturally be assumed to be the case if all possible errors are accounted for (ensuring all threads run to completion and assuming the algorithm used is correct).
A moderately complete example program:
#define _POSIX_C_SOURCE 200809L
#include <inttypes.h>
#include <math.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
struct args {
    uint64_t thread_id;
    struct {
        uint64_t start;
        uint64_t end;
    } range;
};
pthread_mutex_t mutex;
long double total = 0;
uint64_t total_iterations = 0;
/* Sums the terms in [start, end) locally, then adds the result to the shared total. */
void *partial_sum(void *arg)
{
    struct args *args = arg;
    long double sum = 0;
    for (uint64_t n = args->range.start; n < args->range.end; n++)
        sum += pow(-1.0, n) / (1 + n * 2);
    if (pthread_mutex_lock(&mutex)) {
        fprintf(stderr, "Failed to lock mutex.\n");
        exit(EXIT_FAILURE);
    }
    total += sum;
    total_iterations += args->range.end - args->range.start;
    if (pthread_mutex_unlock(&mutex)) {
        fprintf(stderr, "Failed to unlock mutex.\n");
        exit(EXIT_FAILURE);
    }
    printf("thread(%" PRIu64 ") Partial sum: %.20Lf\n", args->thread_id, sum);
    return NULL;
}
int main(int argc, char **argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s THREADS TERMS\n", *argv);
        return EXIT_FAILURE;
    }
    uint64_t threads = strtoull(argv[1], NULL, 10);
    uint64_t terms = strtoull(argv[2], NULL, 10);
    if (!threads || !terms) {
        fprintf(stderr, "Argument is zero.\n");
        return EXIT_FAILURE;
    }
    uint64_t range = terms / threads;
    uint64_t excess = terms - range * threads; /* leftover terms go to the last thread */
    pthread_t thread_id[threads];
    struct args arguments[threads];
    if (pthread_mutex_init(&mutex, NULL)) {
        fprintf(stderr, "Failed to initialize mutex.\n");
        return EXIT_FAILURE;
    }
    for (uint64_t i = 0; i < threads; i++) {
        arguments[i].thread_id = i;
        arguments[i].range.start = i * range;
        arguments[i].range.end = arguments[i].range.start + range;
        if (threads - 1 == i)
            arguments[i].range.end += excess;
        int ret = pthread_create(thread_id + i, NULL, partial_sum, arguments + i);
        if (ret) {
            fprintf(stderr, "Failed to create thread.\n");
            return EXIT_FAILURE;
        }
    }
    for (uint64_t i = 0; i < threads; i++)
        if (pthread_join(thread_id[i], NULL))
            fprintf(stderr, "Failed to join thread.\n");
    pthread_mutex_destroy(&mutex);
    printf("%.10Lf\n", 4 * total);
    printf("COMPLETE? (%s)\n", total_iterations == terms ? "YES" : "NO");
}
Using 16 threads to process 10 billion terms:
$ ./a.out 16 10000000000
thread(14) Partial sum: 0.00000000000190476190
thread(10) Partial sum: 0.00000000000363636364
thread(2) Partial sum: 0.00000000006666666667
thread(1) Partial sum: 0.00000000020000000000
thread(8) Partial sum: 0.00000000000555555556
thread(15) Partial sum: 0.00000000000166666667
thread(0) Partial sum: 0.78539816299744868408
thread(3) Partial sum: 0.00000000003333333333
thread(13) Partial sum: 0.00000000000219780220
thread(11) Partial sum: 0.00000000000303030303
thread(4) Partial sum: 0.00000000002000000000
thread(5) Partial sum: 0.00000000001333333333
thread(7) Partial sum: 0.00000000000714285714
thread(6) Partial sum: 0.00000000000952380952
thread(12) Partial sum: 0.00000000000256410256
thread(9) Partial sum: 0.00000000000444444444

c - progress for each thread

I've got a program that takes n numbers and, for each number N, computes the sum of the integers from 0 to N. A new thread is created for each number given:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
struct sum_runner_struct {
    long long limit;
    long long answer;
};
// Thread function to generate sum of 0 to N
void* sum_runner(void* arg)
{
    struct sum_runner_struct *arg_struct = (struct sum_runner_struct*) arg;
    long long sum = 0;
    for (long long i = 0; i <= arg_struct->limit; i++) {
        sum += i;
    }
    arg_struct->answer = sum;
    return NULL;
}
int main(int argc, char **argv)
{
    if (argc < 2) {
        printf("Usage: %s <num 1> <num 2> ... <num-n>\n", argv[0]);
        exit(-1);
    }
    int num_args = argc - 1;
    struct sum_runner_struct args[num_args];
    // Launch threads
    pthread_t tids[num_args];
    for (int i = 0; i < num_args; i++) {
        args[i].limit = atoll(argv[i + 1]);
        pthread_attr_t attr;
        pthread_attr_init(&attr); // attr must be initialized before use
        pthread_create(&tids[i], &attr, sum_runner, &args[i]);
    }
    // Wait until each thread has done its work
    for (int i = 0; i < num_args; i++) {
        pthread_join(tids[i], NULL);
        printf("Sum for thread %d is %lld\n", i, args[i].answer);
    }
    return 0;
}
I want to display the progress of each thread (maybe as a percentage?), and from there I can calculate the overall progress from the per-thread values. I don't know how to implement the progress for each thread, though. How could I go about doing this?
One way to do it would be to add a global mutex, plus a new member variable such as long long current_index in your sum_runner_struct.
Every so often (e.g. once every 1 million iterations of the for-loop), each thread would lock the mutex, set arg_struct->current_index = i;, and then unlock the mutex.
The main thread could then occasionally lock the mutex, iterate over the array of sum_runner_structs to print each thread's current_index value (or current_index as a percentage of limit), tally up the values for the global-progress calculation, and then unlock the mutex.

How to use pthreads in order to perform simultaneous operations on an array with constraint?

I'm using pthreads in C to perform two operations on an int array: one operation doubles the value of a cell, the other halves the value of a cell. If doubling a cell would make its value greater than the maximum allowed value, the thread needs to wait until another thread halves the value of that cell. I initialized the array so that the first 5 cells have a value very close to the maximum allowed and the other five have a value far from the maximum.
I decided to use a global mutex and condition variable for this. In main I first spawn 10 doubler threads and then another 10 halver threads. But then my program freezes. I can't understand what the problem is; any help is appreciated.
My motivation is to better understand pthreads and condition variables.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#include <ntsid.h>
#include <pthread.h>
#include <unistd.h>
#define MAX 20
#define THREADS_NUM 10
#define OFFSET 10
typedef struct myStruct {
int cellId;
} myStruct;
int * cells;
pthread_mutex_t globalMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t globalCond = PTHREAD_COND_INITIALIZER;
pthread_t threads[THREADS_NUM * 2];
void * DoublerThread(void * arg) {
myStruct * myStr = (myStruct *) arg;
int id = myStr->cellId;
pthread_mutex_t mutex = globalMutex;
pthread_cond_t condition = globalCond;
while((cells[id] * 2) > MAX) {
printf("Waiting... id = %d\n", id);
pthread_cond_wait(&condition, &mutex);
cells[id] *= 2;
printf("new val = %d, id = %d\n", cells[id], id);
void * HalverThread(void * arg) {
myStruct * myStr = (myStruct *) arg;
int id = myStr->cellId;
pthread_mutex_t mutex = globalMutex;
pthread_cond_t condition = globalCond;
cells[id] /= 2;
void initMyStructs(myStruct ** myStructs) {
int i;
for(i = 0; i < THREADS_NUM * 2; i++) {
myStructs[i] = (myStruct *) malloc(sizeof(myStruct) * 2);
if(!myStructs[i]) {
printf("malloc error\n");
myStructs[i]->cellId = i % THREADS_NUM;
void initCells() {
int i, tmp;
cells =(int *) malloc(sizeof(int));
if(!cells) {
printf("malloc error\n");
for(i = 0; i <= THREADS_NUM; i++) {
if(i < THREADS_NUM / 2) {
cells[i] = MAX - 1;
} else {
tmp = cells[i] = 1;
int main() {
int i;
myStruct ** myStructs;
//create 10 Doubler threads
for(i = 0; i < THREADS_NUM; i++) {
pthread_create(&threads[i], NULL, DoublerThread, (void *) myStructs[i]);
//create 10 Halver threads
for(i = 0; i < THREADS_NUM; i++) {
pthread_create(&threads[i + OFFSET], NULL, HalverThread, (void *) myStructs[i + OFFSET]);
for(i = 0; i < THREADS_NUM + OFFSET; i++) {
pthread_join(threads[i], NULL);
return 0;
You have made “private” mutexes and condition variables for each thread, so they are not synchronizing in any (meaningful) way. Rather than this:
pthread_mutex_t mutex = globalMutex;
pthread_cond_t condition = globalCond;
Just use globalMutex and globalCond directly (that is, pass &globalMutex and &globalCond to the pthread calls) -- that is what you actually want.
I moved this in here, because I think we are supposed to. I can't intuit SO-iquette.
By the way, just to make sure I understand this, the mutex is per
cell, so that multiple threads can work on multiple cells
simultaneously, right? Just not two threads on the same cell.
So, what you probably want is something more like:
typedef struct myStruct {
int cellId;
pthread_mutex_t lock;
pthread_cond_t wait;
} myStruct;
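and in initMyStructs():
    myStructs[i]->cellId = i % THREADS_NUM;
    pthread_mutex_init(&myStructs[i]->lock, NULL);
    pthread_cond_init(&myStructs[i]->wait, NULL);
and in the Halvers:
    pthread_mutex_lock(&myStr->lock);
    cells[id] /= 2;
    pthread_cond_broadcast(&myStr->wait);
    pthread_mutex_unlock(&myStr->lock);
and in the Doublers:
    pthread_mutex_lock(&myStr->lock);
    while((cells[id] * 2) > MAX) {
        printf("Waiting... id = %d\n", id);
        pthread_cond_wait(&myStr->wait, &myStr->lock);
    }
    cells[id] *= 2;
    printf("new val = %d, id = %d\n", cells[id], id);
    pthread_mutex_unlock(&myStr->lock);
Note that for this to synchronize anything, the doubler and the halver working on a given cell must be handed the same myStruct, i.e. one struct per cell rather than one per thread.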
So currently, only one thread can make changes to the array at a time?
But then the program exits after about a second. If threads couldn't
be making changes to the array simultaneously, wouldn't the
program take 10 seconds to finish, because each HalverThread sleeps
for 1 second?
The Halvers sleep before grabbing the mutex, thus all sleep near simultaneously, wake up, fight for mutex and continue.

pthread same ID and output self_t

I hope I can put my question clearly. I am programming with pthreads. Briefly, I calculate the number of threads needed and pass each created thread to a function; the function transposes different blocks of a matrix, so each thread has its own block.
To check that I am using different threads, I print the value of pthread_self() (stored in pthread_t self_t), but I face two problems:
it seems that only one and the same thread is used, and I always get a warning about the output type of self_t. The code below is simplified to show the main points.
Any ideas where I went wrong?
First, here are the struct and main:
pthread_mutex_t mutexZ; // Mutex initialize
int array[nn][nn];
struct v
int i, j; // threaded Row,Col
int n, y; //
int iMAX; //
void *transposeM(void *arg);
int main(int argc, char *argv[])
int Thread_Num = 10;
pthread_t t_ID[Thread_Num]; // the number of threads depending on # blocks
printf("Thread_Num %d\n", Thread_Num);
struct v *data = (struct v *) malloc(sizeof(struct v));
int i, j; //loop varables
printf("Matrix Initial before Transpose Done\n");
// printing the Matrix Before any transpose if needed testing
for (i = 0; i < nn; i++){
for(j = 0; j< nn; j++){
array[i][j] = i*nn + j;
printf("%d ", array[i][j]);
// Initialize the mutex
pthread_mutex_init(&mutexZ, NULL);
pthread_attr_t attr; //Set of thread attributes
int n, y; // Loop Variables for tiling
//Start of loop transpose:
int start = 0;
for (n = 0; n < nn; n += TILE)
data->n = n; // row
for (y = 0; y <= n; y += TILE) {
data->y = y; // column
printf("y Tile:%d \n", y);
printf("Start before:%d \n", start);
//Transpose the other blocks, thread created for each Block transposed
pthread_create(&(t_ID[start]), NULL, transposeM, (void*) data); // Send the thread to the function
pthread_join(t_ID[start], NULL);
if (start < Thread_Num)
start = start + 1;
printf("Start after:%d \n", start);
} // End the Y column TileJump loop
} // End of n Row TileJump loop
Modified according to the notes,
void *transposeM(void *arg)
// Transposing the tiles
struct v *data = arg;
int i, j; //loop row and column
int temp = 0;
pthread_mutex_lock(&mutexZ); //lock the running thread here,so keeps block until thread that holds mutex releases it
pthread_t self_t; // To check the thread id - my own check, not mandatory
self_t = pthread_self();
printf("Thread number Main = %u \n ", self_t); // %u is used here because pthread_t seems to be an unsigned long data type
//here some function to work
return (NULL);
} // End
There are two conceptual issues with your code:
1. You pass the same reference/address to each thread, making each thread work on the same data.
2. You join each thread immediately after having created it. As joining blocks until the joined thread has ended, this serializes the running of all threads.
To get around 1., create a unique instance of what data points to for each thread.
To fix 2., move the call to pthread_join() out of the loop creating the threads and put it in a second loop that runs after the creation loop.
printf("Thread_Num %d\n", Thread_Num);
pthread_t t_ID[Thread_Num]; // the number of threads depending on # blocks
struct v data_ID[Thread_Num]; // one instance of data for each thread
memset(data_ID, 0, sizeof data_ID); // a variable-length array cannot take an initializer (needs <string.h>)
for (n = 0; n < nn; n += TILE) { // limit of row
    for (y = 0; y <= n && start < Thread_Num; y += TILE) { // limit of column; with y < n the diagonal tile would be skipped
        struct v * data = data_ID + start; // assign thread-specific instance
        data->n = n; // row
        data->y = y; // column
        pthread_create(&(t_ID[start]), NULL, transposeM, (void*) data); // Send the thread to the function
        start = start + 1;
    } // End the Y column TileJump loop
} // End of n Row TileJump loop
while (start-- > 0) // join only the threads that were actually created
    pthread_join(t_ID[start], NULL);
Modifications to the thread function:
void *transposeM(void *arg)
{
    struct v *data = arg;
    pthread_t self = pthread_self(); // better naming
    pthread_exit(NULL); // the thread function exits here.
    return NULL; // this is never reached, but is necessary to calm down the compiler.
} // End

Calling a void* function from main in C

Currently I am working on a program that uses threads to calculate the sum of square roots. My program works; however, one of the requirements is to use the main thread to compute the initial part of the sum, and as soon as I call the function void *calc from main, the program breaks. Is there a certain way to make such a function call? Is this because the function returns a pointer? Any help is appreciated.
#include <pthread.h>
#include <stdio.h>
#include <math.h>
#include <unistd.h>
#define NUM_THREADS 3
int ARGV;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
double total = 0;
void *calc(void* t){
int ph = (int)t + 1;
int start, stop, interval_size;
interval_size = ARGV/(NUM_THREADS + 1);
start = ((ph) * interval_size) + 1;
stop = (ph * interval_size) + 1;
double ttl;
int i;
for (i = start; i <= stop; i++){
ttl = ttl + sqrt(i);
printf("Total Thread %i %lf\n", ph, ttl);
total = total + ttl;
int main(int argc, char* argv[]) {
int i;
double ttl;
ARGV = atoi(argv[1]);
pthread_t ti[NUM_THREADS];
for (i = 0; i < NUM_THREADS; i++) {
pthread_create(&ti[i], NULL, calc,(void *)i);
/*for (i = 1; i <= (ARGV / 4) ; i++){
ttl = ttl + sqrt(i);
}*/
for (i = 0; i < NUM_THREADS; i++) {
pthread_join(ti[i], NULL);
total = total + ttl;
printf("Result: %lf\n", total);
The program breaks as in the function seems to only be called once, instead of each thread using the function. The only value printed out is some vague incorrect number.
Your calc function calls pthread_exit. Now, pthread_exit can and should be called from the main thread, so that in itself is fine. The man page says:
To allow other threads to continue execution, the main thread
should terminate by calling pthread_exit() rather than exit(3).
But since this call happens in the main thread before any other thread has been created, the main thread terminates at that point and the program just exits straight away, without ever starting the other threads.
