Multi threading average cost per operation - c

I am learning to write multithreaded programs in C, and I just noticed that as I increase the number of iterations per thread, the cost per operation goes down.
For example, say I have 2 threads, and each one adds a number to a global variable and then subtracts the same number. If each thread does this 1000 times, the cost per operation is much higher than if each thread does this 1000000 times. Why is this?
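#include <pthread.h>
#include <sched.h>
static int num_iterations = 1;
int opt_yield = 0;
void add(long long *pointer, long long value) {
    long long sum = *pointer + value;
    if (opt_yield)
        sched_yield();   /* give up the CPU between the load and the store */
    *pointer = sum;
}
struct arg_struct {
    long long counter;
    long long value;
};
void *aux_add(void *arguments)
{
    struct arg_struct *args = arguments;
    int i = 0;
    for (i = 0; i < num_iterations; i++)
    {
        args->value = 1;
        add(&args->counter, args->value);    /* add the number ... */
        args->value = -1;
        add(&args->counter, args->value);    /* ... then subtract it again */
    }
    return NULL;
}
int main(int argc, char * argv[])
{
    int num_threads = 2;
    pthread_t threads[num_threads];
    struct arg_struct args;
    args.counter = 0;
    int count = 0;
    for (count = 0; count < num_threads; count++)
    {
        if (pthread_create(&threads[count], NULL, &aux_add, (void *) &args) != 0)
            return 1;
    }
    for (count = 0; count < num_threads; count++)
        pthread_join(threads[count], NULL);
    return 0;
}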

Simply because creating and destroying threads is not free: it incurs overhead in the OS. That overhead is roughly constant, so the more time your threads spend doing actual work, the smaller its share of the overall run time, and the lower the measured cost per operation.


Can I use pthreads over the same function on C?

I have a question that might be silly. As an example, I have a function that calculates some mathematical formulas.
# include <stdio.h>
# include <time.h>
# include <stdlib.h>
# include <pthread.h>
# include <unistd.h>
# include <math.h>
pthread_mutex_t a_mutex = PTHREAD_MUTEX_INITIALIZER;
volatile long int a = 0;
void threadOne(void *arg)
{
    int i;
    long int localA = 0;
    for (i = 1; i < 50000000; i++)
    {
        localA = localA + i*a*sqrt(a);
    }
    a = a + localA;
}
void threadTwo(void *arg)
{
    int i;
    long int localA = 0;
    for (i = 50000000; i <= 100000000; i++)
    {
        localA = localA + i*a*sqrt(a);
    }
    a = a + localA;
}
int main (int argc, char **argv)
{
    pthread_t one, two;
    int i;
    pthread_create(&one, NULL, (void*)&threadOne, NULL);
    pthread_create(&two, NULL, (void*)&threadTwo, NULL);
    pthread_join(one, NULL);
    pthread_join(two, NULL);
}
This is an example I found. It has two functions, each run on its own thread. But can I have just one function and run it on two threads, so that the same function executes twice with different data? My idea is to have a single function that accepts two different data sets and works on the first or the second depending on which thread is running it.
Is this even possible? I want to avoid copying the function twice, as is done here.
Let's say I keep only the function
void threadOne(void *arg)
and run it twice at the same time, on different threads, with different data. Can this be achieved, or am I just being silly?
Yes, this can be done by making use of the argument to the thread function.
Each thread needs to loop over a range of values. So create a struct definition to contain the min and max values:
struct args {
    int min;
    int max;
};
Define a single thread function which converts the void * argument to a pointer to this type and reads it:
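void *thread_func(void *arg)
{
    struct args *myargs = arg;
    int i;
    long int localA = 0;
    for (i = myargs->min; i < myargs->max; i++)
    {
        localA = localA + i*a*sqrt(a);
    }
    pthread_mutex_lock(&a_mutex);     /* a is shared: update it under the lock */
    a = a + localA;
    pthread_mutex_unlock(&a_mutex);
    return NULL;
}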
(Note that the function needs to return a void * to conform to the interface pthread_create expects.)
Then in your main function create an instance of this struct for each set of arguments, and pass that to pthread_create:
int main (int argc, char **argv)
{
    pthread_t one, two;
    struct args args1 = { 1, 50000000 };           /* covers [1, 50000000) */
    struct args args2 = { 50000000, 100000001 };   /* max is exclusive, so 100000000 is included */
    pthread_create(&one, NULL, thread_func, &args1);
    pthread_create(&two, NULL, thread_func, &args2);
    pthread_join(one, NULL);
    pthread_join(two, NULL);
    return 0;
}

Bounded buffer sharing with Pthread and mutex locks busy waiting

I am trying to create a producer-consumer queue that uses mutex locks, with busy waiting between the threads. My main file takes X integer arguments and pushes them onto a BOUNDED BUFFER of size 50. I use a while loop to do this, since the number of arguments is not known beforehand. I am not sure when and where to create my producer thread.
NOTE: main is a "producer" in the sense that it fills the buffer, but my actual producer function is going to pass data on to my consumer function later in my code, so disregard the names. main "produces" numbers by pushing them, and producer pops those numbers for later use. My question is: where and when do I call pthread_create for producer, and am I using the mutex locks correctly to synchronize the two threads?
#include <pthread.h>
#include <stdlib.h>
#include <math.h>
#include <stdio.h>
#define BUFFER_SIZE (50)
typedef struct {
int buffer[BUFFER_SIZE];
int count;
int top;
int next;
pthread_mutex_t count_lock;
} prodcons;
void pc_init(prodcons *pc);
int pc_pop(prodcons *pc);
void pc_push(prodcons *pc, int val);
void factor2pc(prodcons *pc, int number);
void *producer(void *data);
void *consumer(void *data);
int main(int argc, char *argv[])
{
    int index = 1;
    int num;
    prodcons pc_nums;
    //pthread_t tid[argc - 1];
    pthread_t tid;
    pthread_attr_t attr;
    if (argc < 2) {
        fprintf(stderr, "usage: No arguments\n");
        return -1;
    }
    if (atoi(argv[1]) <= 0)
    {
        fprintf(stderr, "%d not > 0 or you must provide a positive integer.\n", atoi(argv[1]));
        return -1;
    }
    pthread_create(&tid, &attr, *producer, &pc_nums);
    while (index < argc)
    {
        num = atoi(argv[index]);
        pc_push(&pc_nums, num);
    }
}
void *producer(void *data)
{
    prodcons *dataStruct = data;
    while (dataStruct->count < BUFFER_SIZE)
    {
        number = pc_pop(data);
        //This print is just here to make sure I am correctly "poping" from buffer
        printf("%d\n", number);
    }
}
void pc_init(prodcons *pc)
{
    pc->count = 0;
    pc->top = 0;
    pc->next = 0;
    if (pthread_mutex_init(&pc->count_lock, NULL) != 0)
    {
        printf("\n mutex init has failed\n");
    }
}
int pc_pop(prodcons *pc)
{
    int val;
    if (pc->count > pc->top)
    {
        val = pc->buffer[pc->count];
        printf("%d\n", val);
        pc->buffer[pc->count] = 0;
    }
    return val;
}
void pc_push(prodcons *pc, int val)
{
    if (pc->count < BUFFER_SIZE)
    {
        pc->buffer[pc->count] = val;
    }
}
My question is where and when do I make my Pthread_create in my code for producer and am I using the Mutex locks correctly to have synchronization between the two threads?
As long as everything is properly initialized and synchronized, you can put the pthread_create() call wherever you want, including where it is placed in the given program. But at least two things are wrong:
pc_pop() has undefined behavior if there is no number in the buffer to pop, because it then returns an uninitialized value.
Since dataStruct->count is accessed by producer() without locking, the declaration should be _Atomic(int) count; (or producer() should hold count_lock while reading it).

Multiple threads to find prime factors of integers, segmentation fault

I can't figure out what I am doing wrong with my pointers. It is causing a segmentation fault. I am convinced the problem is rooted in my use of the array of pointers I have and the pthread_join I am using.
The goal is to read multiple integers from the command line of a program compiled with gcc, then print each integer followed by all of its prime factors, like this: 12: 2 2 3
I created a struct containing an int array to store the factors of each integer as the factor function pulls it apart and a counter(numfact) to store how many factors there are stored in the array.
I commented out the section at the bottom that prints out the factors.
I think the problem is how I am trying to store the output from the pthread_join in the pointer array, ptr[]. Whenever I comment it out, it does not get the segmentation error.
Either I have my pointers screwed up in a way I don't understand or I can't use an array of pointers. Either way, after many hours, I am stuck.
Please help.
#include <stdio.h>
#include <pthread.h>
#include <math.h>
#include <stdlib.h>
struct intfact
{
    long int factors[100];
    int numfact;
};
struct intfact *factor(long int y)
{
    struct intfact threadfact;
    threadfact.numfact = 0;
    // Store in struct the number of 2s that divide y
    while (y % 2 == 0)
    {
        threadfact.factors[threadfact.numfact] = 2;
        threadfact.numfact++;
        y = y/2;
    }
    // Store in struct the odds that divide y
    for (int i = 3; i <= floor(sqrt(y)); i = i+2)
    {
        while (y % i == 0)
        {
            threadfact.factors[threadfact.numfact] = i;
            threadfact.numfact++;
            y = y/i;
        }
    }
    // Store in struct the primes > 2
    if (y > 2)
    {
        threadfact.factors[threadfact.numfact] = y;
        threadfact.numfact++;
    }
    struct intfact *rtnthred = &threadfact;
    return rtnthred;
}
/* Trial Division Function */
void *divde(void *n)
{
    long int *num = (long int *) n;
    struct intfact *temp = factor(*num);
    return temp;
}
/* Main Function */
int main(int argc, char *argv[])
{
    pthread_t threads[argc-1];
    void *ptr[argc-1];
    /* loop to create all threads */
    for(int i=0; i < argc; i++)
    {
        long temp = atol(argv[i+1]);
        pthread_create(&threads[i], NULL, divde, (void *) temp);
    }
    /* loop to join all threads */
    for(int i=0; i < argc; i++)
    {
        pthread_join(threads[i],(void *) ptr[i]); //THIS POINTER IS THE PROBLEM
    }
    /* loops to print results of each thread using pointer array*/
    //for(int i = 0; i < argc; i++)
    // printf("%s: ", argv[i+1]); /* print out initial integer */
    // struct intfact *temp = (struct intfact *) ptr[i]; //cast void pointer ptr as struct intfact pointer
    // printf("%d", temp->numfact);
    //for(int j = 0; j < temp->numfact; j++) /*(pull the numfact(count of factors) from the struct intfact pointer??)*/
    // printf("%d ", temp->factors[j]); /* print out each factor from thread struct */
}
In my Linux terminal this code is stored in p3.c
"./p3 12" should yield "12: 2 2 3"
For starters, here:
long temp = atol(argv[i+1]);
pthread_create(&threads[i], NULL, divde, (void *) temp);
you define a long int and pass its value (for example 12) as the argument to the thread.
Inside the thread function
void *divde(void *n)
{
    long int *num = (long int *) n;
you then treat the long int passed in as a pointer to long int, and dereference it here:
    ... = factor(*num);
So *num would, for example, become *12. That reads whatever is stored at memory address 12 and passes it to factor(). Aside from the fact that this is most likely an invalid address, nothing relevant is stored there, at least nothing your code defined.
To (more or less) fix this, do
void *divde(void *n)
{
    long int num = (long int) n;
    ... = factor(num);
The second issue is mentioned in the comment: Multiple threads to find prime factors of integers, segmentation fault. In the posted code, factor() also returns the address of its local variable threadfact, which is no longer valid once the function has returned.
The problem you are trying to solve is a special case of parallel programming, namely one in which the tasks to be run in parallel are completely independent. In such cases it makes sense to give each task its own context. Here such a context would include
the thread-specific input
as well as its specific output.
In C, variables can be grouped using structures, as your implementation already does for the output of the tasks:
struct intfact
{
    long int factors[100];
    int numfact;
};
So what is missing is the thread ID and the input. Just add those, for example like this:
/* group input and output: */
struct inout
{
    long int input;
    struct intfact output;
};
/* group input/output with thread-id */
struct context
{
    pthread_t thread_id;
    struct inout io;
};
Now, before kicking off the threads, define as many contexts as needed:
int main(int argc, char *argv[])
{
    size_t num_to_process = argc - 1;
    struct context ctx[num_to_process];
Then create the threads, passing in what is needed, that is, the input along with space/memory for the output:
    for (size_t i = 0; i < num_to_process; i++)
    {
        ctx[i].io.input = atol(argv[i+1]); /* argv[0] is the program name */
        pthread_create(&ctx[i].thread_id, NULL, divide, &ctx[i].io);
    }
Inside the thread function, convert the void pointer received back to its real type:
void *divide(void * pv)
{
    struct inout * pio = pv; /* No cast needed in C. */
Define the processing function to take a pointer to the context-specific input/output variables:
void factor(struct inout * pio) /* No need to return anything */
{
    /* Initialise the output: */
    pio->output.numfact = 0;
    /* set local copy of input: */
    long int y = pio->input; /* One could also just use pio->input directly. */
Replace all other occurrences of threadfact by pio->output.
Use
    return;
to leave the processing function.
Then inside the thread function call the processing function:
    factor(pio);
Use
    return NULL;
to leave the thread function.
In main() join without expecting any result from the threads:
/* loop to join all threads */
for (size_t i = 0; i < num_to_process; i++)
pthread_join(ctx[i].thread_id, NULL);
Putting this all together:
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <math.h>
struct intfact
{
    long int factors[100];
    size_t numfact;
};
/* group input and output: */
struct inout
{
    long int input;
    struct intfact output;
};
/* group input/output with thread-id */
struct context
{
    pthread_t thread_id;
    struct inout io;
};
void factor(struct inout * pio)
{
    /* Initialise the output: */
    pio->output.numfact = 0;
    /* set local copy of input: */
    long int y = pio->input; /* One could also just use pio->input directly. */
    if (0 == y)
    {
        return; /* Nothing to do! */
    }
    // Store in struct the number of 2s that divide y
    while (y % 2 == 0)
    {
        pio->output.factors[pio->output.numfact] = 2;
        pio->output.numfact++;
        y = y/2;
    }
    // Store in struct the odds that divide y
    for (int i = 3; i <= floor(sqrt(y)); i = i+2)
    {
        while (y % i == 0)
        {
            pio->output.factors[pio->output.numfact] = i;
            pio->output.numfact++;
            y = y/i;
        }
    }
    // Store in struct the primes > 2
    if (y > 2)
    {
        pio->output.factors[pio->output.numfact] = y;
        pio->output.numfact++;
    }
}
void *divide(void * pv)
{
    struct inout * pio = pv; /* No cast needed in C. */
    factor(pio);
    return NULL;
}
int main(int argc, char *argv[])
{
    size_t num_to_process = argc - 1;
    struct context ctx[num_to_process];
    for (size_t i = 0; i < num_to_process; i++)
    {
        ctx[i].io.input = atol(argv[i+1]);
        if (!ctx[i].io.input)
        {
            fprintf(stderr, "Conversion to integer failed or 0 for '%s'\n", argv[i+1]);
        }
        pthread_create(&ctx[i].thread_id, NULL, divide, &ctx[i].io);
    }
    /* loop to join all threads */
    for (size_t i=0; i < num_to_process; i++)
    {
        pthread_join(ctx[i].thread_id, NULL);
    }
    /* loops to print results of each thread using the context array */
    for(size_t i = 0; i < num_to_process; i++)
    {
        printf("%ld: ", ctx[i].io.input); /* print out initial integer */
        printf("%zu factors --> ", ctx[i].io.output.numfact);
        for(size_t j = 0; j < ctx[i].io.output.numfact; j++)
        {
            printf("%ld ", ctx[i].io.output.factors[j]); /* print out each factor from thread struct */
        }
        putc('\n', stdout);
    }
}

Multithreading in C, Fibonacci Program

I have recently started studying operating systems and creating processes/threads on Linux using the C programming language (that is what we are expected to use), but I have some problems with the code I have been trying to write.
Here is my code, written on an Ubuntu system:
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
int total = 0;
typedef struct
{
    int start;
    int end;
    int threadNo;
} THREAD_PARAMETERS;
void *work(void *parameters);
int threadCount;
int main(int argc, char* argv[])
printf("please give the number of terms you want to diplay..");
scanf("%d", &threadCount);
pthread_t tid[threadCount];
pthread_attr_t attr[threadCount];
int n;
lpParameter = malloc(sizeof(THREAD_PARAMETERS)* threadCount);
int i=0;
for(i=0; i<threadCount; i++)
lpParameter[i].start = 0;
lpParameter[i].end = 1;
lpParameter[i].threadNo = i + 1;
for(i=0; i<threadCount; i++)
return 1;
void fibonacci(int a)
int prev_term = 0, current_term = 1, next_term = 0;
else if(a==1){
void *work(void * parameters)
The problem is that the program loops threadCount times as expected, but all it prints is threadCount zeros.
The main question is: how can I make each thread write only one term of the Fibonacci series, where the number of terms (which is also the number of threads) is entered by the user? Is there a more logical way to implement this kind of program?
You are using lpParameter[i] as the argument to each thread's work, but then ignore its contents when calling fibonacci.

Multithreading in c using a thread safe random numbers

I have been trying to get this to pass valgrind leak check and also pass in 2 billion random numbers and divide them between the threads. I keep getting a seg fault once I get to 1 billion random numbers. Where am I allocating wrong or what am I doing wrong?
struct thread
long long int threadID; //The thread id
long long int operations; //The number of threads
void *generateThreads(void *ptr)
struct thread *args = ptr;
struct random_data *rdata = (struct random_data *) calloc(args->operations*64,sizeof(struct random_data));
char *statebuf = (char*) calloc(args->operations*64,BUFSIZE);
long long int i;
int32_t value;
for(i = 0; i < args->operations; i++)
if(DEBUG > 1)
printf("I am thread %lld with thread id %X\n", args->threadID, (unsigned int) pthread_self());
int main(int argc, char **argv)
long long int numRandoms;
long long int numThreads;
double timeStart = 0;
double timeElapsed = 0;
pthread_t *tid;
struct thread args;
if (argc != 3)
fprintf(stderr, "Usage: %s <Number of Randoms> <Number of Threads>\n" ,argv[0]);
/* Assign the arg values to appropriate variables */
sscanf(argv[1],"%lld",&numRandoms); /* lld for long long int */
sscanf(argv[2],"%lld",&numThreads); /* lld for long long int */
/* Number of threads must be less than or equal to the number of random numbers */
if(numRandoms < numThreads)
fprintf(stderr,"Number of threads must be less than or equal to the number of random numers.\n");
long long int i;
args.operations = numRandoms/numThreads;
timeStart = getMilliSeconds();
tid = (pthread_t *) calloc(numThreads,sizeof(pthread_t));
/* value is the thread id, creating threads */
for(i = 0; i < numThreads; i++)
args.threadID = i;
pthread_create(&tid[i],NULL,generateThreads,(void *) &args);
/* Joining the threads */
for(i = 0; i < numThreads; i++)
timeElapsed = getMilliSeconds() - timeStart;
OK, I figured out what you were trying to do. The problem is that the code you copied from used initstate_r in main to set up the states for all threads: it called initstate_r once per thread to set up the RNG for that thread. But you copied that loop into each thread, so you were calling initstate_r many times per thread, which is useless. The *64 was there originally to make each state occupy 64 bytes, in order to keep the states on separate cache lines. You were probably referring to this Stack Overflow question.
Here is your function rewritten to make much more sense:
void *generateThreads(void *ptr)
{
    struct thread *args = ptr;
    struct random_data *rdata = calloc(1,sizeof(struct random_data));
    char statebuf[BUFSIZE];
    long long int i;
    int32_t value;
    initstate_r((int) pthread_self(),statebuf,BUFSIZE,rdata);
    for(i = 0; i < args->operations; i++)
    {
        random_r(rdata, &value);   /* next number from this thread's own state */
        if(DEBUG > 1)
            printf("%d\n", value);
    }
    if(DEBUG > 1)
        printf("I am thread %lld with thread id %X\n", args->threadID, (unsigned int) pthread_self());
    free(rdata);
    return NULL;
}
By the way, the way you pass your arguments to your threads is wrong. You pass the same args to each thread, which means they are sharing the same args structure, which means they each share the same args->threadID. You should instead pass each thread its own args structure.
My answer to question link provides a thread-safe pseudo-random number generator designed for __uint64/__uint128 integers using the xorshift algorithm.
Additional properties:
seeded from two variant sources of entropy
