Squaring numbers w/ multiple threads - c

I'm trying to write a program that squares numbers 1-10,000 by creating 8 threads and each thread will take turns squaring ONE NUMBER EACH. Meaning that one thread will square 1, another will square 2, etc until all threads square a number. Then one thread will square 9, etc, all the way to 10,000. My code is below:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <pthread.h>
#include <sys/types.h>
#define NUMBER_OF_THREADS 8
#define START_NUMBER 1
#define END_NUMBER 10000
FILE *f;
void *sqrtfunc(void *tid) { //function for computing squares
int i;
for (i = START_NUMBER; i<= END_NUMBER; i++){
if ((i % NUMBER_OF_THREADS) == pthread_self()){ //if i%8 == thread id
fprintf(f, "%lu squared = %lu\n", i, i*i); //then that thread should do this
}
}
}
int main(){
//Do not modify starting here
struct timeval start_time, end_time;
gettimeofday(&start_time, 0);
long unsigned i;
f = fopen("./squared_numbers.txt", "w");
//Do not modify ending here
pthread_t mythreads[NUMBER_OF_THREADS]; //thread variable
long mystatus;
for (i = 0; i < NUMBER_OF_THREADS; i++){ //loop to create 8 threads
mystatus = pthread_create(&mythreads[i], NULL, sqrtfunc, (void *)i);
if (mystatus != 0){ //check if pthread_create worked
printf("pthread_create failed\n");
exit(-1);
}
}
for (i = 0; i < NUMBER_OF_THREADS; i++){
if(pthread_join(mythreads[i], NULL)){
printf("Thread failed\n");
}
}
exit(1);
//Do not modify starting here
fclose(f);
gettimeofday(&end_time, 0);
float elapsed = (end_time.tv_sec-start_time.tv_sec) * 1000.0f + \
(end_time.tv_usec-start_time.tv_usec) / 1000.0f;
printf("took %0.2f milliseconds\n", elapsed);
//Do not modify ending here
}
I am not sure where my error is. I create my 8 threads in main, and then depending on their thread id (tid), I want that thread to square a number. As of right now, nothing is being printed into the output file and I can't figure out why. Is my tid comparison not doing anything? Any tips are appreciated. Thanks guys.

First, you intentionally pass a parameter to each thread so that it knows which thread it is (from 0 to 7). That is good, but you never use it inside the thread. Instead you compare against pthread_self(), which returns an opaque thread handle rather than a small index, so the condition is most likely never true. (This is probably one source of the confusion.)
Second, in your explanation of how the algorithm should work, you say each thread must square a different set of numbers, but every thread loops over the same set (indeed the whole range).
You have two approaches to this: let each thread square one number and then move on to the next one eight places further (the algorithm described in your explanation), or give each thread its own block of 1250 consecutive numbers and let it work through that separate interval.
That said, you have to restructure your for loop in one of two ways:
for (i = START_NUMBER + parameter; i <= END_NUMBER; i += NUMBER_OF_THREADS) ...
or
for (i = START_NUMBER + 1250*parameter; i < START_NUMBER + 1250*(parameter+1); i++) ...
That way, each thread runs on a different set of input numbers.

Related

Stuck at counting array elements greater than x in an array in C/MPI

So, I have this problem where I have an array and I must count the numbers that are greater than the element at index k of the array. I implemented a master-worker strategy in which a master takes care of the I/O and splits the work among the workers. In the master process I created the array in a matrix-like shape, so that I could pass the sub-arrays easily to the workers (I know this sounds weird). Also in the master, I read all the input values into my sub-arrays and set comp (the comparison value) to the value at index k.
Then I pass the work portion size, the comparison value and the work data to all the processes (including the master, which gets its share of work). Finally, every worker does its job and reports its result to the master, which adds the received values to its own and then prints the total on the screen.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <math.h>
int main(int argc, char *args[]){
int rank, psize;
MPI_Status status;
MPI_Init(&argc, &args);
MPI_Comm_size(MPI_COMM_WORLD, &psize);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int *workvet, worksize, comp;
if(rank == 0){
int tam, k;
int **subvets, portion;
scanf("%d", &tam);
scanf("%d", &k);
portion = ceil((float)tam/(float)psize);
subvets = malloc(sizeof(int) * psize);
for(int i = 0; i < psize; i++)
subvets[i] = calloc(portion, sizeof(int));
for(int i = 0; i < psize; i++){
for(int j = 0; j < portion; j++){
if((i*j+j) < tam)
scanf("%d ", &subvets[i][j]);
if((i*j+j) == k)
comp = subvets[i][j];
}
}
for(int i = 1; i < psize; i++){
MPI_Send(&portion, 1, MPI_INT, i, i, MPI_COMM_WORLD);
MPI_Send(&comp, 1, MPI_INT, i, i, MPI_COMM_WORLD);
MPI_Send(subvets[i], portion, MPI_INT, i, i, MPI_COMM_WORLD);
}
workvet = calloc(portion, sizeof(int));
workvet = subvets[0];
worksize = portion;
} else {
MPI_Recv(&worksize, 1, MPI_INT, 0, rank, MPI_COMM_WORLD, &status);
MPI_Recv(&comp, 1, MPI_INT, 0, rank, MPI_COMM_WORLD, &status);
workvet = calloc(worksize, sizeof(int));
MPI_Recv(workvet, worksize, MPI_INT, 0, rank, MPI_COMM_WORLD, &status);
}
int maior = 0;
for(int i = 0; i < worksize; i++){
if(workvet[i] > comp)
maior++;
}
if(rank == 0){
int temp;
for(int i = 1; i < psize; i++){
MPI_Recv(&temp, 1, MPI_INT, i, rank, MPI_COMM_WORLD, &status);
maior += temp;
}
printf("%d números maiores que %d", maior, comp);
} else {
MPI_Send(&maior, 1, MPI_INT, 0, rank, MPI_COMM_WORLD);
}
MPI_Finalize();
}
My problem is that it looks like it's stuck in a loop. When trying to debug, I put a printf in the main for loop that does the comparison in the sub-arrays and it printed endlessly; however, when I put the same printf anywhere else in the code, nothing is printed. I don't have any idea where I'm failing or how I can debug my code.
Input data:
10 // size
7 // k
1 2 3 4 5 6 7 8 9 10 // elements
So, my program should count how many elements are greater than the element of index 7, which corresponds to the value 8, and this should return 2 in this case.
This is prefaced by my top comment re. tam being uninitialized.
There are a number of additional issues ...
You're doing scanf to get a value for comp, but in the loops below it, you're assigning a new value to it (i.e. the prompted value is being trashed). That may be perfectly fine if the original value is treated as a default [if the loop fails to assign a new value], but it seems a bit rickety to me.
AFAICT, you are trying to loop on workvet in all processes. But, for the client ones, this does nothing because you don't send back the result [see below].
The clients are sending back maior but they never compute a value for it. And, main does not receive that value. It computes one of its own.
maior has no definition in your posted code. And, therefore is uninitialized [even in main].
It looks like you want the clients to send back a single scalar value of their computed value of maior, but they do no calculation for it.
Thus, the clients send back a garbage maior value that the main process tries to sum.
You're sending portion to the clients, but they receive it as worksize. And, after main sends it, it assigns portion to worksize. I'd recommend using the same name in all places to reduce some confusion.
You've not provided any sample data so it's hard to debug this further here. Part of the problem is that only some of the values in subvets are initialized with the scanf in main, based on the if [or so it appears ...].
So, the clients will loop over possibly uninitialized values in the given subvets array [sent to the client which receives it as workvet].
If the setup loops for subvets are correct as far as which values to send (that is, only certain selected values should be sent), I'm not sure you can do what you want with the 2D array method you have.
Without a problem statement describing the input data and what you want to do with it, it's difficult to divine what would be the correct code, but ...
A few guesses ...
You're calculating highest in all processes [probably useless in main], but then nobody does anything with it. My guess is that you want to calculate this in the client processes only. And, send this back to main as maior.
Then, main can sum the maior values from all the clients?
UPDATE:
I actually changed maior to highest to post the issue here, so it would make a bit of sense (maior is greater in portuguese) but failed to do so for all instances
As I mentioned, I guessed as much -- no worries. Side note: In fact, your English is quite good. And, it was nice of you to translate the code. Some others post in English, but leave the code in their native language. This can slow things down a bit. Sometimes, I've put the code into Google translate just to try to make sense of it.
I just updated the code without the translation to reflect what I'm working on. So, for the subvets part I actually thought of this being a matrix, where I would send each of its lines as being one array to each of the worker threads, and the if statement is there to only read up until the size of the array has been reached, thus, leaving the rest of the values as 0 (because I used calloc, thus making this approach fit to the problem I have to solve)
There's really no need for a 2D array. Just fill a 1D array, and then give each worker different offsets and counts into that single array [see below].
By trying to do everything in a single function main, this is probably what caused some of the problems with separating main and worker tasks.
By splitting things up into [more] functions, this can make things easier. We can use the same variable names in master and worker for the same data without any naming conflicts.
Also, a good maxim ... Don't replicate code
The various MPI_* calls take a lot of parameters because they're general purpose. Isolating them to wrapper functions can make things simpler and debugging easier.
Note that the second argument to MPI_Send/MPI_Recv is a count and not number of bytes (hence, not sizeof) (i.e. a bug). By putting them in wrapper functions, the call could be fixed once in a single place.
I did make a slight change to the split logic. In your code [AFAICT] you were having the main/master process do some of the calculation. That's fine but I prefer to have the main process available as a control process and not encumbered by much data calculation. So, in my version, only the worker processes actually process the array.
Sometimes it helps to isolate the calculation algorithm/logic from the MPI code. I did this below by putting it in a function docalc. This allowed the adding of a diagnostic cross check at the end.
Anyway, below is the code. It's been heavily refactored and has many comments:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <math.h>
// _dbgprt -- debug print
#define _dbgprt(_fmt...) \
do { \
printf("%d: ",myrank); \
printf(_fmt); \
} while (0)
#ifdef DEBUG
#define dbgprt(_fmt...) \
_dbgprt(_fmt)
#else
#define dbgprt(_fmt...) \
do { \
} while (0)
#endif
int myrank; // current rank
int numproc; // number of processes in comm group
// dataload -- read in the data
int *
dataload(FILE *xfsrc,int worksize)
{
int *workvet;
// get enough space
workvet = calloc(worksize,sizeof(int));
// fill the array
for (int idx = 0; idx < worksize; ++idx)
fscanf(xfsrc,"%d",&workvet[idx]);
return workvet;
}
// docalc -- count number of values greater than limit
int
docalc(int *workvet,int worksize,int k)
{
int count = 0;
for (int idx = 0; idx < worksize; ++idx) {
if (workvet[idx] > k)
count += 1;
}
return count;
}
// sendint -- send some data
void
sendint(int rankto,int *data,int count)
{
int tag = 0;
// NOTE: second argument is an array _count_ and _not_ the number of bytes
MPI_Send(data,count,MPI_INT,rankto,tag,MPI_COMM_WORLD);
}
// recvint -- receive some data
void
recvint(int rankfrom,int *data,int count)
{
int tag = 0;
MPI_Status status;
MPI_Recv(data,count,MPI_INT,rankfrom,tag,MPI_COMM_WORLD,&status);
}
// worker -- perform all worker operations
void
worker(void)
{
int master = 0;
// get array count
int worksize;
recvint(master,&worksize,1);
// get limit value
int k;
recvint(master,&k,1);
// allocate space for data
int *workvet = calloc(worksize,sizeof(int));
// get that data
recvint(master,workvet,worksize);
// calculate number of elements higher than limit
int count = docalc(workvet,worksize,k);
// send back result
sendint(master,&count,1);
}
// master -- perform all master operations
void
master(int argc,char **argv)
{
int isfile;
FILE *xfsrc;
int workrank;
// get the data either from stdin or from a file passed on the command line
do {
isfile = 0;
xfsrc = stdin;
if (argc <= 0)
break;
xfsrc = fopen(*argv,"r");
if (xfsrc == NULL) {
perror(*argv);
exit(1);
}
isfile = 1;
} while (0);
// get number of data elements
int worksize;
fscanf(xfsrc,"%d",&worksize);
// get limit [pivot]
int k;
fscanf(xfsrc,"%d",&k);
dbgprt("master: PARAMS worksize=%d k=%d\n",worksize,k);
// read in the data array
int *workvet = dataload(xfsrc,worksize);
if (isfile)
fclose(xfsrc);
// get number of workers
// NOTE: we do _not_ have the master do calculations [for simplicity]
// usually, for large data, we want the master free to control things
int numworkers = numproc - 1;
// get number of elements for each worker
int workper = worksize / numworkers;
dbgprt("master: LOOP numworkers=%d workper=%d\n",numworkers,workper);
// send data to other workers
int remain = worksize;
int offset = 0;
int portion;
for (workrank = 1; workrank < numproc; ++workrank,
offset += portion, remain -= portion) {
// get amount for this worker
portion = workper;
// last proc must get all remaining work
if (workrank == (numproc - 1))
portion = remain;
dbgprt("master: WORK/%d offset=%d portion=%d\n",
workrank,offset,portion);
// send the worker's data count
sendint(workrank,&portion,1);
// send the pivot point
sendint(workrank,&k,1);
// send the data to worker
sendint(workrank,&workvet[offset],portion);
}
// accumulate count
int total = 0;
int count;
for (workrank = 1; workrank < numproc; ++workrank) {
recvint(workrank,&count,1);
total += count;
}
printf("%d numbers bigger than %d\n",total,k);
// do cross check of MPI result against a simple single process solution
#ifdef CHECK
count = docalc(workvet,worksize,k);
printf("master count was %d -- %s\n",
count,(count == total) ? "PASS" : "FAIL");
#endif
}
// main -- main program
int
main(int argc,char **argv)
{
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numproc);
MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
// skip over program name
--argc;
++argv;
if (myrank == 0)
master(argc,argv);
else
worker();
MPI_Finalize();
return 0;
}

why is this c code causing a race condition?

I'm trying to count the number of prime numbers up to 10 million, and I have to do it with multiple POSIX threads (so that each thread computes a subset of the 10 million). However, my code is not checking the isPrime condition. I'm thinking this is due to a race condition. If it is, what can I do to fix this issue?
I've tried using a global integer array with k elements but since k is not defined it won't let me declare that at the file scope.
I'm running my code using gcc -pthread:
/*
Program that spawns off "k" threads
k is read in at command line each thread will compute
a subset of the problem domain(check if the number is prime)
to compile: gcc -pthread lab5_part2.c -o lab5_part2
*/
#include <math.h>
#include <stdio.h>
#include <time.h>
#include <pthread.h>
#include <stdlib.h>
typedef int bool;
#define FALSE 0
#define TRUE 1
#define N 10000000 // 10 Million
int k; // global variable k willl hold the number of threads
int primeCount = 0; //it will hold the number of primes.
//returns whether num is prime
bool isPrime(long num) {
long limit = sqrt(num);
for(long i=2; i<=limit; i++) {
if(num % i == 0) {
return FALSE;
}
}
return TRUE;
}
//function to use with threads
void* getPrime(void* input){
//get the thread id
long id = (long) input;
printf("The thread id is: %ld \n", id);
//how many iterations each thread will have to do
int numOfIterations = N/k;
//check the last thread. to make sure is a whole number.
if(id == k-1){
numOfIterations = N - (numOfIterations * id);
}
long startingPoint = (id * numOfIterations);
long endPoint = (id + 1) * numOfIterations;
for(long i = startingPoint; i < endPoint; i +=2){
if(isPrime(i)){
primeCount ++;
}
}
//terminate calling thread.
pthread_exit(NULL);
}
int main(int argc, char** args) {
//get the num of threads from command line
k = atoi(args[1]);
//make sure is working
printf("Number of threads is: %d\n",k );
struct timespec start,end;
//start clock
clock_gettime(CLOCK_REALTIME,&start);
//create an array of threads to run
pthread_t* threads = malloc(k * sizeof(pthread_t));
for(int i = 0; i < k; i++){
pthread_create(&threads[i],NULL,getPrime,(void*)(long)i);
}
//wait for each thread to finish
int retval;
for(int i=0; i < k; i++){
int * result = NULL;
retval = pthread_join(threads[i],(void**)(&result));
}
//get the time time_spent
clock_gettime(CLOCK_REALTIME,&end);
double time_spent = (end.tv_sec - start.tv_sec) +
(end.tv_nsec - start.tv_nsec)/1000000000.0f;
printf("Time tasken: %f seconds\n", time_spent);
printf("%d primes found.\n", primeCount);
}
the current output I am getting: (using the 2 threads)
Number of threads is: 2
Time tasken: 0.038641 seconds
2 primes found.
The counter primeCount is modified by multiple threads, and therefore must be atomic. To fix this using the standard library (which is now supported by POSIX as well), you should #include <stdatomic.h>, declare primeCount as an atomic_int, and increment it with an atomic_fetch_add() or atomic_fetch_add_explicit().
Better yet, if you don’t care about the result until the end, each thread can store its own count in a separate variable, and the main thread can add all the counts together once the threads finish. You will need to create, in the main thread, an atomic counter per thread (so that updates don’t clobber other data in the same cache line), pass each thread a pointer to its output parameter, and then return the partial tally to the main thread through that pointer.
This looks like an exercise that you want to solve yourself, so I won’t write the code for you, but the approach to use would be to declare an array of counters like the array of thread IDs, and pass &counters[i] as the arg parameter of pthread_create() similarly to how you pass &threads[i]. Each thread would need its own counter. At the end of the thread procedure, you would write something like, atomic_store_explicit( (atomic_int*)arg, localTally, memory_order_relaxed );. This should be completely wait-free on all modern architectures.
You might also decide that it’s not worth going to that trouble to avoid a single atomic update per thread, declare primeCount as an atomic_int, and then atomic_fetch_add_explicit( &primeCount, localTally, memory_order_relaxed ); once before the thread procedure terminates.

How to create thread blocks?

I have two questions.
First:
I need to create threads in blocks, with no more than some maximum number (for example 20) running at once.
For example, the first 20 threads run and finish their jobs; only then do the next 20 threads start, and so on in a loop.
The total number of jobs could be much larger than the number of threads, but the number of threads alive at any time must not exceed the maximum value (20 in our example).
Second:
Could threads be added continuously? For example, 20 threads are running and one finishes its job; now there are 19 threads but the maximum is 20, so we can create one more thread straight away.
That way we don't waste time waiting for the other threads' jobs to finish, and the total number of threads never exceeds the maximum value (20 in our example).
Conclusion:
I expect the second variant to be much faster, and I would be very grateful if you could help me with it, but please also explain how to do the first variant.
Here is my code (it's not working properly and the result is strange: the some_array elements become wrong after the eleventh step of the loop, with output such as Thread counter = 32748):
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <math.h>
#define num_threads 5 /* total max number of threads */
#define lines 17 /* total jobs to be done */
/* args for thread start function */
typedef struct {
int *words;
} args_struct;
/* thread start function */
void *thread_create(void *args) {
args_struct *actual_args = args;
printf("Thread counter = %d\n", *actual_args->words);
free(actual_args);
}
/* main function */
int main(int argc, char argv[]) {
float block;
int i = 0;
int j = 0;
int g;
int result_code;
int *ptr[num_threads];
int some_array[lines] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
pthread_t threads[num_threads];
/* counting how many block we need */
block = ceilf(lines / (double)num_threads);
printf("blocks= %f\n", block);
/* doing ech thread block continuously */
for (g = 1; g <= block; g++) {
//for (i; i < num_threads; ++i) { i < (num_threads * g),
printf("g = %d\n", g);
for (i; i < lines; ++i) {
printf("i= %d\n", i);
/* locate memory to args */
args_struct *args = malloc(sizeof *args);
args->words = &some_array[i];
if(pthread_create(&threads[i], NULL, thread_create, args)) {
free(args);
/* goto error_handler */
}
}
/* wait for each thread to complete */
for (j; j < lines; ++j) {
printf("j= %d\n", j);
result_code = pthread_join(threads[j], (void**)&(ptr[j]));
assert(0 == result_code);
}
}
return 0;
}

Unix c program to calculate pi using threads

Been working on this assignment for class. Put this code together but it's giving me several errors I'm not able to solve.
Code
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
//global variables
int N, T;
double vsum[T];
//pie function
void* pie_runner(void* arg)
{
double *limit_ptr = (double*) arg;
double j = *limit_ptr;
for(int i = (N/T)*j; i<=((N/T)*(j+1)-1); j++)
{
if(i %2 =0)
vsum[j] += 4/((2*j)*(2*j+1)*(2*j+2));
else
vsum[j] -= 4/((2*j)*(2*j+1)*(2*j+2));
}
pthread_exit(0);
}
int main(int argc, char **argv)
{
if(argc != 3) {
printf("Error: Must send it 2 parameters, you sent %s", argc);
exit(1);
}
N = atoi[1];
T = atoi[2];
if(N !> T) {
printf("Error: Number of terms must be greater then number of threads.");
exit(1);
}
for(int p=0; p<T; p++) //initialize array to 0
{
vsum[p] = 0;
}
double pie = 3;
//launch threads
pthread_t tids[T];
for(int i = 0; i<T; i++)
{
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_create(&tids[i], &attr, pie_runner, &i);
}
//wait for threads...
for(int k = 0; k<T; k++)
{
pthread_join(tids[k], NULL);
}
for(int x=0; x<T; x++)
{
pie += vsum[x];
}
printf("pi computed with %d terms in %s threads is %k\n", N, T, pie);
}
One of the problems I'm having is with the array up top. It needs to be a global variable but it keeps telling me it's not a constant, even when I declare it as such.
Any help is appreciated, with the rest of the code also.
EDIT: After updating the code using the comments below, here is the new code. I still have a few warnings and would appreciate help dealing with them.
1) warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] int j = (int)arg;
2) warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] pthread_create(.......... , (void*)i);
NEW CODE:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
//global variables
int N, T;
double *vsum;
//pie function
void* pie_runner(void* arg)
{
long j = (long)arg;
//double *limit_ptr = (double*) arg;
//double j = *limit_ptr;
//for(int i = (j-1)*N/T; i < N*(j) /T; i++)
for(int i = (N/T)*(j-1); i < ((N/T)*(j)); i++)
{
if(i % 2 == 0){
vsum[j] += 4.0/((2*j)*(2*j+1)*(2*j+2));
//printf("vsum %lu = %f\n", j, vsum[j]);
}
else{
vsum[j] -= 4.0/((2*j)*(2*j+1)*(2*j+2));
//printf("vsum %lu = %f\n", j, vsum[j]);
}
}
pthread_exit(0);
}
int main(int argc, char **argv)
{
if(argc != 3) {
printf("Error: Must send it 2 parameters, you sent %d\n", argc-1);
exit(1);
}
N = atoi(argv[1]);
T = atoi(argv[2]);
vsum = malloc((T+1) * sizeof(*vsum));
if(vsum == NULL) {
fprintf(stderr, "Memory allocation problem\n");
exit(1);
}
if(N <= T) {
printf("Error: Number of terms must be greater then number of threads.\n");
exit(1);
}
for(int p=1; p<=T; p++) //initialize array to 0
{
vsum[p] = 0;
}
double pie = 3.0;
//launch threads
pthread_t tids[T+1];
for(long i = 1; i<=T; i++)
{
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_create(&tids[i], &attr, pie_runner, (void*)i);
}
//wait for threads...
for(int k = 1; k<=T; k++)
{
pthread_join(tids[k], NULL);
}
for(int x=1; x<=T; x++)
{
pie += vsum[x];
}
printf("pi computed with %d terms in %d threads is %.20f\n", N, T, pie);
//printf("pi computed with %d terms in %d threads is %20f\n", N, T, pie);
free(vsum);
}
Values not working:
./pie1 2 1
pi computed with 2 terms in 1 threads is 3.00000000000000000000
./pie1 3 1
pi computed with 3 terms in 1 threads is 3.16666666666666651864
./pie1 3 2
pi computed with 3 terms in 2 threads is 3.13333333333333330373
./pie1 4 2
pi computed with 4 terms in 2 threads is 3.00000000000000000000
./pie1 4 1
pi computed with 4 terms in 1 threads is 3.00000000000000000000
./pie1 4 3
pi computed with 4 terms in 3 threads is 3.14523809523809516620
./pie1 10 1
pi computed with 10 terms in 1 threads is 3.00000000000000000000
./pie1 10 2
pi computed with 10 terms in 2 threads is 3.13333333333333330373
./pie1 10 3
pi computed with 10 terms in 3 threads is 3.14523809523809516620
./pie1 10 4
pi computed with 10 terms in 4 threads is 3.00000000000000000000
./pie1 10 5
pi computed with 10 terms in 5 threads is 3.00000000000000000000
./pie1 10 6
pi computed with 10 terms in 6 threads is 3.14088134088134074418
./pie1 10 7
pi computed with 10 terms in 7 threads is 3.14207181707181693042
./pie1 10 8
pi computed with 10 terms in 8 threads is 3.14125482360776464574
./pie1 10 9
pi computed with 10 terms in 9 threads is 3.14183961892940200045
./pie1 11 2
pi computed with 11 terms in 2 threads is 3.13333333333333330373
./pie1 11 4
pi computed with 11 terms in 4 threads is 3.00000000000000000000
There are numerous problems with that code. Your specific problem is that, in C, variable length arrays (VLAs) are not permitted at file scope.
So, if you want that array to be dynamic, you will have to declare the pointer to it and allocate it yourself:
int N, T;
double *vsum;
and then, in main() after T has been set:
vsum = malloc (T * sizeof(*vsum));
if (vsum == NULL) {
fprintf (stderr, "Memory allocation problem\n");
exit (1);
}
remembering to free it before exiting (not technically required but good form anyway):
free (vsum);
Among the other problems:
1/ There is no !> operator in C. "Not greater than" is written <=, so I suspect the line should be:
if (N <= T) {
rather than:
if (N !> T) {
2/ To get the arguments from the command line, change:
N = atoi[1];
T = atoi[2];
into:
N = atoi(argv[1]);
T = atoi(argv[2]);
3/ The comparison operator is ==, not =, so you need to change:
if(i %2 =0)
into:
if (i % 2 == 0)
4/ Your error message about not having enough parameters needs to use %d rather than %s, as argc is an integral type:
printf ("Error: Must send it 2 parameters, you sent %d\n", argc-1);
Ditto for your calculation message at the end (and fixing the %k for the floating point value):
printf ("pi computed with %d terms in %d threads is %.20f\n", N, T, pie);
5/ You pass an integer pointer into your thread function but there are two problems with that.
The first is that you then extract it into a double j, which cannot be used as an array index. If it's an integer being passed in, it should be turned back into an integer.
The second is that there is no guarantee the new thread will extract the value (or even start running its code at all) before the main thread changes that value to start up another thread. You should probably just convert the integer to a void * directly rather than messing about with integer pointers.
To fix both those, use this when creating the thread:
pthread_create (&tids[i], &attr, pie_runner, (void*)i);
and this at the start of the thread function:
int j = (int) arg;
If you get warnings or experience problems with that, it's probably because your integers and pointers are not compatible sizes. In that case, you could try something like:
pthread_create (&tids[i], &attr, pie_runner, (void*)(intptr_t)i);
though I'm not sure that will work any better.
Alternatively (though it's a bit of a kludge), stick with your pointer solution and just make sure there's no possibility of race conditions (by passing a unique pointer per thread).
First, revert the thread function to receiving its value by a pointer:
int j = *((int*) arg);
Then, before you start creating threads, you need to create a thread integer array and, for each thread created, populate and pass the (address of the) correct index of that array:
int tvals[T]; // add this line.
for (int i = 0; i < T; i++) {
tvals[i] = i; // and this one.
pthread_attr_t attr;
pthread_attr_init (&attr);
pthread_create (&tids[i], &attr, pie_runner, &(tvals[i]));
}
That shouldn't be too onerous unless you have so many threads that the extra array becomes a problem. But, if you have that many threads, you're going to have far greater problems.
6/ Your loop in the thread incorrectly incremented j rather than i. Since this is the same area touched by the following section, I'll correct it there.
7/ The use of integers in what is predominantly a floating point calculation means that you have to arrange your calculations so that they don't truncate divisions, such as 10 / 4 -> 2 where it should be 2.5.
That means the loop in the thread function should be changed as follows (including incrementing i as in previous point):
for (int i = j*N/T; i <= N * (j+1) / T - 1; i++)
if(i % 2 == 0)
vsum[j] += 4.0/((2*j)*(2*j+1)*(2*j+2));
else
vsum[j] -= 4.0/((2*j)*(2*j+1)*(2*j+2));
With all those changes, you get a reasonably sensible result:
$ ./picalc 100 101
pi computed with 100 terms in 101 threads is 3.14159241097198238535
Two problems with that array: The first is that T is not a compile-time constant, which it needs to be for a file-scope array in C (and for any array in C++). The second is that T is zero-initialized at that point, meaning that even if it compiled the array would have a size of zero and all indexing of the array would be out of bounds.
You need to allocate the array dynamically once you have read T and know the size. In C you use malloc for that, in C++ you should use std::vector instead.

Multithreading pthread errors

I'm trying to create a multithreaded application in C for Linux with the pthreads library that approximates pi using an infinite series with N+1 terms. The variables N and T are passed on the command line. I am using the Nilakantha approximation formula for pi. N is the upper limit of the number sequence to sum and T is the number of child threads that calculate that sum. For example, if I run the command "./pie 100 4", the parent thread will create 4 child threads indexed 0 to 3. I have a global variable called vsum, a double array allocated dynamically with malloc, to hold the partial sums. So with 4 threads and 100 as the upper bound, my program should compute:
Thread 0 computes the partial sum for i going from 0 to 24 stored to an element vsum[0]
Thread 1 computes the partial sum for i going from 25 to 49 stored to an element vsum[1]
Thread 2 computes the partial sum for i going from 50 to 74 stored to an element vsum[2]
Thread 3 computes the partial sum for i going from 75 to 99 stored to an element vsum[3]
After each thread has made its calculations, the main thread will compute the sum by adding together all numbers from vsum[0] to vsum[T-1].
I'm just starting to learn about threads and processes. Any help or advice would be appreciated. Thank you.
Code I wrote so far:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
double *vsum;
int N, T;
void *PI(void *sum) //takes param sum and gets close to pi
{
int upper = (int)sum;
double pi = 0;
int k = 1;
for (int i = (N/T)*upper; i <= (N/T)*(upper+1)-1; i++)
{
pi += k*4/((2*i)*(2*i+1)*(2*i+2));
if(i = (N/T)*(upper+1)-1)
{
vsum[upper] = pi;
}
k++;
}
pthread_exit(0);
}
int main(int argc, char*argv[])
{
T = atoi(argv[2]);
N = atoi(argv[1]);
if (N<T)
{
fprintf(stderr, "Upper bound(N) < # of threads(T)\n");
return -1;
}
int pie = 0;
pthread_t tid[T]; //thread identifier
pthread_attr_t attr; //thread attributes
vsum = (double *)malloc(sizeof(double));//creates dyn arr
//Initialize vsum to [0,0...0]
for (int i = 0; i < T; i++){
{
vsum[i] = 0;
}
if(argc!=2) //command line does not give proper # of values
{
fprintf(stderr, "usage: commandline error <integer values>\n");
return -1;
}
if (atoi(argv[1]) <0) //if its is negative/sum error
{
fprintf(stderr, "%d must be >=0\n", atoi(argv[1]));
return -1;
}
//CREATE A LOOP THAT MAKES PARAM N #OF THREADS
pthread_attr_init(&attr);
for(int j =0; j < T;j++)
{
int from = (N/T)*j;
int to = (N/T)*(j+1)-1;
//CREATE ARRAY VSUM TO HOLD VALUES FOR PI APPROX.
pthread_create(&tid[j],&attr,PI,(void *)j);
printf("Thread %d computes the partial sum for i going from %d to %d stored to an element vsum[%d]\n", j, from, to, j);
}
//WAITS FOR THREADS TO FINISH
for(int j =0; j <T; i++)
{
pthread_join(tid[j], NULL);
}
//LOOP TO ADD ALL THE vsum array values to get pi approximation
for(int i = 0; i < T; i++)
{
pie += vsum[i];
}
pie = pie +3;
printf("pi computed with %d terms in %d threads is %d\n",N,T,pie);
vsum = realloc(vsum, 0);
pthread_exit(NULL);
return 0;
}
Here is the compiler error I get and don't understand. What am I missing here?
^
pie.c:102:1: error: expected declaration or statement at end of input
}
When I try to run my program I get the following:
./pie.c: line 6: double: command not found
./pie.c: line 7: int: command not found
./pie.c: line 8: int: command not found
./pie.c: line 10: syntax error near unexpected token `('
./pie.c: line 10: `void *PI(void *sum) //takes param sum and gets close to pi'
I haven't looked at the logic of your code, but I see the following programming errors.
Change
pthread_create(&tid[j],&attr,PI,j);
to
pthread_create(&tid[j],&attr,PI,(void *)j);
pthread_create() takes a void * as its 4th parameter, which is passed to the thread function.
Also fix your thread function PI to use the passed parameter as an int, like
void *PI(void *sum) //takes param sum and gets close to pi
{
int upper = (int)sum; //don't use `atoi` as passed param is int.
...
//your existing code
}
The third error is in the line
realloc(vsum, 0);
Passing a size of 0 to realloc does not resize anything useful: at best it frees vsum, and the exact behavior is implementation-defined, so just call free(vsum) instead. If you do want to reallocate, you must keep the pointer returned by the function, since the block may have moved.
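The syntax of pthread_create is
pthread_create(&threadId, &threadAttribute, startRoutine, argumentForStartRoutine);
Ex:
#include <pthread.h>
#include <stdio.h>
// The start routine must take a void * and return a void *.
void *printLetter(void *p)
{
    int i = 0;
    char c = *(char *)p; // the argument points at the char to print
    while (i < 10000)
    {
        printf("%c", c);
        i++; // advance the counter so the loop terminates
    }
    return NULL;
}
int main()
{
    pthread_t thread_id;
    char c = 'x';
    // c outlives the thread because main joins it before returning
    pthread_create(&thread_id, NULL, printLetter, &c);
    pthread_join(thread_id, NULL);
    return 0;
}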
