I have two questions.
First:
I need to create thread blocks gradually not more then some max value, for example 20.
For example, first 20 thread go, job is finished, only then 20 second thread go, and so on in a loop.
Total number of jobs could be much larger then total number of threads (in our example 20), but total number of threads should not be bigger then our max value (in our example 20).
Second:
Could threads be added continuously? For example, 20 threads go, one thread job is finished, we see that total number of threads is 19 but our max value is 20, so we can create one more thread, and one more thread go :)
So we don't waste a time waiting another threads job to be done and our total threads number is not bigger then our some max value (20 in our example) - sounds cool.
Conclusion:
For total speed I consider the second variant would be much faster and better, and I would be very graceful if you help me with this, but also tell how to do the first variant.
Here is me code (it's not working properly and the result is strange - some_array elements become wrong after eleven step in a loop, something like this: Thread counter = 32748):
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <math.h>
#define num_threads 5 /* total max number of threads */
#define lines 17 /* total jobs to be done */
/* args for thread start function */
typedef struct {
int *words;
} args_struct;
/* thread start function */
void *thread_create(void *args) {
args_struct *actual_args = args;
printf("Thread counter = %d\n", *actual_args->words);
free(actual_args);
}
/* main function */
int main(int argc, char argv[]) {
float block;
int i = 0;
int j = 0;
int g;
int result_code;
int *ptr[num_threads];
int some_array[lines] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
pthread_t threads[num_threads];
/* counting how many block we need */
block = ceilf(lines / (double)num_threads);
printf("blocks= %f\n", block);
/* doing ech thread block continuously */
for (g = 1; g <= block; g++) {
//for (i; i < num_threads; ++i) { i < (num_threads * g),
printf("g = %d\n", g);
for (i; i < lines; ++i) {
printf("i= %d\n", i);
/* locate memory to args */
args_struct *args = malloc(sizeof *args);
args->words = &some_array[i];
if(pthread_create(&threads[i], NULL, thread_create, args)) {
free(args);
/* goto error_handler */
}
}
/* wait for each thread to complete */
for (j; j < lines; ++j) {
printf("j= %d\n", j);
result_code = pthread_join(threads[j], (void**)&(ptr[j]));
assert(0 == result_code);
}
}
return 0;
}
Related
I am working with a large project and I am trying to create a test that does the following thing: first, create 5 threads. Each one of this threads will create 4 threads, which in turn each one creates 3 other threads. All this happens until 0 threads.
I have _ThreadInit() function used to create a thread:
status = _ThreadInit(mainThreadName, ThreadPriorityDefault, &pThread, FALSE);
where the 3rd parameter is the output(the thread created).
What am I trying to do is start from a number of threads that have to be created n = 5, the following way:
for(int i = 0; i < n; i++){
// here call the _ThreadInit method which creates a thread
}
I get stuck here. Please, someone help me understand how it should be done. Thanks^^
Building on Eugene Sh.'s comment, you could create a function that takes a parameter, which is the number of threads to create, that calls itself recursively.
Example using standard C threads:
#include <stdbool.h>
#include <stdio.h>
#include <threads.h>
int MyCoolThread(void *arg) {
int num = *((int*)arg); // cast void* to int* and dereference
printf("got %d\n", num);
if(num > 0) { // should we start any threads at all?
thrd_t pool[num];
int next_num = num - 1; // how many threads the started threads should start
for(int t = 0; t < num; ++t) { // loop and create threads
// Below, MyCoolThread creates a thread that executes MyCoolThread:
if(thrd_create(&pool[t], MyCoolThread, &next_num) != thrd_success) {
// We failed to create a thread, set `num` to the number of
// threads we actually created and break out.
num = t;
break;
}
}
int result;
for(int t = 0; t < num; ++t) { // join all the started threads
thrd_join(pool[t], &result);
}
}
return 0;
}
int main() {
int num = 5;
MyCoolThread(&num); // fire it up
}
Statistics from running:
1 thread got 5
5 threads got 4
20 threads got 3
60 threads got 2
120 threads got 1
120 threads got 0
I'm trying to count the number of prime numbers up to 10 million and I have to do it using multiple threads using Posix threads(so, that each thread computes a subset of 10 million). However, my code is not checking for the condition IsPrime. I'm thinking this is due to a race condition. If it is what can I do to ameliorate this issue?
I've tried using a global integer array with k elements but since k is not defined it won't let me declare that at the file scope.
I'm running my code using gcc -pthread:
/*
Program that spawns off "k" threads
k is read in at command line each thread will compute
a subset of the problem domain(check if the number is prime)
to compile: gcc -pthread lab5_part2.c -o lab5_part2
*/
#include <math.h>
#include <stdio.h>
#include <time.h>
#include <pthread.h>
#include <stdlib.h>
typedef int bool;
#define FALSE 0
#define TRUE 1
#define N 10000000 // 10 Million
int k; // global variable k willl hold the number of threads
int primeCount = 0; //it will hold the number of primes.
//returns whether num is prime
bool isPrime(long num) {
long limit = sqrt(num);
for(long i=2; i<=limit; i++) {
if(num % i == 0) {
return FALSE;
}
}
return TRUE;
}
//function to use with threads
void* getPrime(void* input){
//get the thread id
long id = (long) input;
printf("The thread id is: %ld \n", id);
//how many iterations each thread will have to do
int numOfIterations = N/k;
//check the last thread. to make sure is a whole number.
if(id == k-1){
numOfIterations = N - (numOfIterations * id);
}
long startingPoint = (id * numOfIterations);
long endPoint = (id + 1) * numOfIterations;
for(long i = startingPoint; i < endPoint; i +=2){
if(isPrime(i)){
primeCount ++;
}
}
//terminate calling thread.
pthread_exit(NULL);
}
int main(int argc, char** args) {
//get the num of threads from command line
k = atoi(args[1]);
//make sure is working
printf("Number of threads is: %d\n",k );
struct timespec start,end;
//start clock
clock_gettime(CLOCK_REALTIME,&start);
//create an array of threads to run
pthread_t* threads = malloc(k * sizeof(pthread_t));
for(int i = 0; i < k; i++){
pthread_create(&threads[i],NULL,getPrime,(void*)(long)i);
}
//wait for each thread to finish
int retval;
for(int i=0; i < k; i++){
int * result = NULL;
retval = pthread_join(threads[i],(void**)(&result));
}
//get the time time_spent
clock_gettime(CLOCK_REALTIME,&end);
double time_spent = (end.tv_sec - start.tv_sec) +
(end.tv_nsec - start.tv_nsec)/1000000000.0f;
printf("Time tasken: %f seconds\n", time_spent);
printf("%d primes found.\n", primeCount);
}
the current output I am getting: (using the 2 threads)
Number of threads is: 2
Time tasken: 0.038641 seconds
2 primes found.
The counter primeCount is modified by multiple threads, and therefore must be atomic. To fix this using the standard library (which is now supported by POSIX as well), you should #include <stdatomic.h>, declare primeCount as an atomic_int, and increment it with an atomic_fetch_add() or atomic_fetch_add_explicit().
Better yet, if you don’t care about the result until the end, each thread can store its own count in a separate variable, and the main thread can add all the counts together once the threads finish. You will need to create, in the main thread, an atomic counter per thread (so that updates don’t clobber other data in the same cache line), pass each thread a pointer to its output parameter, and then return the partial tally to the main thread through that pointer.
This looks like an exercise that you want to solve yourself, so I won’t write the code for you, but the approach to use would be to declare an array of counters like the array of thread IDs, and pass &counters[i] as the arg parameter of pthread_create() similarly to how you pass &threads[i]. Each thread would need its own counter. At the end of the thread procedure, you would write something like, atomic_store_explicit( (atomic_int*)arg, localTally, memory_order_relaxed );. This should be completely wait-free on all modern architectures.
You might also decide that it’s not worth going to that trouble to avoid a single atomic update per thread, declare primeCount as an atomic_int, and then atomic_fetch_add_explicit( &primeCount, localTally, memory_order_relaxed ); once before the thread procedure terminates.
I'm trying to write a program that squares numbers 1-10,000 by creating 8 threads and each thread will take turns squaring ONE NUMBER EACH. Meaning that one thread will square 1, another will square 2, etc until all threads square a number. Then one thread will square 9, etc, all the way to 10,000. My code is below:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <pthread.h>
#include <sys/types.h>
#define NUMBER_OF_THREADS 8
#define START_NUMBER 1
#define END_NUMBER 10000
FILE *f;
void *sqrtfunc(void *tid) { //function for computing squares
int i;
for (i = START_NUMBER; i<= END_NUMBER; i++){
if ((i % NUMBER_OF_THREADS) == pthread_self()){ //if i%8 == thread id
fprintf(f, "%lu squared = %lu\n", i, i*i); //then that thread should do this
}
}
}
int main(){
//Do not modify starting here
struct timeval start_time, end_time;
gettimeofday(&start_time, 0);
long unsigned i;
f = fopen("./squared_numbers.txt", "w");
//Do not modify ending here
pthread_t mythreads[NUMBER_OF_THREADS]; //thread variable
long mystatus;
for (i = 0; i < NUMBER_OF_THREADS; i++){ //loop to create 8 threads
mystatus = pthread_create(&mythreads[i], NULL, sqrtfunc, (void *)i);
if (mystatus != 0){ //check if pthread_create worked
printf("pthread_create failed\n");
exit(-1);
}
}
for (i = 0; i < NUMBER_OF_THREADS; i++){
if(pthread_join(mythreads[i], NULL)){
printf("Thread failed\n");
}
}
exit(1);
//Do not modify starting here
fclose(f);
gettimeofday(&end_time, 0);
float elapsed = (end_time.tv_sec-start_time.tv_sec) * 1000.0f + \
(end_time.tv_usec-start_time.tv_usec) / 1000.0f;
printf("took %0.2f milliseconds\n", elapsed);
//Do not modify ending here
}
I am not sure where my error is. I create my 8 threads in main, and then depending on their thread id (tid), I want that thread to square a number. As of right now, nothing is being printed into the output file and I can't figure out why. Is my tid comparison not doing anything? Any tips are appreciated. Thanks guys.
First, you intentionally pass a parameter to each thread so it know which thread it is (from 0 to 7) That is good, but you then don't use it anymore inside the thread (this leads to one of the possible confussions you have)
Second, as you say in the explanation of how the algorithm should go, you say each thread must square a different set of numbers, but all of them do square the same set of numbers (indeed the whole set of numbers)
You have two approaches to this: Let each thread square the number, and go for the next, eight places further (so the algorithm is the one described in your explanation) or you give different sets (each 1250 consecutive numbers) and let each thread act on is own separate interval.
That said, you have to reconstruct your for loop to do one of two:
for (i = parameter; i < MAX; i += 8) ...
or
for (i = 1250*parameter; i < 1250*(parameter+1); i++) ...
that way, you'll get each thread run with a different set of input numbers.
I'm looking to do a matrix multiply using threads where each thread does a single multiplication and then the main thread will add up all of the results and place them in the appropriate spot in the final matrix (after the other threads have exited).
The way I am trying to do it is to create a single row array that holds the results of each thread. Then I would go through the array and add + place the results in the final matrix.
Ex: If you have the matrices:
A = [{1,4}, {2,5}, {3,6}]
B = [{8,7,6}, {5,4,3}]
Then I want an array holding [8, 20, 7, 16, 6, 12, 16 etc]
I would then loop through the array adding up every 2 numbers and placing them in my final array.
This is a HW assignment so I am not looking for exact code, but some logic on how to store the results in the array properly. I'm struggling with how to keep track of where I am in each matrix so that I don't miss any numbers.
Thanks.
EDIT2: Forgot to mention that there must be a single thread for every single multiplication to be done. Meaning for the example above, there will be 18 threads each doing its own calculation.
EDIT: I'm currently using this code as a base to work off of.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define M 3
#define K 2
#define N 3
#define NUM_THREADS 10
int A [M][K] = { {1,4}, {2,5}, {3,6} };
int B [K][N] = { {8,7,6}, {5,4,3} };
int C [M][N];
struct v {
int i; /* row */
int j; /* column */
};
void *runner(void *param); /* the thread */
int main(int argc, char *argv[]) {
int i,j, count = 0;
for(i = 0; i < M; i++) {
for(j = 0; j < N; j++) {
//Assign a row and column for each thread
struct v *data = (struct v *) malloc(sizeof(struct v));
data->i = i;
data->j = j;
/* Now create the thread passing it data as a parameter */
pthread_t tid; //Thread ID
pthread_attr_t attr; //Set of thread attributes
//Get the default attributes
pthread_attr_init(&attr);
//Create the thread
pthread_create(&tid,&attr,runner,data);
//Make sure the parent waits for all thread to complete
pthread_join(tid, NULL);
count++;
}
}
//Print out the resulting matrix
for(i = 0; i < M; i++) {
for(j = 0; j < N; j++) {
printf("%d ", C[i][j]);
}
printf("\n");
}
}
//The thread will begin control in this function
void *runner(void *param) {
struct v *data = param; // the structure that holds our data
int n, sum = 0; //the counter and sum
//Row multiplied by column
for(n = 0; n< K; n++){
sum += A[data->i][n] * B[n][data->j];
}
//assign the sum to its coordinate
C[data->i][data->j] = sum;
//Exit the thread
pthread_exit(0);
}
Source: http://macboypro.wordpress.com/2009/05/20/matrix-multiplication-in-c-using-pthreads-on-linux/
You need to store M * K * N element-wise products. The idea is presumably that the threads will all run in parallel, or at least will be able to do, so each thread needs its own distinct storage location of appropriate type. A straightforward way to do that would be to create an array with that many elements ... but of what element type?
Each thread will need to know not only where to store its result, but also which multiplication to perform. All of that information needs to be conveyed via a single argument of type void *. One would typically, then, create a structure type suitable for holding all the data needed by one thread, create an instance of that structure type for each thread, and pass pointers to those structures. Sounds like you want an array of structures, then.
The details could be worked a variety of ways, but the one that seems most natural to me is to give the structure members for the two factors, and a member in which to store the product. I would then have the main thread declare a 3D array of such structures (if the needed total number is smallish) or else dynamically allocate one. For example,
struct multiplication {
// written by the main thread; read by the compute thread:
int factor1;
int factor2;
// written by the compute thread; read by the main thread:
int product;
} partial_result[M][K][N];
How to write code around that is left as the exercise it is intended to be.
Not sure haw many threads you would need to dispatch and I am also not sure if you would use join later to pick them up. I am guessing you are in C here so I would use the thread id as a way to track which row to process .. something like :
#define NUM_THREADS 64
/*
* struct to pass parameters to a dispatched thread
*/
typedef struct {
int value; /* thread number */
char somechar[128]; /* char data passed to thread */
unsigned long ret;
struct foo *row;
} thread_parm_t;
Where I am guessing that each thread will pick up its row data in the pointer *row which has some defined type foo. A bunch of integers or floats or even complex types. Whatever you need to pass to the thread.
/*
* the thread to actually crunch the row data
*/
void *thr_rowcrunch( void *parm );
pthread_t tid[NUM_THREADS]; /* POSIX array of thread IDs */
Then in your main code segment something like :
thread_parm_t *parm=NULL;
Then dispatch the threads with something like :
for ( i = 0; i < NUM_THREADS; i++) {
parm = malloc(sizeof(thread_parm_t));
parm->value = i;
strcpy(parm->somechar, char_data_to-pass );
fill_in_row ( parm->row, my_row_data );
pthread_create(&tid[i], NULL, thr_insert, (void *)parm);
}
Then later on :
for ( i = 0; i < NUM_THREADS; i++)
pthread_join(tid[i], NULL);
However the real work needs to be done in thr_rowcrunch( void *parm ) which receives the row data and then each thread just knows its own thread number. The guts of what you do in that dispatched thread however I can only guess at.
Just trying to help here, not sure if this is clear.
The task is to have 5 threads present at the same time, and the user assigns each a burst time. Then a round robin algorithm with a quantum of 2 is used to schedule the threads. For example, if I run the program with
$ ./m 1 2 3 4 5
The output should be
A 1
B 2
C 2
D 2
E 2
C 1
D 2
E 2
E 1
But for now my output shows only
A 1
B 2
C 2
Since the program errs where one thread does not end for the time being, I think the problem is that this thread cannot unlock to let the next thread grab the lock. My sleep() does not work, either. But I have no idea how to modify my code in order to fix them. My code is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
double times[5];
char process[] = {'A', 'B', 'C', 'D', 'E'};
int turn = 0;
void StartNext(int tid) //choose the next thread to run
{
int i;
for(i = (tid + 1) % 5; times[i] == 0; i = (i + 1) % 5)
if(i == tid) //if every thread has finished
return;
turn = i;
}
void *Run(void *tid) //the thread function
{
int i = (int)tid;
while(times[i] != 0)
{
while(turn != i); //busy waiting till it is its turn
if(times[i] > 2)
{
printf("%c 2\n", process[i]);
sleep(2); //sleep is to simulate the actual running time
times[i] -= 2;
}
else if(times[i] > 0 && times[i] <= 2) //this thread will have finished after this turn
{
printf("%c %lf\n", process[i], times[i]);
sleep(times[i]);
times[i] = 0;
}
StartNext(i); //choose the next thread to run
}
pthread_exit(0);
}
int main(int argc, char **argv)
{
pthread_t threads[5];
int i, status;
if(argc == 6)
{
for(i = 0; i < 5; i++)
times[i] = atof(argv[i + 1]); //input the burst time of each thread
for(i = 0; i < 5; i++)
{
status = pthread_create(&threads[i], NULL, Run, (void *)i); //Create threads
if(status != 0)
{
printf("While creating thread %d, pthread_create returned error code %d\n", i, status);
exit(-1);
}
pthread_join(threads[i], 0); //Join threads
}
}
return 0;
}
The program is directly runnable. Could anyone help me figure it out? Thanks!
Some things I've figured out reading your code:
1. At the beginning of the Run function, you convert tid (which is a pointer to void) directly to int. Shouldn't you dereference it?
It is better to make int turn volatile, so that the compiler won't make any assumptions about its value not changing.
When you call the function sleep the second time, you pass a parameter that has type double (times[i]), and you should pass an unsigned int parameter. A direct cast like (unsigned int) times[i] should solve that.
You're doing the pthread_join before creating the other threads. When you create thread 3, and it enters its busy waiting state, the other threads won't be created. Try putting the joins after the for block.