Below is my C code to print an increasing global counter, one increment per thread.
#include <stdio.h>
#include <pthread.h>
static pthread_mutex_t pt_lock = PTHREAD_MUTEX_INITIALIZER;
int count = 0;
int *printnum(int *num) {
pthread_mutex_lock(&pt_lock);
printf("thread:%d ", *num);
pthread_mutex_unlock(&pt_lock);
return NULL;
}
int main() {
int i, *ret;
pthread_t pta[10];
for(i = 0; i < 10; i++) {
pthread_mutex_lock(&pt_lock);
count++;
pthread_mutex_unlock(&pt_lock);
pthread_create(&pta[i], NULL, (void *(*)(void *))printnum, &count);
}
for(i = 0; i < 10; i++) {
pthread_join(pta[i], (void **)&ret);
}
}
I want each thread to print one increment of the global counter, but the threads miss increments and two threads sometimes print the same value. How can I make the threads access the global counter sequentially?
Sample Output:
thread:2
thread:3
thread:5
thread:6
thread:7
thread:7
thread:8
thread:9
thread:10
thread:10
Edit
Blue Moon's answer solves this question. An alternative approach is available in MartinJames's comment.
A simple-but-useless way to ensure that thread1 prints 1, thread2 prints 2 and so on is to join each thread immediately after creating it:
pthread_create(&pta[i], NULL, printnum, &count);
pthread_join(pta[i], (void **)&ret);
But this totally defeats the purpose of multi-threading because only one can make any progress at a time.
Note that I removed the superfluous casts; for this to compile cleanly, the thread function must take a void * argument and return void *.
A saner approach is to pass the loop counter i by value, so that each thread prints a different value and you see threading in action: the numbers 1-10 may be printed in any order, but each thread prints a unique value.
Related
I wrote a simple C program in which every thread multiplies its index by 1000000 and adds the result to sum. I created 5 threads, so the logical answer would be (0+1+2+3+4)*1000000, which is 10000000, but it prints 14000000 instead. Could anyone help me understand this?
#include<pthread.h>
#include<stdio.h>
typedef struct argument {
int index;
int sum;
} arg;
void *fonction(void *arg0) {
((arg *) arg0) -> sum += ((arg *) arg0) -> index * 1000000;
}
int main() {
pthread_t thread[5];
int order[5];
arg a;
for (int i = 0; i < 5; i++)
order[i] = i;
a.sum = 0;
for (int i = 0; i < 5; i++) {
a.index = order[i];
pthread_create(&thread[i], NULL, fonction, &a);
}
for (int i = 0; i < 5; i++)
pthread_join(thread[i], NULL);
printf("%d\n", a.sum);
return 0;
}
The result is 14000000 because the behavior is undefined. The results will differ across machines and with other environmental factors. The undefined behavior arises because all threads access the same object (see the &a given to each thread), and that object is modified after the first thread is created.
When each thread runs, it accesses the same index (as a member of the same object, a). Thus the assumption that the threads will see [0,1,2,3,4] is incorrect: multiple threads likely see the same value of index (e.g. [0,2,4,4,4]; see footnote 1) when they run. This depends on how the threads are scheduled relative to the loop that creates them, since that loop also modifies the shared object.
When each thread updates sum, it has to read and write the same shared memory. This is inherently prone to race conditions and unreliable results. For example, it could be a lack of memory visibility (thread X doesn't see the value written by thread Y), or a conflicting thread schedule between the read and the write (thread X reads, thread Y reads, thread X writes, thread Y writes), etc.
If you create a separate arg object for each thread, both of these problems are avoided. While the sum issue can be fixed with appropriate locking, the index issue can only be fixed by not sharing the object given as the thread input.
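// create 5 arg objects, one for each thread
arg a[5];
for (int i = 0; i < 5; i++) {
    a[i].index = i;
    a[i].sum = 0;
    // give a DIFFERENT object to each thread
    pthread_create(&thread[i], NULL, fonction, &a[i]);
}
// after all threads complete (each join guarantees that thread is done)
int sum = 0;
for (int i = 0; i < 5; i++) {
    pthread_join(thread[i], NULL);
    sum += a[i].sum;
}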
[1] Even assuming that there is no race condition in the current execution with respect to the usage of sum, a sequence in which the threads see the index values [0,2,4,4,4], whose sum is 14, might look as follows:
a.index <- 0 ; create thread A
thread A reads a.index (0)
a.index <- 1 ; create thread B
a.index <- 2 ; create thread C
thread B reads a.index (2)
a.index <- 3 ; create thread D
a.index <- 4 ; create thread E
thread D reads a.index (4)
thread C reads a.index (4)
thread E reads a.index (4)
I have an array of 100 requests (integers). I want to create 4 threads that all run the same function (thread_function), and inside this function I want the threads to take the requests one by one:
(thread0->request0,
thread1->request1,
thread2->request2,
thread3->request3
and then thread0->request4 etc. up to 100), all of this using mutexes.
Here is the code I have written so far:
threadRes = pthread_create(&(threadID[i]), NULL,thread_function, (void *)id_size);
This is inside my main, in a loop that runs 4 times. Now, outside my main:
void *thread_function(void *arg){
int *val_p=(int *) arg;
for(i=0; i<200; i=i+2)
{
f=false;
for (j= 0; j<100; j++)
{
if (val_p[i]==cache[j].id)
f=true;
}
if(f==true)
{
printf("The request %d has been served.\n",val_p[i]);
}
else
{
cache[k].id=val_p[i];
printf("\nCurrent request to be served:%d \n",cache[k].id);
k++;
}
}
Where: val_p is the array with the requests and cache is an array of structs that stores the ids (requests).
So now I want mutexes to synchronize my threads. I considered using this inside my main:
pthread_join(threadID[0], NULL);
pthread_join(threadID[1], NULL);
pthread_join(threadID[2], NULL);
pthread_join(threadID[3], NULL);
pthread_mutex_destroy(&mutex);
and inside the function to use:
pthread_mutex_lock(&mutex);
pthread_mutex_unlock(&mutex);
Before I finish, I would like to say that so far the result of my program is that 4 threads run 100 requests each (400 in total), and what I want to achieve is that 4 threads run 100 requests in total.
Thanks for your time.
You need to use a loop that looks like this:
1. Acquire the lock.
2. See if there's any work to be done. If not, release the lock and terminate.
3. Mark the work that we're going to do as not needing to be done anymore.
4. Release the lock.
5. Do the work.
6. (If necessary) Acquire the lock. Mark the work done and/or report results. Release the lock.
7. Go to step 1.
Notice how while holding the lock, the thread discovers what work it should do and then prevents any other thread from taking the same assignment before it releases the lock. Note also that the lock is not held while doing the work so that multiple threads can work concurrently.
You may want to post more of your code. How the arrays are set up, how the segment is passed to the individual threads, etc.
Note that using printf will perturb the timing of the threads. printf takes its own lock to serialize access to stdout, so it's probably better to compile the calls out. Alternatively, give each thread its own logfile so the printf calls don't block against one another.
Also, in your thread loop, once you set f to true, you can issue a break as there's no need to scan further.
val_p[i] is loop invariant, so we can fetch that just once at the start of the i loop.
We don't see the definitions of k and cache, but you'd need to wrap the code that sets these values in a mutex lock/unlock pair.
But that alone does not protect against races in the for loop. You'd also have to wrap the fetch of cache[j].id in a mutex lock/unlock pair inside the loop. You might get away without the mutex inside the loop on some architectures with strong cache coherence (e.g. x86), but the C standard does not guarantee it.
You might be better off using stdatomic.h primitives. Here's a version that illustrates that. It compiles but I've not tested it:
#include <stdio.h>
#include <pthread.h>
#include <stdatomic.h>
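atomic_int k; // next free cache slot; atomic so the atomic_* calls are valid on it
#define true 1
#define false 0
struct cache {
atomic_int id; // atomic so concurrent readers and writers do not race
};
struct cache cache[100];
#ifdef DEBUG
#define dbgprt(_fmt...) \
printf(_fmt)
#else
#define dbgprt(_fmt...) \
do { } while (0)
#endif
void *
thread_function(void *arg)
{
int *val_p = arg;
int i;
int j;
int cval;
atomic_int *cptr;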
for (i = 0; i < 200; i += 2) {
int pval = val_p[i];
int f = false;
// decide if request has already been served
for (j = 0; j < 100; j++) {
cptr = &cache[j].id;
cval = atomic_load(cptr);
if (cval == pval) {
f = true;
break;
}
}
if (f == true) {
dbgprt("The request %d has been served.\n",pval);
continue;
}
// increment the global k value [atomically]
int kold = atomic_load(&k);
int knew;
while (1) {
knew = kold + 1;
if (atomic_compare_exchange_strong(&k,&kold,knew))
break;
}
// get current cache value
cptr = &cache[kold].id;
int oldval = atomic_load(cptr);
// mark the cache
// this should never loop because we atomically got our slot with
// the k value
while (1) {
if (atomic_compare_exchange_strong(cptr,&oldval,pval))
break;
}
dbgprt("\nCurrent request to be served:%d\n",pval);
}
return (void *) 0;
}
I have this main code that executes pthread_create with a function "doit" as one of its parameter. I have three doit functions where each of them has P and V placed differently or doesn't have P and V at all. My question is, how would each of the output differ? More specifically, what would be possible outputs for each of the doit function?
What I know so far is that P(&sem) will turn the sem value into 0 and V will turn the value into 1. However, I'm having a hard time interpreting how it'll affect the code.
What I think so far is that doit function #1 will result in
1
2
3
as printf and i=i+1 are well secured by P(&sem) and V(&sem).
Also, all the possible outputs with doit function #2 in my opinion are
1, 2, 3///
1, 3, 3///
2, 2, 3///
2, 3, 3///
3, 3, 3.
please correct me if I'm wrong.
However, I'm really not sure about what would happen with multiple threads when it comes to doit function #3 in terms of possible outputs. I'd appreciate any kind of help, thank you.
sem_t sem;
/* semaphore */
int main(){
int j;
pthread_t tids[3];
sem_init(&sem, 0,1);
for (j=0; j<3; j++) {
pthread_create(&tids[j], NULL, doit, NULL);
}
for (j=0; j<3; j++) {
pthread_join(tids[j], NULL);
}
return 0;
}
doit# 1.
int i = 0;
void *doit(void *arg){
P(&sem);
i = i + 1;
printf("%d\n", i);
V(&sem);
}
doit #2.
int i = 0;
void *doit(void *arg){
P(&sem);
i = i + 1;
V(&sem);
printf("%d\n", i);
}
doit #3.
int i = 0;
void *doit(void *arg){
i = i + 1;
printf("%d\n", i);
}
The first program will print
1
2
3
It is the only program where all accesses to i are properly guarded by P/V.
The behaviour of the 2 other programs is undefined, because the write in one thread might happen simultaneously with the read in other thread, and according to C11 5.1.2.4p25:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.
and C11 5.1.2.4p4
Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.
Oftentimes these kinds of assignments ask you to find a difference between #2 and #3. They're wrong: from the C perspective you've already opened Pandora's box, and anything is possible.
I created a program that does the addition of 8 numbers using 4 threads, and then the product of the results. How to ensure that each thread is using a separate core for maximum performance gains. I am new to pthreads so I really don't have any idea on how to use it properly. Please provide answers as simple as possible.
My code:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
int global[9];
void *sum_thread(void *arg)
{
int *args_array;
args_array = arg;
int n1,n2,sum;
n1=args_array[0];
n2=args_array[1];
sum = n1*n2;
printf("N1 * N2 = %d\n",sum);
return (void*) sum;
}
void *sum_thread1(void *arg)
{
int *args_array;
args_array = arg;
int n3,n4,sum2;
n3=args_array[2];
n4=args_array[3];
sum2=n3*n4;
printf("N3 * N4 = %d\n",sum2);
return (void*) sum2;
}
void *sum_thread2(void *arg)
{
int *args_array;
args_array = arg;
int n5,n6,sum3;
n5=args_array[4];
n6=args_array[5];
sum3=n5*n6;
printf("N5 * N6 = %d\n",sum3);
return (void*) sum3;
}
void *sum_thread3(void *arg)
{
int *args_array;
args_array = arg;
int n8,n7,sum4;
n7=args_array[6];
n8=args_array[7];
sum4=n7*n8;
printf("N7 * N8 = %d\n",sum4);
return (void*) sum4;
}
int main()
{
int sum3,sum2,sum,sum4;
int prod;
global[0]=9220; global[1]=1110; global[2]=1120; global[3]=2320; global[4]=5100; global[5]=6720; global[6]=7800; global[7]=9290;// the input
pthread_t tid_sum;
pthread_create(&tid_sum,NULL,sum_thread,global);
pthread_join(tid_sum,(void*)&sum);
pthread_t tid_sum1;
pthread_create(&tid_sum1,NULL,sum_thread1,global);
pthread_join(tid_sum1,(void*)&sum2);
pthread_t tid_sum2;
pthread_create(&tid_sum2,NULL,sum_thread2,global);
pthread_join(tid_sum2,(void*)&sum3);
pthread_t tid_sum3;
pthread_create(&tid_sum3,NULL,sum_thread3,global);
pthread_join(tid_sum3,(void*)&sum4);
prod=sum+sum2+sum3+sum4;
printf("The sum of the products is: %d", prod);
return 0;
}
You don't have to, don't want to, and shouldn't (I don't know whether you even can) manage hardware resources at such a low level. That's a job for your OS and partially for the standard libraries: they have been properly tested, optimized and standardized.
I doubt you can do better. If you do what you are describing, either you are an expert hardware/OS programmer or you are destroying decades of work :) .
Also consider this: if you could assign cores manually, your code would no longer be portable, since it would depend on the number of cores of your machine.
On the other hand, multithreaded programs should work (sometimes even better) on a single core. An example is a thread that has nothing to do until an event happens: you can put that thread to "sleep" so that only the other threads use the CPU, and it resumes when the event happens. A single-threaded program would generally poll instead, which burns CPU time doing nothing.
Also, as @yano said, your multithreaded program is not really parallel in this case, since you create each thread and then wait for it to finish with pthread_join before starting the next one.
I am a beginner in multithreaded programming. I have a homework assignment in which I should create some number of threads (in my case 10) and generate a random number inside each one.
When I debug my code (or add a 'sleep(1)' call) it works correctly; otherwise every thread generates the same value.
Please help me and explain anything you think I still haven't understood.
The code:
#include<stdio.h>
#include<pthread.h>
#include<time.h>
int rn, sharedArray[10];
pthread_mutex_t lock;
int randomgenerator(){
//sleep(1);
srand(time(NULL)); // The value of random depend on the time with some changes in the time library
int rn, MAX = 10; // between 0 and 10
rn = rand()%MAX;
return rn;
}
void *rgen(void *arg){
int order = (int*)arg;
//pthread_mutex_lock(&lock);
int rn = randomgenerator();
printf("%d,%d\n", order,rn); //not working without printf why????
sharedArray[order]= rn;
//pthread_mutex_unlock(&lock);
pthread_exit(NULL);
}
void main(int argc, char *argv[]){
pthread_mutex_init(&lock, NULL);
pthread_t th[10];
int i;
for (i = 0; i <10; i ++){
pthread_create(&(th[i]), 0, rgen, (void *)i);
pthread_join(th[i], NULL);
}
int j, result = 0;
for (j=0; j<10; j++){
result = sharedArray[j] + result;
}
printf("the final result is: %d\n",result);
pthread_exit(NULL);
pthread_mutex_destroy(&lock);
}
And the output is:
0,3
1,3
2,3
3,3
4,3
5,3
6,3
7,3
8,3
9,3
the final result is: 30
The output which I expect with uncommenting the sleep is:
0,7
1,4
2,1
3,8
4,5
5,2
6,9
7,6
8,3
9,0
the final result is: 45
The problem is this line:
srand(time(NULL));
The man page for rand/srand says:
The srand() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by rand(). These sequences are repeatable by calling srand() with the same seed value.
You set a new seed every time. That is wrong! You should only call srand once.
When you comment out the sleep, all threads run within the same second, so time(NULL) returns the same value and every thread sets the same seed. Each thread then gets the first random number of the same sequence, so they all get the same value.
Solution:
Move the call to srand to main, before you create any threads.
Note:
The function rand() is not reentrant or thread-safe, since it uses hidden state that is modified on each call.
That means that you cannot safely call rand simultaneously from multiple threads. You should either protect the call by using a mutex or semaphore, or switch to a reentrant function, like rand_r.