Wrong thread IDs in a multithreaded C program?


I am new to multithreading in C and I had this question. I wrote the following code:
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
pthread_mutex_t m=PTHREAD_MUTEX_INITIALIZER;
pthread_attr_t attr;
void* test(void *a)
{
int i=*((int *)a);
printf("The thread %d has started.\n",i);
pthread_mutex_lock(&m);
sleep(1);
printf("The thread %d has finished.\n",i);
pthread_mutex_unlock(&m);
pthread_exit(NULL);
}
int main()
{
int i=0;
pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_JOINABLE);
pthread_t thread[5];
for (i=0;i<5;i++)
pthread_create(&thread[i],&attr,test,&i);
for (i=0;i<5;i++)
pthread_join(thread[i],NULL);
return 0;
}
Why do I get values like:
The thread 0 has started.
The thread 0 has started.
The thread 5 has started.
The thread 5 has started.
The thread 0 has started.
The thread 0 has finished.
The thread 0 has finished.
The thread 5 has finished.
The thread 5 has finished.
The thread 0 has finished.
or
The thread 1 has started.
The thread 2 has started.
The thread 5 has started.
The thread 4 has started.
The thread 0 has started.
The thread 1 has finished.
The thread 2 has finished.
The thread 5 has finished.
The thread 4 has finished.
The thread 0 has finished.
or even:
The thread 0 has started.
The thread 0 has started.
The thread 0 has started.
The thread 0 has started.
The thread 0 has started.
The thread 0 has finished.
The thread 0 has finished.
The thread 0 has finished.
The thread 0 has finished.
The thread 0 has finished.
etc, when I expected to get:
The thread 0 has started.
The thread 1 has started.
The thread 2 has started.
The thread 3 has started.
The thread 4 has started.
The thread 0 has finished.
The thread 1 has finished.
The thread 2 has finished.
The thread 3 has finished.
The thread 4 has finished.
Only when I put usleep(10) after pthread_create do I get somewhat "normal" values.
I compiled and ran this code in Code::Blocks on Unix.

You're passing the address of a variable (i) that the for loop keeps changing, so you're at the mercy of the scheduler. You should just pass a copy. A cheap, not completely kosher way is to pass the value itself in the pointer argument (the intptr_t casts need <stdint.h>):
pthread_create(&thread[i],&attr,test, (void*)(intptr_t)i);
/* ... */
int i = (int)(intptr_t)a;

Notice that you are passing in the address of i as a parameter to your threads:
pthread_create(&thread[i],&attr,test,&i);
This means that all of your threads will be reading the same variable i to determine which thread they are. That is, all five threads will look at the same variable to determine their thread number. Consequently, as the value of i increments in your for loop, all the threads will perceive their thread number changing to use the new value of i. This is why you're sometimes seeing 5 come up as the thread number, and also explains the fact that you're often skipping numbers or seeing far too many duplicates.
To fix this, you will need to give each thread their own copy of i. For example, you could do something like this:
int* myI = malloc(sizeof(int));
*myI = i;
pthread_create(&thread[i], &attr, test, myI);
And then have each thread free the pointer as soon as it has copied the value (malloc and free need <stdlib.h>):
void* test(void *a)
{
int i=*((int *)a);
free(a); /* must happen before pthread_exit, which never returns */
printf("The thread %d has started.\n",i);
pthread_mutex_lock(&m);
sleep(1);
printf("The thread %d has finished.\n",i);
pthread_mutex_unlock(&m);
pthread_exit(NULL);
}
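Alternatively, you could cast i to a void* and pass that in (going through intptr_t from <stdint.h> avoids size-mismatch warnings where int and pointers differ in width):
pthread_create(&thread[i],&attr,test, (void*)(intptr_t)i);
If you do this, you would then have the threads cast their arguments directly back to int, not to int*:
void* test(void *a)
{
int i = (int)(intptr_t)a;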
printf("The thread %d has started.\n",i);
pthread_mutex_lock(&m);
sleep(1);
printf("The thread %d has finished.\n",i);
pthread_mutex_unlock(&m);
pthread_exit(NULL);
}
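For reference, here is a minimal sketch of a third variant: give every thread its own slot in an array that outlives the threads. The names run_threads, ids and seen are made up for this illustration, the sleep is left out to keep it short, and NULL attributes are passed so the threads are joinable by default.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 5

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int seen[NUM_THREADS];   /* seen[k] counts the threads that reported id k */

static void *test(void *a)
{
    int i = *(int *)a;          /* reads this thread's own slot, never the loop variable */
    assert(i >= 0 && i < NUM_THREADS);
    printf("The thread %d has started.\n", i);
    pthread_mutex_lock(&m);
    seen[i]++;
    printf("The thread %d has finished.\n", i);
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Starts NUM_THREADS threads and waits for all of them; returns 0 on success. */
int run_threads(void)
{
    pthread_t thread[NUM_THREADS];
    int ids[NUM_THREADS];       /* one slot per thread; valid until the joins below */
    int started = 0, rc = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;
        if (pthread_create(&thread[i], NULL, test, &ids[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(thread[i], NULL);
    return rc;
}
```

Calling run_threads() from main should print every id from 0 to 4 exactly once per message, although the order of the lines still depends on the scheduler.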
Hope this helps!

Related

How can I use a parallel reduction approach to combine partial sums in C?

I have to compute partial sums using a parallel reduction approach in C, but I don't have any idea how to do it, so I need the community's guidance.
What I need to achieve: for example, with 8 computational threads, in the first reduction step each thread adds two elements of the array; thread 4 should wait for thread 0 to be done, thread 5 has to wait for thread 1, thread 6 waits for thread 2, and thread 7 waits for thread 3.
In the second step, thread 6 waits for thread 4 to finish, and thread 7 waits for thread 5 to finish.
In the third step, thread 7 waits for thread 6 to be done. Then I need to print the whole array.
Please give me some guidance on how to achieve this.

The Intel oneTBB library includes a "parallel_reduce" algorithm, which can be used directly for such a task.
https://spec.oneapi.io/versions/latest/elements/oneTBB/source/algorithms/functions/parallel_reduce_func.html
Since oneTBB only supports C++, if only C can be used, you could consider using OpenMP instead. It also needs toolchain support, though.
https://www.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top/model-threading-designs/add-parallelism-to-your-program/replace-annotations-with-openmp-code/openmp-reduction-operations.html

I'm confused about how to work out which computing thread a given thread should wait for: 2^r-i, where r = log(m).
There are at least two easy ways to do that:
save all thread-ids in a global pthread_t ptid[N]; array.
save all thread-ids in a local pthread_t ptid[N]; array, and pass the address of that array into each thread via the thread argument.
Example pseudo-code for the latter (error handling omitted for clarity):
struct Arg {
pthread_t *ptid; // Address of ptid[N].
int idx; // Index of this thread.
};
void *partial_sum(void *p)
{
struct Arg *arg = (struct Arg *)p;
int sum = 0;
... // Compute my portion of the partial sum.
int other_thread_idx = ...; // Compute index of the other thread
// this thread should join, -1 if none.
if (other_thread_idx >= 0) {
void *ret;
// Get the result from other thread. pthread_join stores a void*, so
// receive it in a void* and convert (intptr_t needs <stdint.h>).
pthread_join(arg->ptid[other_thread_idx], &ret);
int other_sum = (int)(intptr_t)ret;
printf("Thread %d joined thread %d which returned sum %d\n",
arg->idx, other_thread_idx, other_sum);
sum += other_sum;
}
printf("Thread %d, sum: %d\n", arg->idx, sum);
return (void*)(intptr_t)sum;
}
int main()
{
struct Arg args[N];
pthread_t ptid[N];
for (int i = 0; i < N; ++i) {
struct Arg* arg = &args[i];
arg->idx = i;
arg->ptid = &ptid[0];
pthread_create(&ptid[i], NULL, partial_sum, arg); // attributes come second, then the start routine
}
// Get the final result.
void *ret;
// Note: joining only the last thread -- all others have been joined
// already.
pthread_join(ptid[N - 1], &ret);
int sum = (int)(intptr_t)ret; // intptr_t needs <stdint.h>
printf("Sum: %d\n", sum);
return 0;
}
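A fuller sketch of the same idea, under some assumptions of my own: 8 threads, a stride-based schedule that matches the waiting pattern described in the question (4 waits for 0, then 6 waits for 4, then 7 waits for 6, and so on), and results handed over through a field in each thread's argument struct instead of through the thread return value. The names parallel_sum, begin, end and all are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define N 8                     /* number of threads; assumed to be a power of two */

struct Arg {
    pthread_t *ptid;            /* shared array of all thread ids */
    struct Arg *all;            /* shared array of all thread arguments */
    const int *data;            /* input array */
    size_t begin, end;          /* this thread sums data[begin..end) */
    int idx;                    /* index of this thread */
    long sum;                   /* result; valid once this thread has been joined */
};

static void *partial_sum(void *p)
{
    struct Arg *arg = p;
    long sum = 0;

    for (size_t k = arg->begin; k < arg->end; k++)
        sum += arg->data[k];

    /* Reduction steps with strides N/2, N/4, ..., 1: in the step with
       stride d, threads N-d .. N-1 each join the thread d places below. */
    for (int d = N / 2; d >= 1; d /= 2) {
        if (arg->idx >= N - d) {
            int other = arg->idx - d;
            pthread_join(arg->ptid[other], NULL);
            sum += arg->all[other].sum;   /* safe to read after the join */
        }
    }
    arg->sum = sum;
    return NULL;
}

/* Sums data[0..len) with N threads combined in a join tree. */
long parallel_sum(const int *data, size_t len)
{
    struct Arg args[N];
    pthread_t ptid[N];
    size_t chunk = (len + N - 1) / N;

    assert((N & (N - 1)) == 0);
    for (int i = 0; i < N; i++) {
        size_t b = (size_t)i * chunk, e = b + chunk;
        args[i] = (struct Arg){ .ptid = ptid, .all = args, .data = data,
                                .begin = b < len ? b : len,
                                .end = e < len ? e : len,
                                .idx = i, .sum = 0 };
        if (pthread_create(&ptid[i], NULL, partial_sum, &args[i]) != 0)
            abort();            /* a sketch: real code would recover more gracefully */
    }
    /* Only the last thread is left to join; it has joined the rest. */
    pthread_join(ptid[N - 1], NULL);
    return args[N - 1].sum;
}
```

Every thread only ever joins a thread with a lower index, which main has already created, so the entry it reads from ptid is always initialized. Each thread is also joined exactly once, which pthread_join requires.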

Pthread_join functionality in Linux C

Program:
#include<stdio.h>
#include<unistd.h>
#include<pthread.h>
void* pfun1(void *vargp);
void* pfun2(void *vargp);
int main(void){
int treturn,jreturn;
pthread_t tid1,tid2;
printf("Before thread call\n");
treturn = pthread_create(&tid1,NULL,pfun1,NULL);
treturn = pthread_create(&tid2,NULL,pfun2,NULL);
jreturn = pthread_join(tid1,NULL);
//jreturn = pthread_join(tid2,NULL);
printf("After thread call\n");
}
void* pfun1(void *vargp){
int i;
for(i=0;i<5;i++){
printf("Thread1: %d\n",i);
sleep(1);
}
return (void*)0;
}
void* pfun2(void *vargp){
int i;
for(i=5;i<10;i++){
printf("Thread2: %d\n",i);
sleep(1);
}
return (void*)0;
}
In the above program, I joined only the first thread to the main program using pthread_join(). The second thread is only created and not joined to main. But the output contains the output of the second thread too. How is it possible to get the output of the second thread even though it is not joined to main?
Output:
Before thread call
Thread2: 5
Thread1: 0
Thread2: 6
Thread1: 1
Thread2: 7
Thread1: 2
Thread2: 8
Thread1: 3
Thread2: 9
Thread1: 4
After thread call

Joining is about synchronization (after a join, the joined thread is definitely finished) and obtaining the return value of the thread (the (void*)0s you're returning in each case).
It has nothing to do with IO redirection. Threads share the same stdout/stdin (as well as other file descriptors and stdio buffers), and writes to (or reads from) those are immediate. They aren't postponed until the thread is joined.
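One way to see this for yourself is the sketch below, where the main thread reads what a thread wrote before joining it. A pipe stands in for stdout here so that the data can be inspected; the names writer and read_before_join are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

static void *writer(void *arg)
{
    int fd = *(int *)arg;
    const char msg[] = "hello";

    /* The write takes effect as soon as it is issued, joined or not. */
    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg)
        return arg;             /* any non-NULL value signals failure */
    return NULL;
}

/* Reads the thread's message into buf BEFORE joining the thread.
   Returns 0 on success, -1 on error. */
int read_before_join(char *buf, size_t buflen)
{
    int fd[2];
    pthread_t tid;
    void *ret = NULL;

    assert(buflen >= sizeof "hello");
    if (pipe(fd) != 0)
        return -1;
    if (pthread_create(&tid, NULL, writer, &fd[1]) != 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    ssize_t n = read(fd[0], buf, buflen);   /* blocks until the thread has written */
    pthread_join(tid, &ret);                /* only now wait for it to terminate */
    close(fd[0]);
    close(fd[1]);
    return (n > 0 && ret == NULL) ? 0 : -1;
}
```

The join at the end is only there to collect the return value and to make sure the thread is gone before the descriptors are closed; the data had already arrived.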

As far as I understand, pthread_join just waits inside the main function for tid1 to return; it does not prevent tid2 from producing output. So if you want tid2 to run after tid1, just reorder the lines:
treturn = pthread_create(&tid1,NULL,pfun1,NULL);
jreturn = pthread_join(tid1,NULL);
treturn = pthread_create(&tid2,NULL,pfun2,NULL);
I'm not a professional at this, so you may want to research a better solution.

pthread_create() Output [duplicate]

This is my first pthread program, and I have no idea why the printf statement gets printed twice in the child thread:
int x = 1;
void *func(void *p)
{
x = x + 1;
printf("tid %ld: x is %d\n", pthread_self(), x);
return NULL;
}
int main(void)
{
pthread_t tid;
pthread_create(&tid, NULL, func, NULL);
printf("main thread: %ld\n", pthread_self());
func(NULL);
}
Observed output on my platform (Linux 3.2.0-32-generic #51-Ubuntu SMP x86_64 GNU/Linux):
1.
main thread: 140144423188224
tid 140144423188224: x is 2
2.
main thread: 140144423188224
tid 140144423188224: x is 3
3.
main thread: 139716926285568
tid 139716926285568: x is 2
tid 139716918028032: x is 3
tid 139716918028032: x is 3
4.
main thread: 139923881056000
tid 139923881056000: x is 3
tid 139923872798464tid 139923872798464: x is 2
In case 3, there are two output lines from the child thread.
Case 4 is the same as 3, and the outputs are even interleaved.

Threading generally occurs by time-division multiplexing. It is generally inefficient for the processor to switch evenly between two threads, as this requires more effort and more context switching. Typically what you'll find is that a thread will run for a while before the processor switches, as is the case with examples 3 and 4: the child thread executes more than once before it is finally terminated (because the main thread exited).
Example 2: I don't know why x is increased by the child thread while there is no output.
Consider this. The main thread executes. It calls pthread_create and a new thread is created. The new child thread increments x. Before the child thread is able to complete the printf statement, the main thread kicks in and also increments x. The main thread, however, is also able to run the printf statement, and by then x is equal to 3.
The main thread now terminates (also causing the child thread to exit).
This is likely what happened in your case for example 2.
Example 3 shows that both threads modify the variable x without any locking.
Also, because you are using the global variable x, access to this variable is shared amongst the threads. This is bad, very bad, as threads accessing the same variable create race conditions and data corruption due to multiple reads and writes occurring on the same memory location for x.
It is for this reason that mutexes are used: they essentially create a lock while variables are being updated, to prevent multiple threads from attempting to modify the same variable at the same time.
Mutex locks will ensure that x is updated sequentially and not sporadically as in your case.
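As a small illustration of that last point, here is a sketch in which several threads increment one shared counter and a mutex keeps every increment intact. WORKERS, INCREMENTS and count_with_mutex are names chosen for this example, not anything from the question.

```c
#include <assert.h>
#include <pthread.h>

#define WORKERS 4
#define INCREMENTS 100000

static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *bump(void *unused)
{
    (void)unused;
    for (int k = 0; k < INCREMENTS; k++) {
        pthread_mutex_lock(&counter_lock);
        counter++;              /* read-modify-write, done by one thread at a time */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs WORKERS threads that each add INCREMENTS to the counter;
   returns the final counter value. */
long count_with_mutex(void)
{
    pthread_t tid[WORKERS];
    int started = 0;

    counter = 0;
    for (int i = 0; i < WORKERS; i++) {
        if (pthread_create(&tid[i], NULL, bump, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    assert(counter <= (long)WORKERS * INCREMENTS);
    return counter;
}
```

With the lock in place the result is WORKERS * INCREMENTS whenever all threads could be created. Without the lock/unlock pair, increments can be lost and the total will usually come out lower.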
Cheers,
Peter

Hmm. Your example uses the same "resources" from different threads. One resource is the variable x, the other one is the stdout file. So you should use mutexes as shown below. Also, a pthread_join at the end waits for the other thread to finish its job. (It is usually a good idea to check the return codes of all these pthread... calls as well.)
#include <pthread.h>
#include <stdio.h>
int x = 1;
pthread_mutex_t mutex;
void *func(void *p)
{
pthread_mutex_lock (&mutex);
x = x + 1;
printf("tid %ld: x is %d\n", pthread_self(), x);
pthread_mutex_unlock (&mutex);
return NULL;
}
int main(void)
{
pthread_mutex_init(&mutex, 0);
pthread_t tid;
pthread_create(&tid, NULL, func, NULL);
pthread_mutex_lock (&mutex);
printf("main thread: %ld\n", pthread_self());
pthread_mutex_unlock (&mutex);
func(NULL);
pthread_join (tid, 0);
}

It looks like the real answer is Michael Burr's comment which references this glibc bug: https://sourceware.org/bugzilla/show_bug.cgi?id=14697
In summary, glibc does not handle the stdio buffers correctly during program exit.

Wrong thread finish in C? [duplicate]

I compiled this code, and more than one thread reports itself as number 99. If I instead use a small range such as 1-10, the results are quite normal.
Here is the code.
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
pthread_mutex_t m=PTHREAD_MUTEX_INITIALIZER;
pthread_attr_t attr;
void* test(void *a)
{
int i=*((int *)a);
printf("The thread %d has started.\n",i);
pthread_mutex_lock(&m);
usleep(10000);
printf("The thread %d has finished.\n",i);
pthread_mutex_unlock(&m);
pthread_exit(NULL);
}
int main()
{
int i=0,j=0;
pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_JOINABLE);
pthread_t thread[100];
for (i=0;i<100;i++)
{
j=i;
pthread_create(&thread[i],&attr,test,&j);
}
for (i=0;i<100;i++)
pthread_join(thread[i],NULL);
return 0;
}
I get:
..../*Normal Results*/
The thread 99 has finished.
The thread 99 has finished.
The thread 99 has finished.
The thread 99 has finished.
The thread 99 has finished.
The thread 99 has finished.
Why is this happening?

You need to keep a separate index for each thread:
int indexes[100];
for (i=0;i<100;i++) {
indexes[i] = i;
pthread_create(&thread[i], &attr, test, &indexes[i]);
}

Each thread is passed the same pointer to the same stack location (j) in your main thread. Without further synchronisation, it's undefined when each thread will be scheduled, run, and access j before printing its value.
There are lots of ways you could print out a unique number from each thread, including:
1. malloc a struct which includes (or is) this number in the main thread. Pass it to the child threads, which are then responsible for freeing it.
2. (Suggested by Brian Roche below) Declare an array of 100 ints, with values 0, 1, 2, etc. Pass the address of a different array item to each thread.
3. Have each thread lock a mutex, then copy/increment a global counter. The mutex could be passed into the thread or be another global.
4. Pass a semaphore into each thread, signalling it once the number has been accessed. Wait on this semaphore in the main thread.
Note that options 3 & 4 involve serialising startup of the threads. There's little point in running multiple threads if you do much of this!
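The last option in the list can be sketched roughly as follows. Note that this version swaps the semaphore for a mutex plus condition variable, which does the same hand-off job; struct Handoff and start_numbered_threads are names invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>

#define COUNT 100

struct Handoff {
    pthread_mutex_t lock;
    pthread_cond_t copied_cv;
    int value;                  /* the number being handed to the new thread */
    int copied;                 /* set to 1 once the thread has taken its copy */
    int *out;                   /* out[k] counts the threads that received k */
};

static void *worker(void *p)
{
    struct Handoff *h = p;

    pthread_mutex_lock(&h->lock);
    int mine = h->value;        /* private copy; h->value may change afterwards */
    assert(mine >= 0 && mine < COUNT);
    h->out[mine]++;
    h->copied = 1;
    pthread_cond_signal(&h->copied_cv);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Starts COUNT threads, handing each one a distinct number; returns the
   number of threads that were started (and joined). */
int start_numbered_threads(int *out)
{
    struct Handoff h = { .value = 0, .copied = 0, .out = out };
    pthread_t tid[COUNT];
    int started = 0;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.copied_cv, NULL);
    for (int i = 0; i < COUNT; i++) {
        pthread_mutex_lock(&h.lock);
        h.value = i;
        h.copied = 0;
        if (pthread_create(&tid[i], NULL, worker, &h) != 0) {
            pthread_mutex_unlock(&h.lock);
            break;
        }
        started++;
        while (!h.copied)       /* wait until the new thread has its copy */
            pthread_cond_wait(&h.copied_cv, &h.lock);
        pthread_mutex_unlock(&h.lock);
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&h.copied_cv);
    pthread_mutex_destroy(&h.lock);
    return started;
}
```

As the note above says, this serialises thread startup: main does not create thread i+1 until thread i has taken its number.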

Every thread gets the same address, that of j in your main code, and each thread sleeps for 10 ms (usleep(10000)) while holding the mutex. So all threads have time to run up to the first mutex call, then they stall while the sleeps run one after another, and then they print that they have finished. Of course, j in the main loop is 99 by then, because the creation loop has finished.
You need to have an array of "j" values (assuming you want all threads to run independently), or do something else. It all depends on what you actually want to do.

You are passing the same memory location (&j) as the pointer to the data, a.
When the threads start, they print out the value just assigned to j, which looks OK.
But then only one gets the lock and goes to sleep, so all the others are blocked. When the nightmare is over, the memory location a points to, of course, holds the int value 99.

Pthread_join of one of a number of threads

My question is similar to "How do I check if a thread is terminated when using pthread?", but I did not quite get an answer there.
My problem is this: I create a certain number of threads, say n. As soon as main detects the exit of any one thread, it creates another thread, thus keeping the degree of concurrency at n, and so on.
How does the main thread detect the exit of a thread? pthread_join waits for a particular thread to exit, but in my case it can be any one of the n threads.
Thanks

Most obvious, without restructuring your code as aix suggests, is to have each thread set something to indicate that it has finished (probably a value in an array shared between all threads, one slot per worker thread), and then signal a condition variable. Main thread waits on the condition variable and each time it wakes up, handle all threads that have indicated themselves finished: there may be more than one.
Of course that means that if the thread is cancelled you never get signalled, so use a cancellation handler or don't cancel the thread.
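A rough sketch of that scheme, assuming four worker slots and workers that are never cancelled. SLOTS, finished and run_workers are names made up for the example, and the actual work is left as a comment.

```c
#include <assert.h>
#include <pthread.h>

#define SLOTS 4                 /* hypothetical degree of concurrency (n in the question) */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t finished_cv = PTHREAD_COND_INITIALIZER;
static int finished[SLOTS];     /* finished[s] == 1 once the worker in slot s is done */

static void *worker(void *p)
{
    int slot = *(int *)p;

    assert(slot >= 0 && slot < SLOTS);
    /* ... the real work would go here ... */
    pthread_mutex_lock(&lock);
    finished[slot] = 1;         /* mark this slot as done, then wake main */
    pthread_cond_signal(&finished_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs `total` workers, at most SLOTS at a time; returns how many were joined. */
int run_workers(int total)
{
    pthread_t tid[SLOTS];
    int slot_id[SLOTS];
    int busy[SLOTS] = {0};
    int started = 0, joined = 0;

    pthread_mutex_lock(&lock);
    while (joined < total) {
        /* Fill every free slot while there is work left. */
        for (int s = 0; s < SLOTS && started < total; s++) {
            if (busy[s])
                continue;
            slot_id[s] = s;
            finished[s] = 0;
            if (pthread_create(&tid[s], NULL, worker, &slot_id[s]) != 0) {
                total = started;    /* give up on the workers not yet started */
                break;
            }
            busy[s] = 1;
            started++;
        }
        if (joined == total)
            break;                  /* nothing is running and nothing is left */

        /* Sleep until at least one running worker has reported in. */
        for (;;) {
            int any = 0;
            for (int s = 0; s < SLOTS; s++)
                any |= busy[s] && finished[s];
            if (any)
                break;
            pthread_cond_wait(&finished_cv, &lock);
        }

        /* Reap every finished worker; there may be more than one. */
        for (int s = 0; s < SLOTS; s++) {
            if (busy[s] && finished[s]) {
                pthread_join(tid[s], NULL);
                busy[s] = 0;
                joined++;
            }
        }
    }
    pthread_mutex_unlock(&lock);
    return joined;
}
```

Joining while holding the lock is fine here because a worker never touches the lock again after it has set its flag; the join only waits for the thread to finish returning.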

There are several ways to solve this.
One natural way is to have a thread pool of fixed size n and have a queue into which the main thread would place tasks and from which the workers would pick up tasks and process them. This will maintain a constant degree of concurrency.
An alternative is to have a semaphore with the initial value set to n. Every time a worker thread is created, the value of the semaphore would need to be decremented. Whenever a worker is about to terminate, it would need to increment ("post") the semaphore. Now, waiting on the semaphore in the main thread will block until there's fewer than n workers left; a new worker thread would then be spawned and the wait resumed. Since you won't be using pthread_join on the workers, they should be detached (pthread_detach).

If you want to be informed of a thread exiting (via pthread_exit or cancellation), you can use a handler with pthread_cleanup_push to inform the main thread of the child exiting (via a condition variable, semaphore or similar) so it can either wait on it, or simply start a new one (assuming the child is detached first).
Alternatively, I'd suggest having the threads wait for more work (as suggested by aix), rather than ending.
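A minimal sketch of the cleanup-handler idea, assuming joinable threads and a simple counter as the notification. announce_exit and count_announced_exits are hypothetical names; half of the workers leave through pthread_exit and half through a normal return, to show that the handler runs either way.

```c
#include <assert.h>
#include <pthread.h>

#define WORKERS 4

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t exited_cv = PTHREAD_COND_INITIALIZER;
static int exited;              /* number of workers whose cleanup handler has run */
static int use_pthread_exit;    /* its address serves as a "call pthread_exit" flag */

static void announce_exit(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    exited++;
    pthread_cond_signal(&exited_cv);
    pthread_mutex_unlock(&lock);
}

static void *worker(void *arg)
{
    pthread_cleanup_push(announce_exit, NULL);
    if (arg != NULL)
        pthread_exit(NULL);     /* the handler runs as part of pthread_exit ... */
    pthread_cleanup_pop(1);     /* ... or here, because of the non-zero argument */
    return NULL;
}

/* Starts WORKERS threads, waits until each has announced its exit, then
   joins them; returns the number of announcements seen. */
int count_announced_exits(void)
{
    pthread_t tid[WORKERS];
    int started = 0, result;

    pthread_mutex_lock(&lock);
    exited = 0;
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < WORKERS; i++) {
        void *arg = (i % 2) ? (void *)&use_pthread_exit : NULL;
        if (pthread_create(&tid[i], NULL, worker, arg) != 0)
            break;
        started++;
    }

    pthread_mutex_lock(&lock);
    while (exited < started)    /* main learns of exits without calling join first */
        pthread_cond_wait(&exited_cv, &lock);
    result = exited;
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    assert(result == started);
    return result;
}
```

pthread_cleanup_push and pthread_cleanup_pop have to appear as a pair in the same block of the same function, which is why both sit directly in worker.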

If your parent thread needs to do other things, then it can't just constantly be blocking on pthread_join. You will need a way to send a message to the main thread from the child thread to tell it to call pthread_join. There are a number of IPC mechanisms that you could use for this.
When a child thread has done its work, it would send some sort of message to the main thread through IPC saying "I completed my job" and also pass its own thread id; then the main thread knows to call pthread_join on that thread id.

One easy way is to use a pipe as a communication channel between the (worker) threads and your main thread. When a thread terminates it writes its result (thread id in the following example) to the pipe. The main thread waits on the pipe and reads the thread result from it as soon as it becomes available.
Unlike a mutex or semaphore, a pipe file descriptor can be easily handled by the application's main event loop (such as libevent). Writes from different threads to the same pipe are atomic as long as each writes PIPE_BUF or fewer bytes (4096 on my Linux).
Below is a demo that creates ten threads each of which has a different life span. Then the main thread waits for any thread to terminate and prints its thread id. It terminates when all ten threads have completed.
$ cat test.cc
#include <iostream>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>
void* thread_fun(void* arg) {
// do something
unsigned delay = rand() % 10;
usleep(delay * 1000000);
// notify termination
int* thread_completed_fd = static_cast<int*>(arg);
pthread_t thread_id = pthread_self();
if(sizeof thread_id != write(*thread_completed_fd, &thread_id, sizeof thread_id))
abort();
return 0;
}
int main() {
int fd[2];
if(pipe(fd))
abort();
enum { THREADS = 10 };
time_t start = time(NULL);
// start threads
for(int n = THREADS; n--;) {
pthread_t thread_id;
if(pthread_create(&thread_id, NULL, thread_fun, fd + 1))
abort();
std::cout << time(NULL) - start << " sec: started thread " << thread_id << '\n';
}
// wait for the threads to finish
for(int n = THREADS; n--;) {
pthread_t thread_id;
if(sizeof thread_id != read(fd[0], &thread_id, sizeof thread_id))
abort();
if(pthread_join(thread_id, NULL)) // detached threads don't need this call
abort();
std::cout << time(NULL) - start << " sec: thread " << thread_id << " has completed\n";
}
close(fd[0]);
close(fd[1]);
}
$ g++ -o test -pthread -Wall -Wextra -march=native test.cc
$ ./test
0 sec: started thread 140672287479552
0 sec: started thread 140672278759168
0 sec: started thread 140672270038784
0 sec: started thread 140672261318400
0 sec: started thread 140672252598016
0 sec: started thread 140672243877632
0 sec: started thread 140672235157248
0 sec: started thread 140672226436864
0 sec: started thread 140672217716480
0 sec: started thread 140672208996096
1 sec: thread 140672208996096 has completed
2 sec: thread 140672226436864 has completed
3 sec: thread 140672287479552 has completed
3 sec: thread 140672243877632 has completed
5 sec: thread 140672252598016 has completed
5 sec: thread 140672261318400 has completed
6 sec: thread 140672278759168 has completed
6 sec: thread 140672235157248 has completed
7 sec: thread 140672270038784 has completed
9 sec: thread 140672217716480 has completed