pthread_join success = thread entirely executed? - c

I have a question about pthreads whith this little C source:
int calc = 0;
void func(void* data){
calc = 2 * 2;
return NULL;
}
int main(){
pthread_t t;
if(0==pthread_create(&t,NULL,func,NULL)){
if(0==pthread_join(t,NULL)){
printf("Result: %d\n",calc); // 4 ?
}
}
}
If pthread_join return success, is "func" always executed entirely ? (calc always equal 4 on printf ?).

The function pthread_join returns zero on success.
The documentation says that pthread_join blocks until the thread ends, so, with some applied logic one can easily conclude that the thread has ended.
On the other side, pthread_join fails in different ways:
When the handle is invalid: EINVAL
When a deadlock has been detected: EDEADLK
There is another possible error (recommended by the open group, but depending on the implementation): ESRCH, when it detects that the thread handle is being used past the end of the thread.
If you want to know more you may want to take a look at the documentation.

Related

How collect thread exit status(using join) when cancelled

I am trying to cancel thread from caller or calle, but both are crashing the program
But if I join I am getting the exit status correct.
how to collect the exit status properly on pthread_cancel
man page says below
After a canceled thread has terminated, a join with that thread using
pthread_join(3) obtains PTHREAD_CANCELED as the thread's exit status.
(Joining with a thread is the only way to know that cancellation has
completed.)
#include <stdio.h>
#include <pthread.h>
void *thread_func(void *arg);
int errNum = 3;
int main()
{
pthread_t t_id;
void *status;
// on success pthread_create return zero
if(pthread_create(&t_id,NULL,thread_func,NULL) != 0){
printf("thread creation failed\n");
return 0;
}
printf("thread created with id %u successfully\n",t_id);
// status will be collecting the pthread_exit value
// error numberis returned incase of error
// pthread_cancel(t_id);
if(pthread_join(t_id,&status) != 0){
printf("join failed\n");
}
printf("thread %u exited with code %d\n", t_id, *(int *)status);
return 0;
}
void *thread_func(void *arg)
{
printf("Inside thread_func :%u\n",pthread_self());
//the arguments of pthread_exit should not be from local space, as it will be collected in caller using join
//pthread_exit(&errNum);
// if we return it may cause seg fault as we are trying to print the value from ptr(status)
//return ;
pthread_cancel(pthread_self());
}
If a thread is cancelled (before it has terminated normally), then when you join it, you will receive PTHREAD_CANCELED as the thread's return value / exit status. That macro expands to the actual void * value that is returned, so you can compare the value you receive directly to that to judge whether the thread was cancelled. It generally is not a valid pointer, so you must not try to dereference it.
Example:
void *status;
// ...
if (pthread_join(t_id, &status) != 0) {
// pthread_join failed
} else if (status == PTHREAD_CANCELED) {
// successfully joined a thread that was cancelled
// 'status' MUST NOT be dereferenced
} else {
// successfully joined a thread that terminated normally
// whether 'status' may be dereferenced or how else it may be
// used depends on the thread
}
It is worth noting that the wording of the Linux manual page is a bit fast and loose. Threads do not have an "exit status" in the sense that processes do, and the actual POSIX specifications do not use the term in the context of threads. For example, the POSIX specifications for pthread_join() say:
On return from a successful pthread_join() call with a non-NULL value_ptr argument, the value passed to pthread_exit() by the terminating thread shall be made available in the location referenced by value_ptr.
That's a bit of a mouthful compared to the Linux wording, but it is chosen to be very precise.
Note also that the choice of type void * here is intentional and useful. It is not merely an obtuse way to package an int. Through such a pointer, a thread can provide access to an object of any type, as may be useful for communicating information about the outcome of its computations. On the other hand, it is fairly common for threads to eschew that possibility and just return NULL. But if a thread did want to provide an integer code that way, then it would most likely provide an intvalue cast to type void *, rather than a pointer to an object of type int containing the chosen value. In that case, one would obtain the value by casting back to int, not by dereferencing the pointer.

Multithreaded program with mutex on mutual resource [duplicate]

This question already has an answer here:
Pthread_create() incorrect start routine parameter passing
(1 answer)
Closed 3 years ago.
I tried to build a program which should create threads and assign a Print function to each one of them, while the main process should use printf function directly.
Firstly, I made it without any synchronization means and expected to get a randomized output.
Later I tried to add a mutex to the Print function which was assigned to the threads and expected to get a chronological output but it seems like the mutex had no effect about the output.
Should I use a mutex on the printf function in the main process as well?
Thanks in advance
My code:
#include <stdio.h>
#include <pthread.h>
#include <errno.h>
pthread_t threadID[20];
pthread_mutex_t lock;
void* Print(void* _num);
int main(void)
{
int num = 20, indx = 0, k = 0;
if (pthread_mutex_init(&lock, NULL))
{
perror("err pthread_mutex_init\n");
return errno;
}
for (; indx < num; ++indx)
{
if (pthread_create(&threadID[indx], NULL, Print, &indx))
{
perror("err pthread_create\n");
return errno;
}
}
for (; k < num; ++k)
{
printf("%d from main\n", k);
}
indx = 0;
for (; indx < num; ++indx)
{
if (pthread_join(threadID[indx], NULL))
{
perror("err pthread_join\n");
return errno;
}
}
pthread_mutex_destroy(&lock);
return 0;
}
void* Print(void* _indx)
{
pthread_mutex_lock(&lock);
printf("%d from thread\n", *(int*)_indx);
pthread_mutex_unlock(&lock);
return NULL;
}
All questions of program bugs notwithstanding, pthreads mutexes provide only mutual exclusion, not any guarantee of scheduling order. This is typical of mutex implementations. Similarly, pthread_create() only creates and starts threads; it does not make any guarantee about scheduling order, such as would justify an assumption that the threads reach the pthread_mutex_lock() call in the same order that they were created.
Overall, if you want to order thread activities based on some characteristic of the threads, then you have to manage that yourself. You need to maintain a sense of which thread's turn it is, and provide a mechanism sufficient to make a thread notice when it's turn arrives. In some circumstances, with some care, you can do this by using semaphores instead of mutexes. The more general solution, however, is to use a condition variable together with your mutex, and some shared variable that serves as to indicate who's turn it currently is.
The code passes the address of the same local variable to all threads. Meanwhile, this variable gets updated by the main thread.
Instead pass it by value cast to void*.
Fix:
pthread_create(&threadID[indx], NULL, Print, (void*)indx)
// ...
printf("%d from thread\n", (int)_indx);
Now, since there is no data shared between the threads, you can remove that mutex.
All the threads created in the for loop have different value of indx. Because of the operating system scheduler, you can never be sure which thread will run. Therefore, the values printed are in random order depending on the randomness of the scheduler. The second for-loop running in the parent thread will run immediately after creating the child threads. Again, the scheduler decides the order of what thread should run next.
Every OS should have an interrupt (at least the major operating systems have). When running the for-loop in the parent thread, an interrupt might happen and leaves the scheduler to make a decision of which thread to run. Therefore, the numbers being printed in the parent for-loop are printed randomly, because all threads run "concurrently".
Joining a thread means waiting for a thread. If you want to make sure you print all numbers in the parent for loop in chronological order, without letting child thread interrupt it, then relocate the for-loop section to be after the thread joining.

How to handle errors occured inside a thread

I want to implement a very simple design with pthreads:
From this image you only need to know that I have one thread that is created with start() and is destroyed with stop(), and inside the thread, there is a function that loops infinitely until stop() is called.
This what I have (mutexes are omitted):
int running = 0;
pthread_t thread;
void* fn (){
while (1){
if (!running) break;
if (foo ()){
//Error, how should I handle it?
//The main thread is still not waiting with join()
}
}
pthread_exit (0);
}
int start (){
running = 1;
if (pthread_create (&thread, 0, fn, 0)){
return -1;
}
return 0;
}
int stop (){
running = 0;
if (pthread_join (thread, 0)){
return -1;
}
return 0;
}
Usage:
start ();
//At this point the infinite loop is running
//Let's say that in the second 1 something fails inside the loop
//How can I handle the error if join() is still not called?
sleep (3);
stop ();
One solution to this problem is to use a callback. The error is passed to the main thread via a function passed as paramater to the secondary thread. The big problem with this is that I'm converting the program into an asynchronous model, which I'd like to avoid for now.
I can save the error in a global variable and check it when stop() is called. More solutions?
I am not really sure that I understand your problem correctly, but for pthread_join your thread must not necessarily be running anymore. It is perfectly legal to join a dead thread. The only constraint is that you can only call join once for any thread.
You should then transfer the information why your thread stop through the return argument of the thread function, that is what it is meant for, and your stop thread will receive that through the second parameter of join.
Also
your thread function has an incorrect interface, this leads to undefined behavior. modern ABIs transfer function arguments in registers, and here the two sides may have a different vision of which registers are safe to use
you only need pthread_exit when you return from another function than the thread function itself. a normal return would do.

How to be notified when a thread has been terminated for some error

I am working on a program with a fixed number of threads in C using posix threads.
How can i be notified when a thread has been terminated due to some error?
Is there a signal to detect it?
If so, can the signal handler create a new thread to keep the number of threads the same?
Make the threads detached
Get them to handle errors gracefully. i.e. Close mutexs, files etc...
Then you will have no probolems.
Perhaps fire a USR1 signal to the main thread to tell it that things have gone pear shaped (i was going to say tits up!)
Create your threads by passing the function pointers to an intermediate function. Start that intermediate function asynchronously and have it synchronously call the passed function. When the function returns or throws an exception, you can handle the results in any way you like.
With the latest inputs you've provided, I suggest you do something like this to get the number of threads a particular process has started-
#include<stdio.h>
#define THRESHOLD 50
int main ()
{
unsigned count = 0;
FILE *a;
a = popen ("ps H `ps -A | grep a.out | awk '{print $1}'` | wc -l", "r");
if (a == NULL)
printf ("Error in executing command\n");
fscanf(a, "%d", &count );
if (count < THRESHOLD)
{
printf("Number of threads = %d\n", count-1);
// count - 1 in order to eliminate header.
// count - 2 if you don't want to include the main thread
/* Take action. May be start a new thread etc */
}
return 0;
}
Notes:
ps H displays all threads.
$1 prints first column where PID is displayed on my system Ubuntu. The column number might change depending on the system
Replace a.out it with your process name
The backticks will evaluate the expression within them and give you the PID of your process. We are taking advantage of the fact that all POSIX threads will have same PID.
I doubt Linux would signal you when a thread dies or exits for any reason. You can do so manually though.
First, let's consider 2 ways for the thread to end:
It terminates itself
It dies
In the first method, the thread itself can tell someone (say the thread manager) that it is being terminated. The thread manager will then spawn another thread.
In the second method, a watchdog thread can keep track of whether the threads are alive or not. This is done more or less like this:
Thread:
while (do stuff)
this_thread->is_alive = true
work
Watchdog:
for all threads t
t->timeout = 0
while (true)
for all threads t
if t->is_alive
t->timeout = 0
t->is_alive = false
else
++t->timeout
if t->timeout > THRESHOLD
Thread has died! Tell the thread manager to respawn it
If for any reason one could not go for Ed Heal's "just work properly"-approach (which is my favorite answer to the OP's question, btw), the lazy fox might take a look at the pthread_cleanup_push() and pthread_cleanup_pop() macros, and think about including the whole thread function's body in between such two macros.
The clean way to know whether a thread is done is to call pthread_join() against that thread.
// int pthread_join(pthread_t thread, void **retval);
int retval = 0;
int r = pthread_join(that_thread_id, &retval);
... here you know that_thread_id returned ...
The problem with pthread_join() is, if the thread never returns (continues to run as expected) then you are blocked. That's therefore not very useful in your case.
However, you may actually check whether you can join (tryjoin) as follow:
//int pthread_tryjoin_np(pthread_t thread, void **retval);
int retval = 0;
int r = pthread_tryjoin_np(that_thread_id, &relval);
// here 'r' tells you whether the thread returned (joined) or not.
if(r == 0)
{
// that_thread_id is done, create new thread here
...
}
else if(errno != EBUSY)
{
// react to "weird" errors... (maybe a perror() at least?)
}
// else -- thread is still running
There is also a timed join which will wait for the amount of time you specified, like a few seconds. Depending on the number of threads to check and if your main process just sits around otherwise, it could be a solution. Block on thread 1 for 5 seconds, then thread 2 for 5 seconds, etc. which would be 5,000 seconds per loop for 1,000 threads (about 85 minutes to go around all threads with the time it takes to manage things...)
There is a sample code in the man page which shows how to use the pthread_timedjoin_np() function. All you would have to do is put a for loop around to check each one of your thread.
struct timespec ts;
int s;
...
if (clock_gettime(CLOCK_REALTIME, &ts) == -1) {
/* Handle error */
}
ts.tv_sec += 5;
s = pthread_timedjoin_np(thread, NULL, &ts);
if (s != 0) {
/* Handle error */
}
If your main process has other things to do, I would suggest you do not use the timed version and just go through all the threads as fast as you can.

About pthread_kill() behavior

I have a question about pthread_kill() behavior.
Here's a small code I'm trying out:
void my_handler1(int sig)
{
printf("my_handle1: Got signal %d, tid: %lu\n",sig,pthread_self());
//exit(0);
}
void *thread_func1(void *arg)
{
struct sigaction my_action;
my_action.sa_handler = my_handler1;
my_action.sa_flags = SA_RESTART;
sigaction(SIGUSR1, &my_action, NULL);
printf("thread_func1 exit\n");
}
void *thread_func2(void *arg)
{
int s;
s = pthread_kill(tid1_g,SIGUSR1);
if(s)
handle_error(s,"tfunc2: pthread_kill");
printf("thread_func2 exit\n");
}
int main()
{
int s = 0;
pthread_t tid1;
s = pthread_create(&tid1,NULL,thread_func1,NULL);
if(s)
handle_error(s,"pthread_create1");
tid1_g = tid1;
printf("tid1: %lu\n",tid1);
s = pthread_join(tid1,NULL);
if(s)
handle_error(s, "pthread_join");
printf("After join tid1\n");
pthread_t tid3;
s = pthread_create(&tid3,NULL,thread_func2,NULL);
if(s)
handle_error(s,"pthread_create3");
s = pthread_join(tid3,NULL);
if(s)
handle_error(s, "pthread_join3");
printf("After join tid3\n");
return 0;
}
The output I'm getting is:
tid1: 140269627565824
thread_func1 exit
After join tid1
my_handle1: Got signal 10, tid: 140269627565824
thread_func2 exit
After join tid3
So, even though I'm calling pthread_kill() on a thread that has already finished, the handler for that thread is still getting called. Isn't pthread_kill() supposed to return error(ESRCH) in case the thread doesn't exist?
Any use (*) of the pthread_t for a thread after its lifetime (i.e. after pthread_join successfully returns, or after the thread terminates in the detached state) results in undefined behavior. You should only expect ESRCH if the pthread_t is still valid, i.e. if you haven't joined the thread yet. Otherwise all bets are off.
Note: By "use" (*), I mean passing it to a pthread_ function in the standard library. As far as I can tell, merely assigning it to another pthread_t variable or otherwise passing it around between your own functions without "using" it doesn't result in UB.
According this SO thread says that passing a signal to an already dead thread (Only if the thread was joined or exited ) results in undefined behavior!
EDIT: Found a thread which clearly quotes the latest POSIX spec which indicates the behavior to be undefined. Thanks R.. for the correct pointers!
The question asked here (How to determine if a pthread is still alive) has been marked as duplicate as this question.
But I believe this post just clarifies the behavior of pthread_kill and confirms that it does not guarantee the correct behavior if pthread_kill is called with the ID which is no more valid. Hence pthread_kill can not be used to know if thread is alive or not as if the thread was joined earlier, the ID would not have been valid or would have been re-used and same is the case if its been detached as the resources may have got reclaimed if thread was terminated.
So to determine if thread is alive (question is specifically asked for joinable threads), I could think of only one solution as below:
Use some global data/memory which can be accessed by both the threads and store the return/exit status of thread-whose-status-needs-to-be-determined there. Other threads can check this data/locatin to get its status. (Obviously this assumes that thread exited normally i.e either joined or detached).
For e.g:
Have a global bool named as "bTerminated" initialized with "FALSE" and in
the handler function of this thread either make it as "TRUE" before
returning or modify it once it is returned to the caller (i.e where you have
called `pthread_join` for this thread). Check for this variable in any other
threads where you want to know if this thread is alive. Probably it will be
straight to implement such a logic which fits into your original code.

Resources