Non-blocking pthread_join - c

I'm coding the shutdown of a multithreaded server.If everything goes as it should all the threads exit by their own, but there's a small chance that a thread gets stuck.In this case it would be convenient to have a non-blocking join so I could do.
Is there a way of doing a non-blocking pthread_join?
Some sort of timed join would be good too.
something like this:
foreach thread do
nb_pthread_join();
if still running
pthread_cancel();
I can think more cases where a a non-bloking join would be useful.
As it seems there is no such a function so I have already coded a workaround, but it's not as simple as I would like.

If you are running your application on Linux, you may be interested to know that:
int pthread_tryjoin_np(pthread_t thread, void **retval);
int pthread_timedjoin_np(pthread_t thread, void **retval,
const struct timespec *abstime);
Be careful, as the suffix suggests it, "np" means "non-portable". They are not POSIX standard, gnu extensions, useful though.
link to man page

The 'pthread_join' mechanism is a convenience to be used if it happens to do exactly what you want. It doesn't do anything you couldn't do yourself, and where it's not exactly what you want, code exactly what you want.
There is no real reason you should actually care whether a thread has terminated or not. What you care about is whether the work the thread was doing is completed. To tell that, have the thread do something to indicate that it is working. How you do that depends on what is ideal for your specific problem, which depends heavily on what the threads are doing.
Start by changing your thinking. It's not a thread that gets stuck, it's what the thread was doing that gets stuck.

If you're developing for QNX, you can use pthread_timedjoin() function.
Otherwise, you can create a separate thread that will perform pthread_join() and alert the parent thread, by signalling a semaphore for example, that the child thread completes. This separate thread can return what is gets from pthread_join() to let the parent thread determine not only when the child completes but also what value it returns.

As others have pointed out there is not a non-blocking pthread_join available in the standard pthread libraries.
However, given your stated problem (trying to guarantee that all of your threads have exited on program shutdown) such a function is not needed. You can simply do this:
int killed_threads = 0;
for(i = 0; i < num_threads; i++) {
int return = pthread_cancel(threads[i]);
if(return != ESRCH)
killed_threads++;
}
if(killed_threads)
printf("%d threads did not shutdown properly\n", killed_threads)
else
printf("All threads exited successfully");
There is nothing wrong with calling pthread_cancel on all of your threads (terminated or not) so calling that for all of your threads will not block and will guarantee thread exit (clean or not).
That should qualify as a 'simple' workaround.

The answer really depends on why you want to do this. If you just want to clean up dead threads, for example, it's probably easiest just to have a "dead thread cleaner" thread that loops and joins.

I'm not sure what exactly you mean, but I'm assuming that what you really need is a wait and notify mechanism.
In short, here's how it works: You wait for a condition to satisfy with a timeout. Your wait will be over if:
The timeout occurs, or
If the condition is satisfied.
You can have this in a loop and add some more intelligence to your logic. The best resource I've found for this related to Pthreads is this tutorial:
POSIX Threads Programming (https://computing.llnl.gov/tutorials/pthreads/).
I'm also very surprised to see that there's no API for timed join in Pthreads.

There is no timed pthread_join, but if you are waiting for other thread blocked on conditions, you can use timed pthread_cond_timed_wait instead of pthread_cond_wait

You could push a byte into a pipe opened as non-blocking to signal to the other thread when its done, then use a non-blocking read to check the status of the pipe.

Related

Check if pthread is still alive in Linux C

I know similar questions have been asked, but I think my situation is little bit different. I need to check if child thread is alive, and if it's not print error message. Child thread is supposed to run all the time. So basically I just need non-block pthread_join and in my case there are no race conditions. Child thread can be killed so I can't set some kind of shared variable from child thread when it completes because it will not be set in this case.
Killing of child thread can be done like this:
kill -9 child_pid
EDIT: alright, this example is wrong but still I'm sure there exists way to kill a specific thread in some way.
EDIT: my motivation for this is to implement another layer of security in my application which requires this check. Even though this check can be bypassed but that is another story.
EDIT: lets say my application is intended as a demo for reverse engineering students. And their task is to hack my application. But I placed some anti-hacking/anti-debugging obstacles in child thread. And I wanted to be sure that this child thread is kept alive. As mentioned in some comments - it's probably not that easy to kill child without messing parent so maybe this check is not necessary. Security checks are present in main thread also but this time I needed to add them in another thread to make main thread responsive.
killed by what and why that thing can't indicate the thread is dead? but even then this sounds fishy
it's almost universally a design error if you need to check if a thread/process is alive - the logic in the code should implicitly handle this.
In your edit it seems you want to do something about a possibility of a thread getting killed by something completely external.
Well, good news. There is no way to do that without bringing the whole process down. All ways of non-voluntary death of a thread kill all threads in the process, apart from cancellation but that can only be triggered by something else in the same process.
The kill(1) command does not send signals to some thread, but to a entire process. Read carefully signal(7) and pthreads(7).
Signals and threads don't mix well together. As a rule of thumb, you don't want to use both.
BTW, using kill -KILL or kill -9 is a mistake. The receiving process don't have the opportunity to handle the SIGKILL signal. You should use SIGTERM ...
If you want to handle SIGTERM in a multi-threaded application, read signal-safety(7) and consider setting some pipe(7) to self (and use poll(2) in some event loop) which the signal handler would write(2). That well-known trick is well explained in Qt documentation. You could also consider the signalfd(2) Linux specific syscall.
If you think of using pthread_kill(3), you probably should not in your case (however, using it with a 0 signal is a valid but crude way to check that the thread exists). Read some Pthread tutorial. Don't forget to pthread_join(3) or pthread_detach(3).
Child thread is supposed to run all the time.
This is the wrong approach. You should know when and how a child thread terminates because you are coding the function passed to pthread_create(3) and you should handle all error cases there and add relevant cleanup code (and perhaps synchronization). So the child thread should run as long as you want it to run and should do appropriate cleanup actions when ending.
Consider also some other inter-process communication mechanism (like socket(7), fifo(7) ...); they are generally more suitable than signals, notably for multi-threaded applications. For example you might design your application as some specialized web or HTTP server (using libonion or some other HTTP server library). You'll then use your web browser, or some HTTP client command (like curl) or HTTP client library like libcurl to drive your multi-threaded application. Or add some RPC ability into your application, perhaps using JSONRPC.
(your putative usage of signals smells very bad and is likely to be some XY problem; consider strongly using something better)
my motivation for this is to implement another layer of security in my application
I don't understand that at all. How can signal and threads add security? I'm guessing you are decreasing the security of your software.
I wanted to be sure that this child thread is kept alive.
You can't be sure, other than by coding well and avoiding bugs (but be aware of Rice's theorem and the Halting Problem: there cannot be any reliable and sound static source code program analysis to check that). If something else (e.g. some other thread, or even bad code in your own one) is e.g. arbitrarily modifying the call stack of your thread, you've got undefined behavior and you can just be very scared.
In practice tools like the gdb debugger, address and thread sanitizers, other compiler instrumentation options, valgrind, can help to find most such bugs, but there is No Silver Bullet.
Maybe you want to take advantage of process isolation, but then you should give up your multi-threading approach, and consider some multi-processing approach. By definition, threads share a lot of resources (notably their virtual address space) with other threads of the same process. So the security checks mentioned in your question don't make much sense. I guess that they are adding more code, but just decrease security (since you'll have more bugs).
Reading a textbook like Operating Systems: Three Easy Pieces should be worthwhile.
You can use pthread_kill() to check if a thread exists.
SYNOPSIS
#include <signal.h>
int pthread_kill(pthread_t thread, int sig);
DESCRIPTION
The pthread_kill() function shall request that a signal be delivered
to the specified thread.
As in kill(), if sig is zero, error checking shall be performed
but no signal shall actually be sent.
Something like
int rc = pthread_kill( thread_id, 0 );
if ( rc != 0 )
{
// thread no longer exists...
}
It's not very useful, though, as stated by others elsewhere, and it's really weak as any type of security measure. Anything with permissions to kill a thread will be able to stop it from running without killing it, or make it run arbitrary code so that it doesn't do what you want.

Will signals be delivered to a program blocked on POSIX semaphore?

This really is two questions, but I suppose it's better they be combined.
We're working on a client that uses asynchronous TCP connection. The idea is that the program will block until certain message is received from the server, which will invoke a SIGPOLL handler. We are using a busy waiting loop, basically:
var = 1
while (var) usleep(100);
//...and somewhere else
void sigpoll_handler(int signum){
......
var = 0;
......
}
We would like to use something more reliable instead, like a semaphore. The thing is, when a thread is blocked on a semaphore, will the signal get through still? Especially considering that signals get delivered when it switches back to user level; if the process is off the runqueue, how will it happen?
Side question (just out of curiosity):
Without the "usleep(100)" the program never progresses past the while loop, although I can verify the variable was set in the handler. Why is that? Printing changes its behaviour too.
Cheers!
[too long for a comment]
Accessing var from inside the signal handler invokes undefined behaviour (at least for a POSIX conforming system).
From the related POSIX specification:
[...] if the process is single-threaded and a signal handler is executed [...] the behavior is undefined if the signal handler refers to any object [...] with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t [...]
So var shall be defined:
volatile sig_atomic_t var;
The busy waiting while-loop, can be replaced by a single call to a blocking pause(), as it will return on reception of the signal.
From the related POSIX specification:
The pause() function shall suspend the calling thread until delivery of a signal whose action is either to execute a signal-catching function or to terminate the process.
Using pause(), btw, will make the use of any global flag like var redundant, to not say needless.
Short answer: yes, the signal will get through fine with a good implementation.
If you're going to be using a semaphore to control the flow of the program, you'll want to have the listening be on one child with the actual data processing be on another. This will then put the concurrency fairness in the hands of the OS which will make sure your signal listening thread gets a chance to check for a signal with some regularity. It shouldn't ever be really "off the runqueue," but cycling through positions on the runqueue instead.
If it helps you to think about it, what you have right now seems to basically be a a very rough implementation of a semaphore on its own -- a shared variable whose value will stop one block of code from executing until another code block clears it. There isn't anything inherently paralyzing about a semaphore on a system level.
I kind of wonder why whatever function you're using to listen for the SIGPOLL isn't doing its own blocking, though. Most of those utilities that I've seen will stop their calling thread until they return a value. Basically they handle the concurrency for you and you can code as if you were dealing with a normal synchronous program.
With regards to the usleep loop: I'd have to look at what the optimizer's doing, but I think there are basically two possibilities. I think it's unlikely, but it could be that the no-body loop is compiling into something that isn't actually checking for a value change and is instead just looping. More likely to me would be that the lack of any body steps is messing up the underlying concurrency handling, and the loop is executing so quickly that nothing else is getting a chance to run -- the queue is being flooded by loop iterations and your signal processsing can't get a word in edgewise. You could try just watching it for a few hours to see if anything changes; theoretically if it's just a concurrency problem then the random factor involved could clear the block on its own with a few billion chances.

How pthread returns the fastest result and terminates the slower ones?

I'm currently writing a program that the main thread is going to create three child threads. These threads are running simultaneously and what I want to do is once one of the child thread is done, I will check if the output is right. If it is, then terminate the other two threads; if not, then throw away this thread's result and wait for the other two threads' result.
I'm creating the three results in the main function with pthread_create. But I do not know how to use join function. If I use join function three times in the main function, it just waits one by one until the three threads are done.
My plan is like this:
int return_value;
main(){
pthread_create(&pid[0], NULL, fun0, NULL);
pthread_create(&pid[1], NULL, fun1, NULL);
pthread_create(&pid[2], NULL, fun2, NULL);
}
fun0(){
...
if( check the result is right ){
return_value = result;
if (pid[1] is running) pthread_kill( pid[1], SIGTERM );
if (pid[2] is running) pthread_kill( pid[2], SIGTERM );
}
fun1() ...
fun2() ...
function 0, 1, and 2 are similar to each other and once one function has the right answer, it will kill the other two threads. However, while running the program, once the pthread_kill is processed, the whole program is terminated, not just one thread. I don't know why.
And I still do not know if there are any other ways to code this program. Thanks for helping me out of this.
The pthread_kill() function is not designed to terminate threads, just like kill() is not designed to terminate processes. These functions just send signals, and their names are unfortunate byproducts of history. Certain signal handlers will cause the process to terminate. Using pthread_kill() allows you to select which thread handles a signal, but the signal handler will still do the exact same thing (e.g., terminate the process).
To terminate a thread, use pthread_cancel(). This will normally terminate the thread at the next cancellation point. Cancellation points are listed in the man page for pthread_cancel(), only certain functions like write(), sleep(), pthread_testcancel() are cancellation points.
However, if you set the cancelability type of the thread (with pthread_setcanceltype()) to PTHREAD_CANCEL_ASYNCHRONOUS, you can cancel the thread at any time. This can be DANGEROUS and you must be very careful. For example, if you cancel a thread in the middle of a malloc() call, you will get all sorts of nasty problems later on.
You will probably find it much easier to either test a shared variable every now and then, or perhaps even to use different processes which you can then just kill() if you don't need them any more. Canceling a thread is tricky.
Summary
Easiest option is to just test a variable in each thread to see if it should be canceled.
If this doesn't work, my next recommendation is to use fork() instead of pthread_create(), after which you can use kill().
If you want to play with fire, use asynchronous pthread_cancel(). This will probably explode in your face. You will have to spend hours of your precious time hunting bugs and trying to figure out how to do cleanup correctly. You will lose sleep and your cat will die from neglect.

Check thread status while leaving it in a waitable state

I am wondering if it is possible to check the status of a thread, which could possibly be in a waitable state but doesn't have to be and if it is in a waitable state I would like to leave it in that state.
Basically, how can I check the status of a thread without changing its (waitable) state.
By waitable, I mean if I called wait(pid) it would return properly and not hang.
Let me also add that I am tracing a multithreaded program, therefore I cannot change the code of it. Also, I omitted this information as well but this is a Linux-based system.
Are you asking about processes or threads? The wait function acts on processes, not threads, so your question as-written is not valid.
For (child) processes, you can check the state by calling waitid with the WNOWAIT flag. This will leave the process in a waitable state.
For threads, on some implementatiosn you can call pthread_kill(thread, 0) and check for ESRCH to determine if the thread has exited or not, while leaving thread in a joinable state. Note that this is valid only if the thread is joinable. If it was detached or already joined, you are invoking Undefined Behavior and your program should crash or worse. Unfortunately, there is no requirement that pthread_kill report ESRCH in this case, so it might falsely report that a thread still exists when in fact it already terminated. Of course, formally there is no difference between a thread that's sitting around forever between the call to pthread_exit and actual termination, and a thread that has actually finished terminating, so the question is a bit meaningless. In other words, there's no requirement that a joinable thread ever terminate until pthread_join is blocked waiting for it to terminate.
Do you want to do something like this (pseudo-code)?
if (status(my_thread) == waiting)
do_something();
else
do_something_else();
If that is indeed what you are trying to do, you are exposing yourself to race conditions. For example, what if my_thread wakes up after status(my_thread) but before do_something() (or even before == waiting)?
You might want to consider condition variables for safely communicating "status" between threads. Thread-safe queue might also be an option...
BTW, Lawrence Livermore National Laboratory has an excellent tutorial on multithreading concepts at https://computing.llnl.gov/tutorials/pthreads/ (including condition variables). This particular document uses POSIX API, but concepts that are explained are universal.

Kill Thread in Pthread Library

I use pthread_create(&thread1, &attrs, //... , //...); and need if some condition occured need to kill this thread how to kill this ?
First store the thread id
pthread_create(&thr, ...)
then later call
pthread_cancel(thr)
However, this not a recommended programming practice! It's better to use an inter-thread communication mechanism like semaphores or messages to communicate to the thread that it should stop execution.
Note that pthread_kill(...) does not actually terminate the receiving thread, but instead delivers a signal to it, and it depends on the signal and signal handlers what happens.
There are two approaches to this problem.
Use a signal: The thread installs a signal handler using sigaction() which sets a flag, and the thread periodically checks the flag to see whether it must terminate. When the thread must terminate, issue the signal to it using pthread_kill() and wait for its termination with pthread_join(). This approach requires pre-synchronization between the parent thread and the child thread, to guarantee that the child thread has already installed the signal handler before it is able to handle the termination signal;
Use a cancellation point: The thread terminates whenever a cancellation function is executed. When the thread must terminate, execute pthread_cancel() and wait for its termination with pthread_join(). This approach requires detailed usage of pthread_cleanup_push() and pthread_cleanup_pop() to avoid resource leakage. These last two calls might mess with the lexical scope of the code (since they may be macros yielding { and } tokens) and are very difficult to maintain properly.
(Note that if you have already detached the thread using pthread_detach(), you cannot join it again using pthread_join().)
Both approaches can be very tricky, but either might be specially useful in a given situation.
I agree with Antti, better practice would be to implement some checkpoint(s) where the thread checks if it should terminate. These checkpoints can be implemented in a number of ways e.g.: a shared variable with lock or an event that the thread checks if it is set (the thread can opt to wait zero time).
Take a look at the pthread_kill() function.
pthread_exit(0)
This will kill the thread.

Resources