My question is similar to this one. But after viewing all its answers, I still don't know what kind of safety guarantee one can get with pthread_cancel(). So I would like to ask a more specific question:
Say that pthread_cancel() is called on a pthread_t variable named my_thread. Is it possible that, by the time pthread_cancel(my_thread) is executed, the actual thread corresponding to my_thread has already terminated somehow and the kernel has recycled the value of my_thread for another newly created thread, so that executing pthread_cancel(my_thread) kills another, unintended thread?
The value can't be "recycled" until the thread is detached or joined. As long as you didn't do either of those things, it's safe to call pthread_cancel, even if the thread already terminated.
The question is about race conditions involving pthread_cancel(). POSIX requires that function to be thread safe in the specific, limited sense in which it uses that term, but that doesn't really speak to the question at hand. The key details are specified in XSH 2.9.2, as @R.. observed earlier in a comment. In particular:
The lifetime of a thread ID ends after the thread terminates if it was
created with the detachstate attribute set to PTHREAD_CREATE_DETACHED
or if pthread_detach() or pthread_join() has been called for that
thread. A conforming implementation is free to reuse a thread ID after
its lifetime has ended. If an application attempts to use a thread ID
whose lifetime has ended, the behavior is undefined.
So an implementation is permitted to reuse thread IDs whose lifetime has ended, but that's really a side issue, because if you attempt to use a stale one then the behavior is undefined, whether the ID has been reused or not. And of course, one of the innumerable possible manifestations of UB that could ensue in the case described is indeed that a different thread is cancelled than the one you meant to cancel, regardless of whether the thread ID has been reused.
The lifetime of a thread ID ends when the thread it identifies terminates if that thread was created detached, or when it is passed to either pthread_detach or pthread_join if the thread was created joinable. It is entirely possible to have a race between that and the execution of pthread_cancel. If the thread was created joinable then you need at least three threads total for that, but if it was created detached then you don't need any other than the one calling pthread_cancel and a separate one being cancelled. Either way pthread_cancel is risky.
The accepted answer to the question you linked is misleading, at best, but @DavidSchwartz's comment on it is much more useful, even if I don't think it accurately reflects the specification in every detail. Here is how I would put it:
It is safe to cancel a thread with pthread_cancel if one of these cases holds:
the thread was created joinable, and it is certain that it cannot have been detached or joined before the pthread_cancel call completes, or
the thread was created detached, and it is certain that it cannot have terminated, nor have been passed to pthread_join or pthread_detach (regardless of the success of these calls) before the pthread_cancel call completes.
It is not safe (i.e. it risks UB) to attempt to cancel
a thread that was created joinable, via the thread ID provided by pthread_create, if it is possible for that thread to be detached or joined before the pthread_cancel call completes, or
a thread that was created detached, if it is possible for that thread to terminate or have pthread_join or pthread_detach called on it before the pthread_cancel call completes.
It is unclear whether it is safe to cancel a thread that was created joinable and later detached, via a thread ID obtained from pthread_self() after the detachment, if it is certain that neither pthread_join nor pthread_detach can have been called on that thread ID before the pthread_cancel call completes.*
*One could interpret the specifications to imply that under those circumstances, pthread_self returns a thread ID whose lifetime has already ended, so that the cancellation would definitely produce UB. But there are at least a couple different contrary interpretations, and under any of those, there is no defined condition under which the lifetime of the thread ID from pthread_self ends before the end of the program, leaving it safe to cancel the thread via that ID at any time.
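As a minimal sketch of the first safe case listed above, assume a thread that is created joinable and is neither detached nor joined until after pthread_cancel has returned. The names worker, worker_token and cancel_then_join are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

int worker_token;   /* hypothetical value the worker returns if it is not cancelled */

/* Hypothetical worker with no cancellation points; it may already have
   finished by the time the cancellation request is issued. */
void *worker(void *arg)
{
    return arg;
}

/* Creates a joinable thread, requests cancellation, then joins it.
   Returns 0 on success and stores the thread's exit value in *result. */
int cancel_then_join(void **result)
{
    pthread_t tid;
    int rc;

    assert(result != NULL);

    /* NULL attributes: the thread is created joinable. */
    rc = pthread_create(&tid, NULL, worker, &worker_token);
    if (rc != 0)
        return rc;

    /* tid has been neither detached nor joined, so its lifetime has not
       ended and this call is well defined even if the worker has already
       terminated. The return value is deliberately not interpreted. */
    (void)pthread_cancel(tid);

    /* Joining ends the lifetime of tid; it must not be used afterwards. */
    return pthread_join(tid, result);
}
```

Depending on timing, the value stored by pthread_join is either PTHREAD_CANCELED or the worker's own return value. Both outcomes are defined behavior; what would not be defined is calling pthread_cancel(tid) again after the join.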
Right before exiting, I make the following calls from main(), in this order:
pthread_cancel() on the other threads that use mtx and are "waiting" (they are waiting on another condition variable and mutex; maybe that's the problem?)
pthread_cond_destroy(&cnd) (which is "coupled" with mtx)
pthread_mutex_unlock(&mtx)
pthread_mutex_destroy(&mtx)
However, the last call returns EBUSY. Whenever another thread uses the mutex, it releases it almost immediately. Also, as mentioned, I kill all those threads before trying to destroy the mutex.
Why is it happening?
As per man pthread_mutex_destroy:
The pthread_mutex_destroy() function may fail if:
EBUSY
The implementation has detected an attempt to destroy the object referenced by mutex while it is locked or referenced (for example,
while being used in a pthread_cond_timedwait() or pthread_cond_wait())
by another thread.
Make sure the mutex is not in use by another thread when you try to destroy it.
pthread_cancel() other threads uses mtx which are "waiting" (They are waiting for other cond_variable and mutex.
Cancellation runs asynchronously with respect to the cancelling thread; that is, pthread_cancel() might very well return before the thread being cancelled has ended.
As a result, resources (mutexes, condition variables, ...) used by that thread may still be in use if pthread_mutex_destroy() is called immediately afterwards.
The only way to test whether cancellation succeeded is to call pthread_join() on the cancelled thread and check that it stores PTHREAD_CANCELED in the location given by its second argument. This implies that the thread to be cancelled wasn't detached.
Here you see one possible issue with cancelling threads. There are others. Simply avoid all this by not using pthread_cancel(), and instead implement a proper design that ends all threads in a well-defined manner.
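One possible shape for such a design, sketched under assumptions: the waiting threads check a flag under the mutex, main() sets the flag and broadcasts, and the mutex and condition variable are destroyed only after every thread has been joined. The names stop_requested, waiter, NWAITERS and shutdown_cleanly are hypothetical; mtx and cnd stand in for the objects from the question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum { NWAITERS = 4 };          /* hypothetical number of waiting threads */

pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cnd = PTHREAD_COND_INITIALIZER;
bool stop_requested = false;    /* protected by mtx */

/* Waits on cnd until main() asks it to stop, then returns normally. */
void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    while (!stop_requested)
        pthread_cond_wait(&cnd, &mtx);
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Returns 0 if both objects were destroyed without error (no EBUSY). */
int shutdown_cleanly(void)
{
    pthread_t tids[NWAITERS];
    int started = 0;
    int rc;

    while (started < NWAITERS &&
           pthread_create(&tids[started], NULL, waiter, NULL) == 0)
        started++;

    /* Ask the waiters to leave instead of cancelling them. */
    pthread_mutex_lock(&mtx);
    stop_requested = true;
    pthread_cond_broadcast(&cnd);
    pthread_mutex_unlock(&mtx);

    /* Once every join has returned, no thread can still reference
       mtx or cnd. */
    for (int i = 0; i < started; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }

    rc = pthread_cond_destroy(&cnd);
    if (rc != 0)
        return rc;
    return pthread_mutex_destroy(&mtx);
}
```

The mutex is unlocked before the joins and is never held by main() when it is destroyed, which is the condition the EBUSY error in the question points at.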
I am using POSIX threads. My question is whether a thread can cancel itself by passing its own thread ID to pthread_cancel().
If yes, what are the implications?
Also, if the main program creates two threads and one of them cancels the other, what happens to the return value and the resources of the cancelled thread?
And how can the main program know which thread was cancelled, since it is not cancelling any of the threads itself?
I am using asynchronous cancellation.
Kindly help.
Q1: Yes, a thread can cancel itself. However, doing so has all of the negative consequences of cancellation in general; you probably want to use pthread_exit instead, which is somewhat more predictable.
Q2: When a thread has been cancelled, it doesn't get to generate a return value; instead, pthread_join will put the special value PTHREAD_CANCELED in the location pointed to by its retval argument. Unfortunately, you have to know by some other means that a specific thread has definitely terminated (in some fashion) before you call pthread_join, or the calling thread will block forever. There is no portable equivalent of waitpid(..., WNOHANG) nor of waitpid(-1, ...). (The manpage says "If you believe you need this functionality, you probably need to rethink your application design" which makes me want to punch someone in the face.)
Q2a: It depends what you mean by "resources of the thread". The thread control block and stack will be deallocated. All destructors registered with pthread_cleanup_push or pthread_key_create will be executed (on the thread, before it terminates); some runtimes also execute C++ class destructors for objects on the stack. It is the application programmer's responsibility to make sure that all resources owned by the thread are covered by one of these mechanisms. Note that some of these mechanisms have inherent race conditions; for instance, it is impossible to open a file and push a cleanup that closes it as an atomic action, so there is a window where cancellation can leak the open file. (Do not think this can be worked around by pushing the cleanup before opening the file, because a common implementation of deferred cancels is to check for them whenever a system call returns, i.e. exactly timed to hit the tiny gap between the OS writing the file descriptor number to the return-value register, and the calling function copying that register to the memory location where the cleanup expects it to be.)
Qi: you didn't ask this, but you should be aware that a thread with asynchronous cancellation enabled is officially not allowed to do anything other than pure computation. The behavior is undefined if it calls any library function other than pthread_cancel, pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED), or pthread_setcancelstate(PTHREAD_CANCEL_DISABLE).
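To make the Q2 and Q2a points concrete, here is a small sketch using deferred cancellation (the default), not the asynchronous mode from the question. It assumes that no cancellation point occurs between malloc() and pthread_cleanup_push(), so the pending request is only acted on inside sleep(). The names release_buffer, worker, cleanup_ran and cancel_and_collect are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

int cleanup_ran;    /* set by the cleanup handler; read after pthread_join */

/* Cleanup handler: runs on the cancelled thread before it terminates. */
void release_buffer(void *p)
{
    free(p);
    cleanup_ran = 1;
}

void *worker(void *arg)
{
    (void)arg;
    char *buf = malloc(64);
    if (buf == NULL)
        return NULL;

    pthread_cleanup_push(release_buffer, buf);
    for (;;)
        sleep(1);               /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);     /* not reached; must pair lexically with push */
    return NULL;
}

/* Cancels the worker and stores what pthread_join reports in *result. */
int cancel_and_collect(void **result)
{
    pthread_t tid;
    int rc;

    assert(result != NULL);

    rc = pthread_create(&tid, NULL, worker, NULL);
    if (rc != 0)
        return rc;

    rc = pthread_cancel(tid);   /* only a request; the thread may still run */
    assert(rc == 0);

    /* Blocks until the worker has run its cleanup handlers and terminated. */
    return pthread_join(tid, result);
}
```

After cancel_and_collect returns 0, *result compares equal to PTHREAD_CANCELED and the buffer has been released by the handler. Whichever thread performs the join is the one that learns the thread was cancelled.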
Q1. Yes, a thread can cancel itself.
Q2. If one thread cancels another thread, the cancelled thread's resources hang around until the main thread joins it with pthread_join() (if the thread is joinable). If the cancelled thread is never joined, its resources are freed when the program terminates.
Q3. I am not sure, but the main program doesn't know which thread was cancelled.
A thread can cancel any other thread (within the same process), including itself.
Threads do not have return values (in the general sense; they can only have a return status). The resources of the thread will be freed upon cancellation.
The main program can store the thread's handle and test whether it is valid or not.
I create more than 100 threads from my main(), so I just wanted to know whether I need to call pthread_join() before I exit main().
Also, I do not need the data generated by these threads, basically, all the threads are doing some job independent from main() and other threads.
pthread_join does two things:
Wait for the thread to finish.
Clean up any resources associated with the thread.
If you exit the process without joining, then (2) will be done for you by the OS (although it won't do thread cancellation cleanup, just nuke the thread from orbit), and (1) will not. So whether you need to call pthread_join depends whether you need (1) to happen.
If you don't need the thread to run, then as everyone else is saying you may as well detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically if it does complete.
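If you do need (1), a minimal sketch of joining every thread before leaving main() could look like the following. The names job, jobs_done, NTHREADS and run_and_join_all are hypothetical, and the counter is only there to make the effect of waiting observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { NTHREADS = 100 };

atomic_int jobs_done;   /* incremented once by each finished job */

/* Stand-in for an independent piece of work. */
void *job(void *arg)
{
    (void)arg;
    atomic_fetch_add(&jobs_done, 1);
    return NULL;
}

/* Starts up to NTHREADS joinable threads and joins every one that started.
   Returns the number of threads that were started. */
int run_and_join_all(void)
{
    pthread_t tids[NTHREADS];
    int started = 0;

    while (started < NTHREADS &&
           pthread_create(&tids[started], NULL, job, NULL) == 0)
        started++;

    /* Each join waits for one thread and releases its resources. */
    for (int i = 0; i < started; i++) {
        int rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    return started;
}
```

When run_and_join_all returns, every started job has completed, so jobs_done equals the returned count.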
Yes, if the thread is joinable then pthread_join is a must; otherwise it creates a zombie thread.
Agree with answers above, just sharing a note from man page of pthread_join.
NOTES
After a successful call to pthread_join(), the caller is guaranteed that the target thread has terminated.
Joining with a thread that has previously been joined results in undefined behavior.
Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when
enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).
When you exit, you do not need to join because all other threads and resources will be automatically cleaned up. This assumes that you actually want all the threads to be killed when main exits.
If you don't need to join with a thread, you can create it as a "detached" thread by using pthread_attr_setdetachstate on the attributes before creating the thread. Detached threads cannot be joined, but they don't need to be joined either.
So,
If you want all threads to complete before the program finishes, joining from the main thread makes this work.
As an alternative, you can create the threads as detached, and return from main after all threads exit, coordinating using a semaphore or mutex+condition variable.
If you don't need all threads to complete, simply return from main. All other threads will be destroyed. You may also create the threads as detached threads, which may reduce resource consumption.
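The detached alternative from the list above might be sketched as follows, assuming each worker decrements a shared counter under a mutex just before it returns and the caller waits on a condition variable until the counter reaches zero. The names lock, all_done, remaining, detached_job and run_detached are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t all_done = PTHREAD_COND_INITIALIZER;
int remaining;      /* workers that have not reported completion; protected by lock */

void *detached_job(void *arg)
{
    (void)arg;
    /* ... independent work would go here ... */
    pthread_mutex_lock(&lock);
    if (--remaining == 0)
        pthread_cond_signal(&all_done);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts up to n detached threads and waits until each started one has
   reported completion. Returns the number of threads started. */
int run_detached(int n)
{
    pthread_attr_t attr;
    pthread_t tid;
    int started = 0;

    assert(n > 0);

    pthread_mutex_lock(&lock);
    remaining = n;
    pthread_mutex_unlock(&lock);

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    while (started < n &&
           pthread_create(&tid, &attr, detached_job, NULL) == 0)
        started++;
    pthread_attr_destroy(&attr);

    pthread_mutex_lock(&lock);
    remaining -= n - started;           /* discount threads that never started */
    while (remaining > 0)
        pthread_cond_wait(&all_done, &lock);
    pthread_mutex_unlock(&lock);
    return started;
}
```

Note that the workers report completion slightly before they actually terminate, so this coordinates on "work finished" and not on thread termination itself; only pthread_join on a joinable thread gives the latter guarantee.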
By default, threads in the pthreads library are created as joinable.
Threads may, however, detach, rendering them no longer joinable. Because threads consume system resources until joined, just as processes consume resources until their parent calls wait(), threads that you do not intend to join must be detached, which is a good programming practice.
Of course once the main routine exits, all threading resources are freed.
If we fail to do that (detaching), then when the thread terminates it produces the thread equivalent of a zombie process. Aside from wasting system resources, if enough thread zombies accumulate, we won't be able to create additional threads.
By default a thread runs attached (joinable), which means the resources it needs are kept in use until the thread is joined.
From your description, no one but the thread itself needs the thread's resources, so you might create the thread detached, that is, set it to be detached before it is started.
To detach a thread after its creation, call pthread_detach().
In any case, if you want to make sure all threads are gone before the program ends, you should run the threads attached and join them before leaving the main thread (the program).
If you want to be sure that your threads have actually finished, you want to call pthread_join.
If you don't, then terminating your program will terminate all the unfinished threads abruptly.
That said, your main can wait a sufficiently long time before it exits. But then, how can you be sure that it is sufficient?
If your main ends, your application ends and your threads die... So you do need to use thread join (or use fork instead).
I am wondering if it is possible to check the status of a thread, which could possibly be in a waitable state but doesn't have to be and if it is in a waitable state I would like to leave it in that state.
Basically, how can I check the status of a thread without changing its (waitable) state.
By waitable, I mean if I called wait(pid) it would return properly and not hang.
Let me also add that I am tracing a multithreaded program, therefore I cannot change the code of it. Also, I omitted this information as well but this is a Linux-based system.
Are you asking about processes or threads? The wait function acts on processes, not threads, so your question as-written is not valid.
For (child) processes, you can check the state by calling waitid with the WNOWAIT flag. This will leave the process in a waitable state.
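A minimal sketch of the waitid approach for a child process, assuming a child created with fork() that exits with a hypothetical status of 7. The function name peek_then_reap is also hypothetical. The call below blocks until the child exits; adding WNOHANG to the options makes it non-blocking.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Peeks at a child's exit status without reaping it, then reaps it.
   Stores the peeked status in *peeked_status and returns the status
   reported by the final waitpid(), or -1 on any failure. */
int peek_then_reap(int *peeked_status)
{
    siginfo_t info;
    int status;
    pid_t pid;

    assert(peeked_status != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(7);               /* child: hypothetical exit status */

    /* WEXITED: report terminated children.
       WNOWAIT: leave the child in a waitable state afterwards. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;
    if (info.si_code != CLD_EXITED)
        return -1;
    *peeked_status = info.si_status;

    /* The child is still waitable, so an ordinary wait succeeds. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both the peek and the final waitpid() observe the same exit status, which shows that the first call did not consume the child's state.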
For threads, on some implementations you can call pthread_kill(thread, 0) and check for ESRCH to determine if the thread has exited or not, while leaving the thread in a joinable state. Note that this is valid only if the thread is joinable. If it was detached or already joined, you are invoking Undefined Behavior and your program may crash or worse. Unfortunately, there is no requirement that pthread_kill report ESRCH in this case, so it might falsely report that a thread still exists when in fact it already terminated. Of course, formally there is no difference between a thread that's sitting around forever between the call to pthread_exit and actual termination, and a thread that has actually finished terminating, so the question is a bit meaningless. In other words, there's no requirement that a joinable thread ever terminate until pthread_join is blocked waiting for it to terminate.
Do you want to do something like this (pseudo-code)?
if (status(my_thread) == waiting)
    do_something();
else
    do_something_else();
If that is indeed what you are trying to do, you are exposing yourself to race conditions. For example, what if my_thread wakes up after status(my_thread) but before do_something() (or even before == waiting)?
You might want to consider condition variables for safely communicating "status" between threads. A thread-safe queue might also be an option...
BTW, Lawrence Livermore National Laboratory has an excellent tutorial on multithreading concepts at https://computing.llnl.gov/tutorials/pthreads/ (including condition variables). This particular document uses the POSIX API, but the concepts it explains are universal.
I have two threads, communicating with each other; each thread employs 'while(1) ..'. Now I need to let the threads exit upon a specific condition met, and therefore finish the application.
My question: is it safe to just 'return (NULL)' from the thread, or do I have to use 'pthread_exit' or 'pthread_join' functions as well?
It is safe to return null from the thread functions; the code that waits for them should be OK.
POSIX says of pthread_exit():
An implicit call to pthread_exit() is made when a thread other than the thread in which main() was first invoked returns from the start routine that was used to create it.
You do need something to wait for the thread with pthread_join() unless the thread was created with the detached attribute or detached later with pthread_detach().
Calling pthread_exit(NULL) and returning NULL at the end of the thread's initial function should be equivalent. However, doing either of these alone will lead to a resource leak. To avoid that, you must either call pthread_join on the thread from another thread, or put the thread in the detached state by calling pthread_detach on it or setting it to start in the detached state before creating it.
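A short sketch of that equivalence, together with the join that avoids the leak. The names returns_value, exits_with_value and run_and_join are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Ends the thread by returning from its start routine. */
void *returns_value(void *arg)
{
    return arg;
}

/* Ends the thread by calling pthread_exit() explicitly. */
void *exits_with_value(void *arg)
{
    pthread_exit(arg);
    return NULL;    /* not reached */
}

/* Runs fn(arg) in a new joinable thread and joins it, which both waits
   for the thread and releases its resources. Stores the thread's exit
   value in *out and returns 0 on success. */
int run_and_join(void *(*fn)(void *), void *arg, void **out)
{
    pthread_t tid;
    int rc;

    assert(fn != NULL && out != NULL);

    rc = pthread_create(&tid, NULL, fn, arg);
    if (rc != 0)
        return rc;
    return pthread_join(tid, out);
}
```

Passing the same argument to both start routines yields the same value from pthread_join in each case. If the exit value is not needed, calling pthread_detach(tid) in place of the join is the other way to avoid the leak.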