What does it mean to "join" a thread? - c

For my class I'm supposed to find out what's wrong with a piece of code, and the part I'm having trouble deciphering is
// joining a thread blocks until that thread finishes
a.join();
b.join();
Is joining a thread the same as locking a thread? Because I think the point of this assignment is you're not supposed to leave threads unlocked.

This is how one thread waits for the completion of another thread!
A common case where join matters: if the main() function/thread creates a thread and then simply returns without waiting for it (using join), the whole process exits and the newly created thread is stopped along with it!
Here is a nice explanation of Thread Management in general and Thread Join in particular! And here are some code snippets that show you some use cases of join and what happens when you don't use it!

Think of starting a thread as "forking" your process into two distinct threads of execution. Then, join is the reverse -- it's where these two separate threads join together (and only the parent continues from there).

The comment says it all, really. Joining a thread means to wait for it to complete. That is, block the current thread until another completes.

To join a thread means to wait until that thread has finished. When the thread exits, the thread calling join() will continue executing. Thus, in the above example, the thread (presumably the main thread) that calls a.join() and b.join() will wait until both threads a and b (in that order) finish their job, and then continue executing the code that comes after b.join().
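As a rough illustration of the waiting described above, here is a minimal sketch assuming POSIX threads, where the equivalent of a.join() is pthread_join(a, NULL). The names worker, struct job and run_two_and_join are made up for this example and are not from the assignment code.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical job: the worker adds 1..n into sum. */
struct job { int n; long sum; };

static void *worker(void *arg)
{
    struct job *j = arg;
    for (int i = 1; i <= j->n; i++)
        j->sum += i;
    return NULL;
}

/* Starts threads a and b, then blocks until both have finished.
   Returns 0 on success or a pthread error code. */
int run_two_and_join(long *sum_a, long *sum_b)
{
    struct job ja = { 100, 0 }, jb = { 10, 0 };
    pthread_t a, b;
    int rc;

    assert(sum_a != NULL && sum_b != NULL);

    if ((rc = pthread_create(&a, NULL, worker, &ja)) != 0)
        return rc;
    if ((rc = pthread_create(&b, NULL, worker, &jb)) != 0) {
        pthread_join(a, NULL);   /* don't leave a unjoined on the error path */
        return rc;
    }

    /* Blocks until a finishes, then until b finishes.
       If b is already done, the second call returns immediately. */
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    /* Both threads have terminated, so their results are safe to read. */
    *sum_a = ja.sum;
    *sum_b = jb.sum;
    return 0;
}
```

Note that nothing here is "locked": joining only waits for a thread to end, while locking (a mutex) protects shared data while threads are still running.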

join() waits for a thread to complete its execution.
You need to either detach() a thread or join() it in order to manage it.
join() also cleans up the resources occupied by the thread. You will find join() called in the destructor of an RAII class for the same reason.

Related

What happens if we don't join threads?

I create more than 100 threads from my main(), so I wanted to know: do I need to call pthread_join() before I exit my main()?
Also, I do not need the data generated by these threads, basically, all the threads are doing some job independent from main() and other threads.
pthread_join does two things:
1. Wait for the thread to finish.
2. Clean up any resources associated with the thread.
If you exit the process without joining, then (2) will be done for you by the OS (although it won't do thread cancellation cleanup, just nuke the thread from orbit), and (1) will not. So whether you need to call pthread_join depends whether you need (1) to happen.
If you don't need to wait for the thread to finish, then as everyone else is saying you may as well detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically if it does complete.
Yes: if a thread is joinable, then you must call pthread_join on it, otherwise it becomes a zombie thread.
Agree with answers above, just sharing a note from man page of pthread_join.
NOTES
After a successful call to pthread_join(), the caller is guaranteed that the target thread has terminated.
Joining with a thread that has previously been joined results in undefined behavior.
Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).
When you exit, you do not need to join because all other threads and resources will be automatically cleaned up. This assumes that you actually want all the threads to be killed when main exits.
If you don't need to join with a thread, you can create it as a "detached" thread by using pthread_attr_setdetachstate on the attributes before creating the thread. Detached threads cannot be joined, but they don't need to be joined either.
So,
If you want all threads to complete before the program finishes, joining from the main thread makes this work.
As an alternative, you can create the threads as detached, and return from main after all threads exit, coordinating using a semaphore or mutex+condition variable.
If you don't need all threads to complete, simply return from main. All other threads will be destroyed. You may also create the threads as detached threads, which may reduce resource consumption.
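The second option above (detached threads plus a mutex and condition variable) could look roughly like the following. This is only a sketch under assumptions: NUM_WORKERS, worker and run_detached_workers are invented names, and the "work" is left as a comment.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4   /* arbitrary count, for illustration only */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_done = PTHREAD_COND_INITIALIZER;
static int running;     /* workers started but not yet finished */
static int completed;   /* workers that reached the end of their job */

static void *worker(void *arg)
{
    (void)arg;
    /* ... independent work would go here ... */
    pthread_mutex_lock(&lock);
    completed++;
    running--;
    assert(running >= 0);
    if (running == 0)
        pthread_cond_signal(&all_done);   /* wake the waiting creator */
    pthread_mutex_unlock(&lock);
    return NULL;   /* detached: resources are released without a join */
}

/* Creates NUM_WORKERS detached threads and waits until all have reported
   completion. Returns the number completed, or -1 on error. */
int run_detached_workers(void)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return -1;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_mutex_lock(&lock);
    completed = 0;
    pthread_mutex_unlock(&lock);

    for (int i = 0; rc == 0 && i < NUM_WORKERS; i++) {
        pthread_t tid;   /* not stored: a detached thread is never joined */
        pthread_mutex_lock(&lock);
        running++;       /* count the worker before it can possibly finish */
        pthread_mutex_unlock(&lock);
        rc = pthread_create(&tid, &attr, worker, NULL);
        if (rc != 0) {
            pthread_mutex_lock(&lock);
            running--;   /* creation failed, so undo the count */
            pthread_mutex_unlock(&lock);
        }
    }
    pthread_attr_destroy(&attr);

    /* Stand-in for pthread_join: sleep until the last worker signals. */
    pthread_mutex_lock(&lock);
    while (running > 0)
        pthread_cond_wait(&all_done, &lock);
    int n = completed;
    pthread_mutex_unlock(&lock);

    return rc == 0 ? n : -1;
}
```

The counter is incremented by the creator rather than by the worker so that the wait cannot see zero before every worker has been counted.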
By default threads in pthreads library are created as joinable.
Threads may, however, detach, rendering them no longer joinable. Because threads consume system resources until joined, just as processes consume resources until their parent calls wait(), threads that you do not intend to join should be detached; this is good programming practice.
Of course once the main routine exits, all threading resources are freed.
If we fail to do that (detaching), then when the thread terminates it produces the thread equivalent of a zombie process. Aside from wasting system resources, if enough thread zombies accumulate, we won't be able to create additional threads.
By default a thread runs attached (joinable), which means the resources it needs are kept in use until the thread is joined.
From your description, no one but the thread itself needs the thread's resources, so you can either create the thread detached or detach it once it has been created.
To detach a thread after its creation, call pthread_detach().
Anyhow, if you want to make sure all threads are gone before the program ends, you should run the threads attached and join them before leaving the main thread (the program).
If you want to be sure that your threads have actually finished, you want to call pthread_join.
If you don't, then terminating your program will terminate all the unfinished threads abruptly.
That said, your main can wait a sufficiently long time before it exits. But then, how can you be sure that it is sufficient?
If your main ends, your application ends and your threads die... So you do need to use thread join (or use fork instead).

Close all threads, except the main

Is there a way to close all created threads if I don't have a list of their identifiers?
It is assumed that I only need the main thread, and the rest can be closed.
It's usually a good idea to have threads in charge of their own lifetime, periodically checking for some event indicating they should shut down. This usually makes the architecture of your code much easier to understand.
What I'm talking about is along the lines of (pseudo-code):
def main():
    # Start up all threads.
    synchronised runFlag = true
    for count = 1 to 10:
        start thread threadFn, receiving id[count]
    sleep for a bit
    # Tell them all to exit, then wait.
    synchronised runFlag = false
    for count = 1 to 10:
        wait for thread id[count] to exit
    exit program

def threadFn():
    initialise
    # Thread will do its stuff until told to stop.
    while synchronised runFlag:
        do something relatively quick
    exit thread
The periodic checking is a balance between efficiency of the thread loop and the amount of time you may have to wait for the thread to exit.
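For anyone wanting something closer to real code, the pseudo-code might translate to pthreads roughly as follows. This is a sketch, not a definitive implementation: it assumes a C11 compiler with <stdatomic.h>, uses an atomic_bool as the "synchronised" flag, and the names run_flag, thread_fn, short_pause and run_then_stop are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep under strict -std=c11 */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define MAX_THREADS 10

static atomic_bool run_flag;   /* plays the role of "synchronised runFlag" */

static void short_pause(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *thread_fn(void *arg)
{
    (void)arg;
    /* The thread does its stuff until told to stop. */
    while (atomic_load(&run_flag))
        short_pause(1);        /* stand-in for "something relatively quick" */
    return NULL;               /* returning ends the thread */
}

/* Starts n threads, lets them run briefly, asks them to stop and waits.
   Returns the number of threads that were started and joined. */
int run_then_stop(int n)
{
    pthread_t id[MAX_THREADS];
    int started = 0;

    assert(n > 0 && n <= MAX_THREADS);

    atomic_store(&run_flag, true);
    for (int i = 0; i < n; i++) {
        if (pthread_create(&id[started], NULL, thread_fn, NULL) != 0)
            break;
        started++;
    }

    short_pause(50);           /* "sleep for a bit" */

    /* Tell them all to exit, then wait for each one. */
    atomic_store(&run_flag, false);
    for (int i = 0; i < started; i++)
        pthread_join(id[i], NULL);

    return started;
}
```

The 1 ms pause inside the loop is the knob behind the trade-off mentioned above: a longer unit of work means fewer flag checks but a longer wait at shutdown.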
And, yes, I'm aware that pseudo-code uses identifiers (that you specifically stated you didn't have), but that's just one example of how to effect shutdown. You could equally, for example:
maintain a (synchronised) thread count incremented as a thread starts and decremented when it stops, then wait for it to reach zero;
have threads continue to run while a synchronised counter hasn't changed from the value it was when the thread started (you could just increment the counter in main then freely create a new batch of threads, knowing that the old ones would eventually disappear since the counter is different).
do one of a half dozen other things, depending on your needs :-)
This "lifetime handled by thread" approach is often the simplest way to achieve things since the thread is fully in control of when things happen to it. The one thing you don't want is a thread being violently killed from outside while it holds a resource lock of some sort.
Some threading implementations have ways to handle that with, for example, cancellability points, so you can cancel a thread from outside and it will die at such time it allows itself to. But, in my experience, that just complicates things.
In any case, pthread_cancel requires a thread ID so is unsuitable based on your requirements.
Is there a way to close all created threads if I don't have a list of their identifiers?
No, with POSIX threads there is not.
It is assumed that I only need the main thread, and the rest can be closed.
What you could do is have main() call fork() and let the calling main() (the parent) return, which will end the parent process along with all its threads.
The fork()ed off child process would live on as a copy of the original parent process' main() but without any other threads.
If you go this route, be aware that the threads of the process going down might very well run into undefined behaviour, so strange things might happen, including messy left-overs.
All in all a bad approach.
Is there a way to close all created threads if I don't have a list of their identifiers? It is assumed that I only need the main thread, and the rest can be closed.
Technically, you can fork your process and terminate the parent. Only the thread calling fork exists in the new child process. However, the mutexes locked by other threads remain locked and this is why forking a multi-threaded process without immediately calling exec may be unwise.

Can I close/terminate a running thread from its thread function?

I have created a thread, with a custom thread function. I have a condition in the thread function that if it becomes true, I want to close the thread from inside the thread function.
Is it possible?
You can simply return from the thread function, and if you want to return some value, another thread can collect it by calling pthread_join on that thread.
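A small sketch of that idea, assuming pthreads: the thread function returns as soon as its condition becomes true, and the joining thread picks up the result. The names struct search, count_until and smallest_root_at_least are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical task: find the smallest i with i*i >= limit. */
struct search { int limit; int answer; };

static void *count_until(void *arg)
{
    struct search *s = arg;
    for (int i = 0; ; i++) {
        if (i * i >= s->limit) {   /* the condition that ends this thread */
            s->answer = i;
            return s;              /* same effect as pthread_exit(s) */
        }
    }
}

/* Runs the search in its own thread and collects the value via join.
   Returns 0 on success or a pthread error code. */
int smallest_root_at_least(int limit, int *result)
{
    struct search s = { limit, -1 };
    pthread_t tid;
    void *ret = NULL;
    int rc;

    assert(result != NULL && limit >= 0 && limit <= 1000000);

    if ((rc = pthread_create(&tid, NULL, count_until, &s)) != 0)
        return rc;
    if ((rc = pthread_join(tid, &ret)) != 0)   /* ret receives the thread's return value */
        return rc;

    *result = ((struct search *)ret)->answer;
    return 0;
}
```

Returning a pointer to data owned by the creator avoids handing back the address of a local variable of the finished thread, which would be invalid after it exits.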
I assume you are using pthreads for the thread functionality. You can call the pthread_detach() function in your custom thread function after creating the thread. In the created thread, just returning from the thread function will be sufficient to close the thread and release all the resources associated with it.
For pthreads there are two ways to end a thread cleanly:
1. Detach the thread using pthread_detach(), then call pthread_exit() to end it. To find a thread's ID from inside the thread itself, use pthread_self().
2. Call pthread_exit() and have another thread call pthread_join() on the thread ID received when the thread was created.
If you fail to call pthread_join() on a thread that has not been detached with pthread_detach(), the resources in use by the thread will not be released, even after the thread has ended.
This could lead to a shortage on memory and/or other system resources. Take care this does not happen.
A third way to end a thread is to cancel it using pthread_cancel(). This typically isn't initiated by the thread itself (it could just use one of the two ways described above), but by another thread, in a situation where the thread to be ended is not aware of this and cannot be notified to stop on its own.
The need to cancel a thread should rarely arise, and if it does, one might start to rethink the program's design.
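For completeness, cancellation from another thread might look like the sketch below. It assumes the default deferred cancellation mode, where the request only takes effect at a cancellation point such as pthread_testcancel() or nanosleep(). The names on_cancel, spin and cancel_worker_demo are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep under strict -std=c11 */
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Cleanup handler: runs when the thread is cancelled, so it can release
   whatever it holds. Here it only records that it ran. */
static void on_cancel(void *arg)
{
    *(int *)arg = 1;
}

static void *spin(void *arg)
{
    pthread_cleanup_push(on_cancel, arg);
    for (;;) {
        struct timespec ts = { 0, 1000000L };   /* 1 ms */
        pthread_testcancel();                   /* explicit cancellation point */
        nanosleep(&ts, NULL);                   /* also a cancellation point */
    }
    pthread_cleanup_pop(0);   /* never reached, but push/pop must be paired */
    return NULL;
}

/* Starts a thread that never stops by itself, cancels it and joins it.
   Returns 1 if the thread ended through cancellation, 0 if it ended some
   other way, -1 on error. */
int cancel_worker_demo(int *cleaned_up)
{
    pthread_t tid;
    void *ret = NULL;

    assert(cleaned_up != NULL);
    *cleaned_up = 0;

    if (pthread_create(&tid, NULL, spin, cleaned_up) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;

    /* A cancelled thread still has to be joined (or detached). */
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return ret == PTHREAD_CANCELED ? 1 : 0;
}
```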

Do you have to wait for a child thread to finish before you leave the thread that started it?

So let's say you create a thread in main (thread 1). This thread takes in some input from a file and creates multiple other threads (thread 2...etc) to process something. Do you have to exit the other threads (thread 2...) before exiting thread 1? If so, how would I go about waiting for all the threads spawned by thread 1 to finish?
There are no parent/child relationships among threads. Threads are all peers. It makes no difference which thread started another thread, all the threads are equal parts of the process that contains them.
The special rule about calling pthread_exit from main only applies because returning from main terminates the process. There is no such concern with other threads -- they could only terminate the process by calling exit or a similar function.
Note that you should either join or detach each thread. You can detach all your threads and then you never have to worry about joining them -- they'll just run to completion and then clean themselves up.
No, you don't have to wait for the other threads to exit, in most situations. The whole point of threads is to start a sub-process of sorts that's largely independent of the thread that started it.
If you don't care how/when the thread will exit, though, you should usually detach the thread. Otherwise, it'll assume you care about its exit status, and it will sit there taking up resources -- even after it exits -- until some other thread joins it to retrieve the exit status.
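To make the "threads are peers" point concrete, here is a sketch in which thread 1 starts some workers and exits right away, and main (not thread 1) joins them later. Everything here is illustrative: struct shared, spawner, worker and spawn_and_collect are hypothetical names, and the workers just square a number.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 3

struct shared {
    pthread_t workers[NUM_WORKERS];
    int values[NUM_WORKERS];
    int started;
};

static void *worker(void *arg)
{
    int *slot = arg;
    *slot = *slot * *slot;   /* stand-in for processing some input */
    return NULL;
}

/* "Thread 1": starts the workers, then exits without waiting for them. */
static void *spawner(void *arg)
{
    struct shared *sh = arg;
    for (int i = 0; i < NUM_WORKERS; i++) {
        sh->values[i] = i + 1;
        if (pthread_create(&sh->workers[i], NULL, worker, &sh->values[i]) != 0)
            break;
        sh->started++;
    }
    return NULL;   /* the workers keep running after this thread ends */
}

/* Called from any thread (e.g. main). Returns the number of workers joined,
   or -1 if thread 1 could not be created. */
int spawn_and_collect(int results[NUM_WORKERS])
{
    struct shared sh = { .started = 0 };
    pthread_t t1;

    assert(results != NULL);

    if (pthread_create(&t1, NULL, spawner, &sh) != 0)
        return -1;
    pthread_join(t1, NULL);   /* after this, sh.workers and sh.started are stable */

    /* Any thread may join the workers; it need not be the one that made them. */
    for (int i = 0; i < sh.started; i++) {
        pthread_join(sh.workers[i], NULL);
        results[i] = sh.values[i];
    }
    return sh.started;
}
```

If nobody needed the results, the workers could instead be detached and thread 1 would not have to record their IDs at all.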

