What happens if we don't join threads? [duplicate] - c

I create more than 100 threads from my main(), so I wanted to know: do I need to call pthread_join() before I exit main()?
Also, I do not need the data generated by these threads; basically, all the threads are doing some job independent of main() and the other threads.

pthread_join does two things:
1. Wait for the thread to finish.
2. Clean up any resources associated with the thread.
If you exit the process without joining, then (2) will be done for you by the OS (although it won't do thread cancellation cleanup, just nuke the thread from orbit), and (1) will not. So whether you need to call pthread_join depends whether you need (1) to happen.
If you don't need the thread to run to completion, then as everyone else is saying you may as well detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically if it does complete.
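As a rough illustration of the two effects described above, here is a minimal sketch that starts a batch of joinable threads and joins every one of them. The names worker, run_and_join and MAX_WORKERS are made up for this example and do not come from the question.

```c
#include <pthread.h>

#define MAX_WORKERS 128   /* arbitrary upper bound for this sketch */

/* Hypothetical worker: records that it ran by setting its own slot. */
static void *worker(void *arg)
{
    int *slot = arg;
    *slot = 1;
    return NULL;
}

/* Starts n joinable threads, joins each one, and returns how many
 * completed their work. Returns -1 if n is out of range or if a
 * thread could not be created. */
int run_and_join(int n)
{
    pthread_t tids[MAX_WORKERS];
    int done[MAX_WORKERS] = {0};
    int started = 0, finished = 0, failed = 0;

    if (n < 0 || n > MAX_WORKERS)
        return -1;

    for (int i = 0; i < n; i++) {
        if (pthread_create(&tids[i], NULL, worker, &done[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* pthread_join both waits for the thread and releases its resources. */
    for (int i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) == 0)
            finished += done[i];
    }
    return failed ? -1 : finished;
}
```

Calling run_and_join(100) from main() before returning would guarantee that all 100 workers have finished; without the join loop, returning from main() could end the process while some of them are still running.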

Yes, if the thread is joinable then pthread_join is a must; otherwise it creates a zombie thread.
Agree with the answers above; just sharing a note from the man page of pthread_join.
NOTES
After a successful call to pthread_join(), the caller is guaranteed that the target thread has terminated.
Joining with a thread that has previously been joined results in undefined behavior.
Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).

When you exit, you do not need to join because all other threads and resources will be automatically cleaned up. This assumes that you actually want all the threads to be killed when main exits.
If you don't need to join with a thread, you can create it as a "detached" thread by using pthread_attr_setdetachstate on the attributes before creating the thread. Detached threads cannot be joined, but they don't need to be joined either.
So,
If you want all threads to complete before the program finishes, joining from the main thread makes this work.
As an alternative, you can create the threads as detached, and return from main after all threads exit, coordinating using a semaphore or mutex+condition variable.
If you don't need all threads to complete, simply return from main. All other threads will be destroyed. You may also create the threads as detached threads, which may reduce resource consumption.
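The detached alternative with a mutex and condition variable could look roughly like the sketch below. The counter remaining and the function names are assumptions made for this example; note that a worker reports completion just before it returns, so the wait tells you the work is done, not that the thread has fully terminated.

```c
#include <pthread.h>

/* Shared completion state; all names here are illustrative. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_done = PTHREAD_COND_INITIALIZER;
static int remaining;

static void *detached_worker(void *arg)
{
    (void)arg;
    /* ... independent work would go here ... */
    pthread_mutex_lock(&lock);
    if (--remaining == 0)
        pthread_cond_signal(&all_done);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts n detached threads and blocks until every thread that was
 * started has reported completion. Returns the number of threads
 * started, or -1 on setup failure. */
int run_detached_and_wait(int n)
{
    pthread_attr_t attr;
    int started = 0;

    if (n < 0 || pthread_attr_init(&attr) != 0)
        return -1;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_mutex_lock(&lock);
    remaining = n;
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < n; i++) {
        pthread_t tid;
        if (pthread_create(&tid, &attr, detached_worker, NULL) == 0) {
            started++;
        } else {
            /* This thread never ran, so take it off the count. */
            pthread_mutex_lock(&lock);
            remaining--;
            pthread_mutex_unlock(&lock);
        }
    }
    pthread_attr_destroy(&attr);

    /* Wait in a loop to guard against spurious wakeups. */
    pthread_mutex_lock(&lock);
    while (remaining > 0)
        pthread_cond_wait(&all_done, &lock);
    pthread_mutex_unlock(&lock);
    return started;
}
```

Compared with joining, this trades the per-thread pthread_t bookkeeping for one shared counter, which may be convenient when the threads are created in many places.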

By default, threads in the pthreads library are created as joinable.
Threads may, however, detach, rendering them no longer joinable. Because threads consume system resources until joined, just as processes consume resources until their parent calls wait(), threads that you do not intend to join should be detached; this is good programming practice.
Of course, once the main routine exits, all threading resources are freed.
If we neither join nor detach a thread, then when it terminates it becomes the thread equivalent of a zombie process. Aside from wasting system resources, if enough thread zombies accumulate, we won't be able to create additional threads.
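If you want to confirm the default for yourself, a small check along these lines should work; default_is_joinable is a hypothetical helper name, while the pthread_attr_* calls are the standard ones.

```c
#include <pthread.h>

/* Returns 1 if a freshly initialized attribute object is joinable,
 * 0 if it is detached, and -1 if the attribute calls fail. */
int default_is_joinable(void)
{
    pthread_attr_t attr;
    int state;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;

    rc = pthread_attr_getdetachstate(&attr, &state);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    return state == PTHREAD_CREATE_JOINABLE;
}
```

Passing NULL as the attribute argument of pthread_create uses these same default attributes, which is why such threads need to be joined or detached.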

By default a thread runs attached (joinable), which means the resources it needs are kept in use until the thread is joined.
From your description, no one but the thread itself needs the thread's resources, so you might create the thread detached, that is, mark it as detached before it is started.
To detach a thread after its creation call pthread_detach().
Anyhow if you want to make sure all threads are gone before the program ends, you should run the threads attached and join them before leaving the main thread (the program).
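Detaching after creation can be sketched as follows. The names background_job and start_detached_job are placeholders invented for this example.

```c
#include <pthread.h>

/* Hypothetical job whose result nobody needs. */
static void *background_job(void *arg)
{
    (void)arg;
    return NULL;
}

/* Creates a thread and detaches it right away. Returns 0 on success,
 * otherwise the error number from pthread_create or pthread_detach. */
int start_detached_job(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, background_job, NULL);
    if (rc != 0)
        return rc;

    /* From here on the thread must not be joined; its resources are
     * released automatically when it terminates. */
    return pthread_detach(tid);
}
```

It does not matter whether the thread has already finished by the time pthread_detach runs; a terminated but still joinable thread can be detached too.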

If you want to be sure that your threads have actually finished, you want to call pthread_join.
If you don't, then terminating your program will terminate all the unfinished threads abruptly.
That said, your main can wait a sufficiently long time before it exits. But then, how can you be sure that the wait is sufficient?

If your main ends your application ends and your threads die... So you do need to use thread join (or use fork instead).

Related

Use of pthread_join()

I am wondering, what can happen if we do a pthread_create without a pthread_join?
Who will "clean" all the memory of the "non-joined" thread?
When the process terminates, all resources associated with the process cease to exist. (This of course does not include shared resources the process created, like files in the filesystem, shared memory segments, etc.) Until then, unjoined threads will continue to consume resources, potentially causing future calls to pthread_create or even malloc to fail.
Well, assuming that it's an app-lifetime thread that does not need or try to explicitly terminate, the OS will do it when its process is terminated, (on all non-trivial OS).
If threads are created and never joined, then when the main thread returns from main, all other threads created there are stopped and so may not finish executing their code.
Look at the documentation of pthread_join.
It makes the calling thread (here, the main thread) suspend until the spawned thread completes execution.

Do you have to wait for a child thread to finish before you leave the thread that started it?

So let's say you create a thread in main (thread 1). This thread takes in some input from a file and creates multiple other threads (thread 2...etc) to process something. Do you have to exit the other threads (thread 2...) before exiting thread 1? If so, how would I go about waiting for all the threads spawned by thread 1 to finish?
There are no parent/child relationships among threads. Threads are all peers. It makes no difference which thread started another thread, all the threads are equal parts of the process that contains them.
The special rule about calling pthread_exit from main only applies because returning from main terminates the process. There is no such concern with other threads -- they could only terminate the process by calling exit or a similar function.
Note that you should either join or detach each thread. You can detach all your threads and then you never have to worry about joining them -- they'll just run to completion and then clean themselves up.
No, you don't have to wait for the other threads to exit, in most situations. The whole point of threads is to start a sub-process of sorts that's largely independent of the thread that started it.
If you don't care how/when the thread will exit, though, you should usually detach the thread. Otherwise, it'll assume you care about its exit status, and it will sit there taking up resources -- even after it exits -- until some other thread joins it to retrieve the exit status.
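For the opposite case, where you do care about the exit status, here is a minimal sketch of retrieving it with pthread_join. The square worker and join_for_result helper are invented for illustration, and the value is passed through the void * return, which assumes it fits in an intptr_t.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: returns the square of its argument as its
 * exit status, smuggled through the void * return value. */
static void *square(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Runs square(n) in a new thread and stores the thread's exit status
 * in *out. Returns 0 on success, otherwise the error number. */
int join_for_result(int n, long *out)
{
    pthread_t tid;
    void *ret;
    int rc = pthread_create(&tid, NULL, square, (void *)(intptr_t)n);
    if (rc != 0)
        return rc;

    /* Joining retrieves the exit status and releases the thread. */
    rc = pthread_join(tid, &ret);
    if (rc == 0)
        *out = (long)(intptr_t)ret;
    return rc;
}
```

Until some thread makes that pthread_join call, the exit status (and the bookkeeping around it) has to be kept alive, which is the resource cost mentioned above.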

What does it mean to "join" a thread?

For my class I'm supposed to find out what's wrong with a piece of code, and the part I'm having trouble deciphering is
// joining a thread blocks until that thread finishes
a.join();
b.join();
Is joining a thread the same as locking a thread? Because I think the point of this assignment is you're not supposed to leave threads unlocked.
This is how one thread waits for the completion of another thread!
A nice use case for join: say the main() function/thread creates a thread, doesn't wait for it to complete (using join), and simply exits. Then the newly created thread will also stop!
Here is a nice explanation of Thread Management in general and Thread Join in particular! And here are some code snippets that show you some use cases of join and what happens when you don't use it!
Think of starting a thread as "forking" your process into two distinct threads of execution. Then, join is the reverse -- it's where these two separate threads join together (and only the parent continues from there).
The comment says it all, really. Joining a thread means to wait for it to complete. That is, block the current thread until another completes.
To join a thread means to wait for as long as that thread is alive. When the thread exits, the thread calling join() will continue executing. Thus, in the above example, the thread (presumably the main thread) that is calling a.join() and b.join() will wait until both threads a and b (in that order) finish their job, and then continue executing the code that comes after b.join().
join() waits on a thread to complete its execution.
You need to either detach() a thread or join() a thread to manage it.
join() also cleans up the resources occupied by the thread. You will find join() called in the destructor of an RAII class for the same reason.

Detached vs. Joinable POSIX threads

I've been using the pthread library for creating & joining threads in C.
When should I create a thread as detached, right from the outset? Does it offer any performance advantage vs. a joinable thread?
Is it legal to not do a pthread_join() on a joinable (by default) thread? Or should such a thread always use the detach() function before pthread_exit()ing?
Create a detached thread when you know you won't want to wait for it with pthread_join(). The only performance benefit is that when a detached thread terminates, its resources can be released immediately instead of having to wait for the thread to be joined before the resources can be released.
It is 'legal' not to join a joinable thread; but it is not usually advisable because (as previously noted) the resources won't be released until the thread is joined, so they'll remain tied up indefinitely (until the program exits) if you don't join it.
When should I create a thread as detached, right from the outset?
Whenever the application doesn't care when that thread completes and doesn't care about its return value either (a thread may communicate a value back to another thread/application via pthread_exit).
For example, in a client-server application model, a server may create a new thread to process each request. But the server itself doesn't care about the thread's return value. In that case, it makes sense to create detached threads.
The only thing the server needs to ensure is that the requests currently being processed are completed. It can do so simply by exiting the main thread without exiting the whole program/application. When the last thread in the process exits, the application/program will naturally exit.
The pseudocode might look like:
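/* A server application */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Placeholder flag: cleared when the server should stop accepting requests. */
static volatile int not_done = 1;

/* Thread start routines must take and return void *. */
static void *process(void *arg)
{
    (void)arg;
    /* Detach self. */
    pthread_detach(pthread_self());
    /* process a client request. */
    pthread_exit(NULL);
}

int main(void)
{
    while (not_done) {
        pthread_t t_id;
        /* pthread_create returns the error number; it does not set errno. */
        int rc = pthread_create(&t_id, NULL, process, NULL);
        if (rc != 0)
            fprintf(stderr, "pthread_create: %s\n", strerror(rc));
    }
    /* There may be pending requests at this point. */
    /* Just exit the main thread - not the whole program - so that remaining
       requests that may still be processed can continue. */
    pthread_exit(NULL);
}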
Another example could be a daemon or logger thread that logs some information at regular intervals for as long as the application runs.
Does it offer any performance advantage vs. a joinable thread?
Performance-wise, there's no difference between joinable and detached threads. The only difference is that a detached thread's resources (such as the thread stack and any associated heap memory; exactly what constitutes those "resources" is implementation-specific) are released as soon as the thread terminates, rather than when it is joined.
Is it legal to not do a pthread_join() on a joinable (by default) thread?
Yes, it's legal to not join with a thread. pthread_join is just a convenience function that by no means has to be used unless you need it. But note that threads are created joinable by default.
An example when you might want to join is when threads do a "piece" of work that's split between them. In that case, you'd want to check all threads complete before proceeding. Task farm parallelism is a good example.
Or should such a thread always use the detach() function before pthread_exit()ing?
Not necessarily. But you'd often want to decide whether you want a joinable or a detached thread at the time of creation.
Note that while a detached thread can be created by setting the attribute PTHREAD_CREATE_DETACHED with a call to pthread_attr_setdetachstate, a thread can also decide to detach itself at any point in time, e.g. with pthread_detach(pthread_self()). Also, a thread that has the thread id (pthread_t) of another thread can detach it with pthread_detach(thread_id);.
