Detached vs. Joinable POSIX threads - c

I've been using the pthread library for creating & joining threads in C.
When should I create a thread as detached, right from the outset? Does it offer any performance advantage vs. a joinable thread?
Is it legal to not do a pthread_join() on a joinable (by default) thread? Or should such a thread always use the detach() function before pthread_exit()ing?

Create a detached thread when you know you won't want to wait for it with pthread_join(). The only performance benefit is that when a detached thread terminates, its resources can be released immediately instead of having to wait for the thread to be joined before the resources can be released.
It is 'legal' not to join a joinable thread; but it is not usually advisable because (as previously noted) the resources won't be released until the thread is joined, so they'll remain tied up indefinitely (until the program exits) if you don't join it.

When should I create a thread as detached, right from the outset?
Whenever the application doesn't care when that thread completes and doesn't care about its return value either (a thread may communicate a value back to another thread/the application via pthread_exit).
For example, in a client-server application model, a server may create a new thread to process each request. But the server itself doesn't care about the thread's return value. In that case, it makes sense to create detached threads.
The only thing the server needs to ensure is that the requests currently being processed are completed. It can do so simply by exiting the main thread (with pthread_exit) without exiting the whole program/application. When the last thread in the process exits, the application/program will naturally exit.
The pseudocode might look like:
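/* A server application */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

void *process(void *arg)
{
    /* Detach self so that nobody has to join this thread. */
    pthread_detach(pthread_self());
    /* process a client request. */
    return NULL;
}

int main(void)
{
    while (not_done) {  /* not_done: placeholder for the server's loop condition */
        pthread_t t_id;
        /* pthread_create returns an error number; it does not set errno. */
        int rc = pthread_create(&t_id, NULL, process, NULL);
        if (rc != 0)
            fprintf(stderr, "pthread_create: %s\n", strerror(rc));
    }
    /* There may be pending requests at this point. */
    /* Just exit the main thread - not the whole program - so that remaining
       requests that may still be processed can continue. */
    pthread_exit(NULL);
}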
Another example could be a daemon or logger thread that logs some information at regular intervals for as long as the application runs.
Does it offer any performance advantage vs. a joinable thread?
Performance-wise, there's no difference between joinable threads and detached threads. The only difference is that a detached thread's resources (such as the thread stack and any associated heap memory, and so on; exactly what constitutes those "resources" is implementation-specific) are released as soon as the thread terminates, without waiting for another thread to join it.
Is it legal to not do a pthread_join() on a joinable (by default) thread?
Yes, it's legal to not join with a thread. pthread_join is just a convenience function that by no means has to be used unless you need it. But note that threads are created joinable by default.
An example when you might want to join is when threads do a "piece" of work that's split between them. In that case, you'd want to check all threads complete before proceeding. Task farm parallelism is a good example.
Or should such a thread always use the detach() function before pthread_exit()ing?
Not necessarily. But you'd often want to decide whether you want a joinable or a detached thread at the time of creation.
Note that while a detached thread can be created by setting the attribute PTHREAD_CREATE_DETACHED with a call to pthread_attr_setdetachstate, a thread can also decide to detach itself at any point in time, e.g. with pthread_detach(pthread_self()). Also, a thread that has the thread id (pthread_t) of another thread can detach that thread with pthread_detach(thread_id);.

Related

What happens if we don't join threads?

I create more than 100 threads from my main(), so I wanted to know whether I need to call pthread_join() before I exit my main().
Also, I do not need the data generated by these threads, basically, all the threads are doing some job independent from main() and other threads.
pthread_join does two things:
1. Wait for the thread to finish.
2. Clean up any resources associated with the thread.
If you exit the process without joining, then (2) will be done for you by the OS (although it won't do thread cancellation cleanup, just nuke the thread from orbit), and (1) will not. So whether you need to call pthread_join depends whether you need (1) to happen.
If you don't need the thread to run, then as everyone else is saying you may as well detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically if it does complete.
Yes, if the thread is joinable then pthread_join is a must; otherwise it creates a zombie thread.
Agree with answers above, just sharing a note from man page of pthread_join.
NOTES
After a successful call to pthread_join(), the caller is guaranteed that the target thread has terminated.
Joining with a thread that has previously been joined results in undefined behavior.
Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).
When you exit, you do not need to join because all other threads and resources will be automatically cleaned up. This assumes that you actually want all the threads to be killed when main exits.
If you don't need to join with a thread, you can create it as a "detached" thread by using pthread_attr_setdetachstate on the attributes before creating the thread. Detached threads cannot be joined, but they don't need to be joined either.
So,
If you want all threads to complete before the program finishes, joining from the main thread makes this work.
As an alternative, you can create the threads as detached, and return from main after all threads exit, coordinating using a semaphore or mutex+condition variable.
If you don't need all threads to complete, simply return from main. All other threads will be destroyed. You may also create the threads as detached threads, which may reduce resource consumption.
By default threads in pthreads library are created as joinable.
Threads may, however, detach, rendering them no longer joinable. Because threads consume system resources until joined, just as processes consume resources until their parent calls wait(), threads that you do not intend to join should be detached; doing so is good programming practice.
Of course once the main routine exits, all threading resources are freed.
If we fail to do that (detaching), then when the thread terminates it produces the thread equivalent of a zombie process. Aside from wasting system resources, if enough thread zombies accumulate, we won't be able to create additional threads.
By default a thread runs attached (joinable), which means the resources it needs are kept in use until the thread is joined.
From your description, no one but the thread itself needs the thread's resources, so you might create the thread detached, i.e. set the detached state in its attributes before starting it.
To detach a thread after its creation call pthread_detach().
Anyhow, if you want to make sure all threads are gone before the program ends, you should run the threads attached and join them before leaving the main thread (the program).
If you want to be sure that your threads have actually finished, you want to call pthread_join.
If you don't, then terminating your program will terminate all the unfinished threads abruptly.
That said, your main can wait a sufficiently long time before it exits. But then, how can you be sure that the wait is sufficient?
If your main ends, your application ends and your threads die... So you do need to use thread join (or use fork instead).

Multithreading in C/C++ without waiting for the thread to finish

All the examples that I have seen about multithreading use this call in the main function to wait until the thread is done:
pthread_join(thread_id, NULL);
But what if I don't want it to wait? I want my main function to continue while the thread is doing its work, but at the same time, I don't want main to exit before the thread exits. Is this possible in C/C++?
If you want to avoid using pthread_join(), then pthread_detach() is an option.
From man-page:
int pthread_detach(pthread_t thread);
The pthread_detach() function marks the thread identified by thread
as detached. When a detached thread terminates, its resources are
automatically released back to the system without the need for
another thread to join with the terminated thread.
The detached attribute merely determines the behavior of the system
when the thread terminates; it does not prevent the thread from being
terminated if the process terminates using exit(3) (or equivalently,
if the main thread returns).

Group together two or more threads

I have a multi-threaded application where each thread has a helper thread that helps the first one to accomplish a task. I would like the helper thread to be terminated as well when its thread terminates (likely by calling exit).
I know that it is possible to use exit_group, but this system call kills all threads in the same group as the calling thread. For example, if my application has 10 threads (and therefore 10 additional helper threads), I would like only the thread and its associated helper thread to be terminated, while the other threads keep on running.
My application works exclusively on Linux.
How can I have this behavior?
Reading around about multithreading I got a bit confused about the concept of thread group and process group in Linux. Are these terms referring to the same thing?
Precisely, the process group (and perhaps the thread group) is the pid retrieved by one of the following calls :
pid_t getpgid(pid_t pid);
pid_t getpgrp(void); /* POSIX.1 version */
pid_t getpgrp(pid_t pid); /* BSD version */
You are a bit adrift here. Forget exit_group, which these days is the same as exit on Linux; it is not what you are looking for. Similarly, the various get-pid calls aren't really what you want either.
The simplest (and usually best) way to handle this is to have each primary thread signal its helper thread to shut down and then pthread_join it (or not, if it is detached).
So something like:
(a) primary work thread knows - however it knows - its work is done.
(b) signals helper thread via a shared switch or similar mechanism
(c) helper thread periodically checks flag, cleans up and calls pthread_exit
(d) primary worker thread calls pthread_join (or not) on dead helper thread
(e) primary worker cleans up and calls pthread_exit on itself.
There are a lot of variations on that but that's the basic idea. Beyond that you get into things like pthread_cancel and areas you may want to avoid if you don't absolutely require them (and the potential headaches).

How to set the attributes of threads in Linux?

I want to create three processes in my program, with several threads in each process.
Each thread is an infinite task, which may sleep and be woken up periodically. Besides, the process itself has some tasks to do.
My questions are:
1) Do I need to set the threads as detached? If I set the threads as detached, they seem not to run!
But if the threads are joinable, the process has to wait for the threads to exit and it can't do its own work!
Which one should I choose?
2) What's the scope of the scheduling policy? I mean, if I set the scheduling policy to FIFO, are all the threads in all the processes scheduled by the FIFO policy? Or is only the thread that has this attribute set scheduled by this policy?
3) What's the scope of thread priority? Are thread priorities only meaningful within a single process, with another process having its own separate set of thread priorities, so that they don't affect each other?
I would appreciate your help! Thank you!
DETACHED OR JOINED: It depends on your requirements.
If the main thread (the one spawning new threads) needs to continue with its own work and does not need to wait for a spawned thread's return value, you can use DETACH.
If the main thread only needs to wait for the return value and has no other task of its own to perform, you can use JOIN.
When a thread is created, it uses the default scheduling policy unless you change it through the attributes before calling pthread_create. You can also change the scheduling policy dynamically after creation. NOTE: the scheduling policy affects threads with the same priority.
Priority: you can change the priority (and also the scheduling policy) using pthread_setschedparam.
However, on Linux a thread is also a lightweight process, so the priorities of all threads are compared across the entire system, not just within each process.
(1) You have a coding error. A detached thread gets a time slice like everything else. If it is not running then it is something you are doing. You should post your threadfunc and the function which creates the threads in another question.
It's impossible to say whether your threads should be joinable or detached without knowing what you are doing. The main benefit of joinable threads is that you know when they finish and you can check the return data. If these aren't important to you, there is no real advantage to making them joinable, other than that it is marginally easier to create them because that is the default.
If you don't want to block in pthread_join there are strategies you can pursue. Your threads can set switches before they die, you can use condition variables, you can have a separate thread that joins the dead threads and so forth. Again, it is impossible to know what is the best strategy for your particular case.
(2 & 3) A thread inherits the schedule policy and priority of the thread that creates it and they remain that way unless you specifically change them. The policy/priority of threads in one process are not directly related to any other process.
I'm answering only to the first question:
No need to create the threads as detached, since you can simply join them at the end of the main process.
To create threads as detached you should first create an attribute and then use it as a parameter to pthread_create
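pthread_t thread1;
pthread_attr_t attr;
int chk;
/* Each call returns 0 on success or an error number on failure. */
chk = pthread_attr_init(&attr);
printf("attr_init: %d\n",chk);
chk = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
printf("attr_setdetachstate: %d\n",chk);
/* function is the thread's start routine: void *function(void *arg) */
chk = pthread_create(&thread1, &attr, function, NULL);
printf("create: %d\n",chk);
/* The attribute object is no longer needed once the thread exists. */
pthread_attr_destroy(&attr);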

Resources