Related
I am new to pthreads, and I am trying to understand it. I saw some examples like the following.
I could see that the main() is blocked by the API pthread_exit(), and I have seen examples where the main function is blocked by the API pthread_join(). I am not able to understand when to use what?
I am referring to the following site - https://computing.llnl.gov/tutorials/pthreads/. I am not able to get the concept of when to use pthread_join() and when to use pthread_exit().
Can somebody please explain? Also, a good tutorial link for pthreads will be appreciated.
#include <pthread.h>
#include <stdio.h>
#define NUM_THREADS 5
void *PrintHello(void *threadid)
{
long tid;
tid = (long)threadid;
printf("Hello World! It's me, thread #%ld!\n", tid);
pthread_exit(NULL);
}
int main (int argc, char *argv[])
{
pthread_t threads[NUM_THREADS];
int rc;
long t;
for(t=0; t<NUM_THREADS; t++){
printf("In main: creating thread %ld\n", t);
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc){
printf("ERROR; return code from pthread_create() is %d\n", rc);
exit(-1);
}
}
/* Last thing that main() should do */
pthread_exit(NULL);
Realized one more thing i.e.
pthread_cancel(thread);
pthread_join(thread, NULL);
Sometimes, you want to cancel the thread while it is executing.
You could do this using pthread_cancel(thread);.
However, remember that you need to enable pthread cancel support.
Also, a clean up code upon cancellation.
thread_cleanup_push(my_thread_cleanup_handler, resources);
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, 0);
static void my_thread_cleanup_handler(void *arg)
{
// free
// close, fclose
}
As explained in the openpub documentations,
pthread_exit() will exit the thread that calls it.
In your case since the main calls it, main thread will terminate whereas your spawned threads will continue to execute. This is mostly used in cases where the
main thread is only required to spawn threads and leave the threads to do their job
pthread_join
will suspend execution of the thread that has called it unless the target thread terminates
This is useful in cases when you want to wait for thread/s to terminate before further
processing in main thread.
pthread_exit terminates the calling thread while pthread_join suspends execution of calling thread until target threads completes execution.
They are pretty much well explained in detail in the open group documentation:
pthread_exit
pthread_join
Both methods ensure that your process doesn't end before all of your threads have ended.
The join method has your thread of the main function explicitly wait for all threads that are to be "joined".
The pthread_exit method terminates your main function and thread in a controlled way. main has the particularity that ending main otherwise would be terminating your whole process including all other threads.
For this to work, you have to be sure that none of your threads is using local variables that are declared inside them main function. The advantage of that method is that your main doesn't have to know all threads that have been started in your process, e.g because other threads have themselves created new threads that main doesn't know anything about.
The pthread_exit() API
as has been already remarked, is used for the calling thread termination.
After a call to that function a complicating clean up mechanism is started.
When it completes the thread is terminated.
The pthread_exit() API is also called implicitly when a call to the return() routine occurs in a thread created by pthread_create().
Actually, a call to return() and a call to pthread_exit() have the same impact, being called from a thread created by pthread_create().
It is very important to distinguish the initial thread, implicitly created when the main() function starts, and threads created by pthread_create().
A call to the return() routine from the main() function implicitly invokes the exit() system call and the entire process terminates.
No thread clean up mechanism is started.
A call to the pthread_exit() from the main() function causes the clean up mechanism to start and when it finishes its work the initial thread terminates.
What happens to the entire process (and to other threads) when pthread_exit() is called from the main() function depends on the PTHREAD implementation.
For example, on IBM OS/400 implementation the entire process is terminated, including other threads, when pthread_exit() is called from the main() function.
Other systems may behave differently.
On most modern Linux machines a call to pthread_exit() from the initial thread does not terminate the entire process until all threads termination.
Be careful using pthread_exit() from main(), if you want to write a portable application.
The pthread_join() API
is a convenient way to wait for a thread termination.
You may write your own function that waits for a thread termination, perhaps more suitable to your application, instead of using pthread_join().
For example, it can be a function based on waiting on conditional variables.
I would recommend for reading a book of David R. Butenhof “Programming with POSIX Threads”.
It explains the discussed topics (and more complicated things) very well (although some implementation details, such as pthread_exit usage in the main function, not always reflected in the book).
You don't need any calls to pthread_exit(3) in your particular code.
In general, the main thread should not call pthread_exit, but should often call pthread_join(3) to wait for some other thread to finish.
In your PrintHello function, you don't need to call pthread_exit because it is implicit after returning from it.
So your code should rather be:
void *PrintHello(void *threadid) {
long tid = (long)threadid;
printf("Hello World! It's me, thread #%ld!\n", tid);
return threadid;
}
int main (int argc, char *argv[]) {
pthread_t threads[NUM_THREADS];
int rc;
intptr_t t;
// create all the threads
for(t=0; t<NUM_THREADS; t++){
printf("In main: creating thread %ld\n", (long) t);
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc) { fprintf(stderr, "failed to create thread #%ld - %s\n",
(long)t, strerror(rc));
exit(EXIT_FAILURE);
};
}
pthread_yield(); // useful to give other threads more chance to run
// join all the threads
for(t=0; t<NUM_THREADS; t++){
printf("In main: joining thread #%ld\n", (long) t);
rc = pthread_join(&threads[t], NULL);
if (rc) { fprintf(stderr, "failed to join thread #%ld - %s\n",
(long)t, strerror(rc));
exit(EXIT_FAILURE);
}
}
}
pthread_exit() will terminate the calling thread and exit from that(but resources used by calling thread is not released to operating system if it is not detached from main thread.)
pthrade_join() will wait or block the calling thread until target thread is not terminated.
In simple word it will wait for to exit the target thread.
In your code, if you put sleep(or delay) in PrintHello function before pthread_exit(), then main thread may be exit and terminate full process, Although your PrintHello function is not completed it will terminate. If you use pthrade_join() function in main before calling pthread_exit() from main it will block main thread and wait to complete your calling thread (PrintHello).
Hmm.
POSIX pthread_exit description from http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_exit.html:
After a thread has terminated, the result of access to local (auto) variables of the thread is
undefined. Thus, references to local variables of the exiting thread should not be used for
the pthread_exit() value_ptr parameter value.
Which seems contrary to the idea that local main() thread variables will remain accessible.
Using pthread_exit in the main thread(in place of pthread_join), will leave the main thread in defunct(zombie) state. Since not using pthread_join, other joinable threads which are terminated will also remain in the zombie state and cause resource leakage.
Failure to join with a thread that is joinable (i.e., one that is
not detached), produces a "zombie thread". Avoid doing this, since
each zombie thread consumes some system resources, and when enough
zombie threads have accumulated, it will no longer be possible to
create new threads (or processes).
Another point is keeping the main thread in the defunct state, while other threads are running may cause implementation dependent issues in various conditions like if resources are allocated in main thread or variables which are local to the main thread are used in other threads.
Also, all the shared resources are released only when the process exits, it's not saving any resources. So, I think using pthread_exit in place of pthread_join should be avoided.
When pthread_exit() is called, the calling threads stack is no longer addressable as "active" memory for any other thread. The .data, .text and .bss parts of "static" memory allocations are still available to all other threads. Thus, if you need to pass some memory value into pthread_exit() for some other pthread_join() caller to see, it needs to be "available" for the thread calling pthread_join() to use. It should be allocated with malloc()/new, allocated on the pthread_join threads stack, 1) a stack value which the pthread_join caller passed to pthread_create or otherwise made available to the thread calling pthread_exit(), or 2) a static .bss allocated value.
It's vital to understand how memory is managed between a threads stack, and values store in .data/.bss memory sections which are used to store process wide values.
#include<stdio.h>
#include<pthread.h>
#include<semaphore.h>
sem_t st;
void *fun_t(void *arg);
void *fun_t(void *arg)
{
printf("Linux\n");
sem_post(&st);
//pthread_exit("Bye");
while(1);
pthread_exit("Bye");
}
int main()
{
pthread_t pt;
void *res_t;
if(pthread_create(&pt,NULL,fun_t,NULL) == -1)
perror("pthread_create");
if(sem_init(&st,0,0) != 0)
perror("sem_init");
if(sem_wait(&st) != 0)
perror("sem_wait");
printf("Sanoundry\n");
//Try commenting out join here.
if(pthread_join(pt,&res_t) == -1)
perror("pthread_join");
if(sem_destroy(&st) != 0)
perror("sem_destroy");
return 0;
}
Copy and paste this code on a gdb. Onlinegdb would work and see for yourself.
Make sure you understand once you have created a thread, the process run along with main together at the same time.
Without the join, main thread continue to run and return 0
With the join, main thread would be stuck in the while loop because it waits for the thread to be done executing.
With the join and delete the commented out pthread_exit, the thread will terminate before running the while loop and main would continue
Practical usage of pthread_exit can be used as an if conditions or case statements to ensure 1 version of some code runs before exiting.
void *fun_t(void *arg)
{
printf("Linux\n");
sem_post(&st);
if(2-1 == 1)
pthread_exit("Bye");
else
{
printf("We have a problem. Computer is bugged");
pthread_exit("Bye"); //This is redundant since the thread will exit at the end
//of scope. But there are instances where you have a bunch
//of else if here.
}
}
I would want to demonstrate how sometimes you would need to have a segment of code running first using semaphore in this example.
#include<stdio.h>
#include<pthread.h>
#include<semaphore.h>
sem_t st;
void* fun_t (void* arg)
{
printf("I'm thread\n");
sem_post(&st);
}
int main()
{
pthread_t pt;
pthread_create(&pt,NULL,fun_t,NULL);
sem_init(&st,0,0);
sem_wait(&st);
printf("before_thread\n");
pthread_join(pt,NULL);
printf("After_thread\n");
}
Noticed how fun_t is being ran after "before thread" The expected output if it is linear from top to bottom would be before thread, I'm thread, after thread. But under this circumstance, we block the main from running any further until the semaphore is released by func_t. The result can be verified with https://www.onlinegdb.com/
In a Linux C program, how do I print the thread id of a thread created by the pthread library? For example like how we can get pid of a process by getpid().
What? The person asked for Linux specific, and the equivalent of getpid(). Not BSD or Apple. The answer is gettid() and returns an integral type. You will have to call it using syscall(), like this:
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>
....
pid_t x = syscall(__NR_gettid);
While this may not be portable to non-linux systems, the threadid is directly comparable and very fast to acquire. It can be printed (such as for LOGs) like a normal integer.
pthread_self() function will give the thread id of current thread.
pthread_t pthread_self(void);
The pthread_self() function returns the Pthread handle of the calling thread. The pthread_self() function does NOT return the integral thread of the calling thread. You must use pthread_getthreadid_np() to return an integral identifier for the thread.
NOTE:
pthread_id_np_t tid;
tid = pthread_getthreadid_np();
is significantly faster than these calls, but provides the same behavior.
pthread_id_np_t tid;
pthread_t self;
self = pthread_self();
pthread_getunique_np(&self, &tid);
As noted in other answers, pthreads does not define a platform-independent way to retrieve an integral thread ID.
On Linux systems, you can get thread ID thus:
#include <sys/types.h>
pid_t tid = gettid();
On many BSD-based platforms, this answer https://stackoverflow.com/a/21206357/316487 gives a non-portable way.
However, if the reason you think you need a thread ID is to know whether you're running on the same or different thread to another thread you control, you might find some utility in this approach
static pthread_t threadA;
// On thread A...
threadA = pthread_self();
// On thread B...
pthread_t threadB = pthread_self();
if (pthread_equal(threadA, threadB)) printf("Thread B is same as thread A.\n");
else printf("Thread B is NOT same as thread A.\n");
If you just need to know if you're on the main thread, there are additional ways, documented in answers to this question how can I tell if pthread_self is the main (first) thread in the process?.
pid_t tid = syscall(SYS_gettid);
Linux provides such system call to allow you get id of a thread.
You can use pthread_self()
The parent gets to know the thread id after the pthread_create() is executed sucessfully, but while executing the thread if we want to access the thread id we have to use the function pthread_self().
This single line gives you pid , each threadid and spid.
printf("before calling pthread_create getpid: %d getpthread_self: %lu tid:%lu\n",getpid(), pthread_self(), syscall(SYS_gettid));
I think not only is the question not clear but most people also are not cognizant of the difference. Examine the following saying,
POSIX thread IDs are not the same as the thread IDs returned by the
Linux specific gettid() system call. POSIX thread IDs are assigned
and maintained by the threading implementation. The thread ID returned
by gettid() is a number (similar to a process ID) that is assigned by
the kernel. Although each POSIX thread has a unique kernel thread ID
in the Linux NPTL threading implementation, an application generally
doesn’t need to know about the kernel IDs (and won’t be portable if it
depends on knowing them).
Excerpted from: The Linux Programming Interface: A Linux and UNIX System Programming Handbook, Michael Kerrisk
IMHO, there is only one portable way that pass a structure in which define a variable holding numbers in an ascending manner e.g. 1,2,3... to per thread. By doing this, threads' id can be kept track. Nonetheless, int pthread_equal(tid1, tid2) function should be used.
if (pthread_equal(tid1, tid2)) printf("Thread 2 is same as thread 1.\n");
else printf("Thread 2 is NOT same as thread 1.\n");
pthread_getthreadid_np wasn't on my Mac os x. pthread_t is an opaque type. Don't beat your head over it. Just assign it to void* and call it good. If you need to printf use %p.
There is also another way of getting thread id. While creating threads with
int pthread_create(pthread_t * thread, const pthread_attr_t * attr, void * (*start_routine)(void *), void *arg);
function call; the first parameter pthread_t * thread is actually a thread id (that is an unsigned long int defined in bits/pthreadtypes.h). Also, the last argument void *arg is the argument that is passed to void * (*start_routine) function to be threaded.
You can create a structure to pass multiple arguments and send a pointer to a structure.
typedef struct thread_info {
pthread_t thread;
//...
} thread_info;
//...
tinfo = malloc(sizeof(thread_info) * NUMBER_OF_THREADS);
//...
pthread_create (&tinfo[i].thread, NULL, handler, (void*)&tinfo[i]);
//...
void *handler(void *targs) {
thread_info *tinfo = targs;
// here you get the thread id with tinfo->thread
}
For different OS there is different answer. I find a helper here.
You can try this:
#include <unistd.h>
#include <sys/syscall.h>
int get_thread_id() {
#if defined(__linux__)
return syscall(SYS_gettid);
#elif defined(__FreeBSD__)
long tid;
thr_self(&tid);
return (int)tid;
#elif defined(__NetBSD__)
return _lwp_self();
#elif defined(__OpenBSD__)
return getthrid();
#else
return getpid();
#endif
}
Platform-independent way (starting from c++11) is:
#include <thread>
std::this_thread::get_id();
You can also write in this manner and it does the same. For eg:
for(int i=0;i < total; i++)
{
pthread_join(pth[i],NULL);
cout << "SUM of thread id " << pth[i] << " is " << args[i].sum << endl;
}
This program sets up an array of pthread_t and calculate sum on each. So it is printing the sum of each thread with thread id.
I am new to pthreads, and I am trying to understand it. I saw some examples like the following.
I could see that the main() is blocked by the API pthread_exit(), and I have seen examples where the main function is blocked by the API pthread_join(). I am not able to understand when to use what?
I am referring to the following site - https://computing.llnl.gov/tutorials/pthreads/. I am not able to get the concept of when to use pthread_join() and when to use pthread_exit().
Can somebody please explain? Also, a good tutorial link for pthreads will be appreciated.
#include <pthread.h>
#include <stdio.h>
#define NUM_THREADS 5
void *PrintHello(void *threadid)
{
long tid;
tid = (long)threadid;
printf("Hello World! It's me, thread #%ld!\n", tid);
pthread_exit(NULL);
}
int main (int argc, char *argv[])
{
pthread_t threads[NUM_THREADS];
int rc;
long t;
for(t=0; t<NUM_THREADS; t++){
printf("In main: creating thread %ld\n", t);
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc){
printf("ERROR; return code from pthread_create() is %d\n", rc);
exit(-1);
}
}
/* Last thing that main() should do */
pthread_exit(NULL);
Realized one more thing i.e.
pthread_cancel(thread);
pthread_join(thread, NULL);
Sometimes, you want to cancel the thread while it is executing.
You could do this using pthread_cancel(thread);.
However, remember that you need to enable pthread cancel support.
Also, a clean up code upon cancellation.
thread_cleanup_push(my_thread_cleanup_handler, resources);
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, 0);
static void my_thread_cleanup_handler(void *arg)
{
// free
// close, fclose
}
As explained in the openpub documentations,
pthread_exit() will exit the thread that calls it.
In your case since the main calls it, main thread will terminate whereas your spawned threads will continue to execute. This is mostly used in cases where the
main thread is only required to spawn threads and leave the threads to do their job
pthread_join
will suspend execution of the thread that has called it unless the target thread terminates
This is useful in cases when you want to wait for thread/s to terminate before further
processing in main thread.
pthread_exit terminates the calling thread while pthread_join suspends execution of calling thread until target threads completes execution.
They are pretty much well explained in detail in the open group documentation:
pthread_exit
pthread_join
Both methods ensure that your process doesn't end before all of your threads have ended.
The join method has your thread of the main function explicitly wait for all threads that are to be "joined".
The pthread_exit method terminates your main function and thread in a controlled way. main has the particularity that ending main otherwise would be terminating your whole process including all other threads.
For this to work, you have to be sure that none of your threads is using local variables that are declared inside them main function. The advantage of that method is that your main doesn't have to know all threads that have been started in your process, e.g because other threads have themselves created new threads that main doesn't know anything about.
The pthread_exit() API
as has been already remarked, is used for the calling thread termination.
After a call to that function a complicating clean up mechanism is started.
When it completes the thread is terminated.
The pthread_exit() API is also called implicitly when a call to the return() routine occurs in a thread created by pthread_create().
Actually, a call to return() and a call to pthread_exit() have the same impact, being called from a thread created by pthread_create().
It is very important to distinguish the initial thread, implicitly created when the main() function starts, and threads created by pthread_create().
A call to the return() routine from the main() function implicitly invokes the exit() system call and the entire process terminates.
No thread clean up mechanism is started.
A call to the pthread_exit() from the main() function causes the clean up mechanism to start and when it finishes its work the initial thread terminates.
What happens to the entire process (and to other threads) when pthread_exit() is called from the main() function depends on the PTHREAD implementation.
For example, on IBM OS/400 implementation the entire process is terminated, including other threads, when pthread_exit() is called from the main() function.
Other systems may behave differently.
On most modern Linux machines a call to pthread_exit() from the initial thread does not terminate the entire process until all threads termination.
Be careful using pthread_exit() from main(), if you want to write a portable application.
The pthread_join() API
is a convenient way to wait for a thread termination.
You may write your own function that waits for a thread termination, perhaps more suitable to your application, instead of using pthread_join().
For example, it can be a function based on waiting on conditional variables.
I would recommend for reading a book of David R. Butenhof “Programming with POSIX Threads”.
It explains the discussed topics (and more complicated things) very well (although some implementation details, such as pthread_exit usage in the main function, not always reflected in the book).
You don't need any calls to pthread_exit(3) in your particular code.
In general, the main thread should not call pthread_exit, but should often call pthread_join(3) to wait for some other thread to finish.
In your PrintHello function, you don't need to call pthread_exit because it is implicit after returning from it.
So your code should rather be:
void *PrintHello(void *threadid) {
long tid = (long)threadid;
printf("Hello World! It's me, thread #%ld!\n", tid);
return threadid;
}
int main (int argc, char *argv[]) {
pthread_t threads[NUM_THREADS];
int rc;
intptr_t t;
// create all the threads
for(t=0; t<NUM_THREADS; t++){
printf("In main: creating thread %ld\n", (long) t);
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc) { fprintf(stderr, "failed to create thread #%ld - %s\n",
(long)t, strerror(rc));
exit(EXIT_FAILURE);
};
}
pthread_yield(); // useful to give other threads more chance to run
// join all the threads
for(t=0; t<NUM_THREADS; t++){
printf("In main: joining thread #%ld\n", (long) t);
rc = pthread_join(&threads[t], NULL);
if (rc) { fprintf(stderr, "failed to join thread #%ld - %s\n",
(long)t, strerror(rc));
exit(EXIT_FAILURE);
}
}
}
pthread_exit() will terminate the calling thread and exit from that(but resources used by calling thread is not released to operating system if it is not detached from main thread.)
pthrade_join() will wait or block the calling thread until target thread is not terminated.
In simple word it will wait for to exit the target thread.
In your code, if you put sleep(or delay) in PrintHello function before pthread_exit(), then main thread may be exit and terminate full process, Although your PrintHello function is not completed it will terminate. If you use pthrade_join() function in main before calling pthread_exit() from main it will block main thread and wait to complete your calling thread (PrintHello).
Hmm.
POSIX pthread_exit description from http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_exit.html:
After a thread has terminated, the result of access to local (auto) variables of the thread is
undefined. Thus, references to local variables of the exiting thread should not be used for
the pthread_exit() value_ptr parameter value.
Which seems contrary to the idea that local main() thread variables will remain accessible.
Using pthread_exit in the main thread(in place of pthread_join), will leave the main thread in defunct(zombie) state. Since not using pthread_join, other joinable threads which are terminated will also remain in the zombie state and cause resource leakage.
Failure to join with a thread that is joinable (i.e., one that is
not detached), produces a "zombie thread". Avoid doing this, since
each zombie thread consumes some system resources, and when enough
zombie threads have accumulated, it will no longer be possible to
create new threads (or processes).
Another point is keeping the main thread in the defunct state, while other threads are running may cause implementation dependent issues in various conditions like if resources are allocated in main thread or variables which are local to the main thread are used in other threads.
Also, all the shared resources are released only when the process exits, it's not saving any resources. So, I think using pthread_exit in place of pthread_join should be avoided.
When pthread_exit() is called, the calling threads stack is no longer addressable as "active" memory for any other thread. The .data, .text and .bss parts of "static" memory allocations are still available to all other threads. Thus, if you need to pass some memory value into pthread_exit() for some other pthread_join() caller to see, it needs to be "available" for the thread calling pthread_join() to use. It should be allocated with malloc()/new, allocated on the pthread_join threads stack, 1) a stack value which the pthread_join caller passed to pthread_create or otherwise made available to the thread calling pthread_exit(), or 2) a static .bss allocated value.
It's vital to understand how memory is managed between a threads stack, and values store in .data/.bss memory sections which are used to store process wide values.
#include<stdio.h>
#include<pthread.h>
#include<semaphore.h>
sem_t st;
void *fun_t(void *arg);
void *fun_t(void *arg)
{
printf("Linux\n");
sem_post(&st);
//pthread_exit("Bye");
while(1);
pthread_exit("Bye");
}
int main()
{
pthread_t pt;
void *res_t;
if(pthread_create(&pt,NULL,fun_t,NULL) == -1)
perror("pthread_create");
if(sem_init(&st,0,0) != 0)
perror("sem_init");
if(sem_wait(&st) != 0)
perror("sem_wait");
printf("Sanoundry\n");
//Try commenting out join here.
if(pthread_join(pt,&res_t) == -1)
perror("pthread_join");
if(sem_destroy(&st) != 0)
perror("sem_destroy");
return 0;
}
Copy and paste this code on a gdb. Onlinegdb would work and see for yourself.
Make sure you understand once you have created a thread, the process run along with main together at the same time.
Without the join, main thread continue to run and return 0
With the join, main thread would be stuck in the while loop because it waits for the thread to be done executing.
With the join and delete the commented out pthread_exit, the thread will terminate before running the while loop and main would continue
Practical usage of pthread_exit can be used as an if conditions or case statements to ensure 1 version of some code runs before exiting.
void *fun_t(void *arg)
{
printf("Linux\n");
sem_post(&st);
if(2-1 == 1)
pthread_exit("Bye");
else
{
printf("We have a problem. Computer is bugged");
pthread_exit("Bye"); //This is redundant since the thread will exit at the end
//of scope. But there are instances where you have a bunch
//of else if here.
}
}
I would want to demonstrate how sometimes you would need to have a segment of code running first using semaphore in this example.
#include<stdio.h>
#include<pthread.h>
#include<semaphore.h>
sem_t st;
void* fun_t (void* arg)
{
printf("I'm thread\n");
sem_post(&st);
}
int main()
{
pthread_t pt;
pthread_create(&pt,NULL,fun_t,NULL);
sem_init(&st,0,0);
sem_wait(&st);
printf("before_thread\n");
pthread_join(pt,NULL);
printf("After_thread\n");
}
Noticed how fun_t is being ran after "before thread" The expected output if it is linear from top to bottom would be before thread, I'm thread, after thread. But under this circumstance, we block the main from running any further until the semaphore is released by func_t. The result can be verified with https://www.onlinegdb.com/
In Figure 11.2 APUE 2nd, there is a code demoing usage for threads API, as below:
#include <pthread.h>
#include <stdio.h>
pthread_t ntid;
void printids(const char *s)
{
pid_t pid;
pthread_t tid;
pid = getpid();
tid = pthread_self();
printf("%s pid %u tid %u (0x%x)\n", s,
(unsigned int)pid, (unsigned int)tid, (unsigned int)(tid));
}
void *thr_fn(void *arg)
{
printids("new thread: ");
return (void*)0;
}
int main(void)
{
int err;
err = pthread_create(&ntid, NULL, thr_fn, NULL);
if (err != 0)
return -1;
printids("main thread: ");
sleep(1);
return 0;
}
and the book says the output is like this,
$./a.out
new thread: pid 6628 ...
main thread: pid 6626 ...
Pid is different ! This is because "Linux use the clone() to implement threads, same as fork(), so the system consider the threads as separate processes who share resources".
But When I tested, I found the result is different from the APUE result, which is
$ ./a.out
main thread: pid 13301 tid 3078153920 (0xb778e6c0)
new thread: pid 13301 tid 3078151024 (0xb778db70)
the pid is the same ! so is APUE outdated ? But linux do use clone to implement threads, and in linux kernel, they are viewed as different processes. How come the process ID is the same ?
There's a high chance that your edition APUE might be outdated in the sense that it refers to LinuxThreads instead of NPTL.
Here's the relevant section from the clone(2) manpage that sheds light on this issue:
"Thread groups were a feature added in Linux 2.4 to support the POSIX threads notion of a set of threads that share a single PID. Internally, this shared PID is the so-called thread group identifier (TGID) for the thread group. Since Linux 2.4, calls to getpid(2) return the TGID of the caller.
The threads within a group can be distinguished by their (system-wide) unique thread IDs (TID). A new thread's TID is available as the function result returned to the caller of clone(), and a thread can obtain its own TID using gettid(2)."
I thought pthread uses clone to spawn one new thread in linux. But if so, all of the threads should have their seperate pid. Otherwise, if they have the same pid, the global variables in the libc seem to be shared. However, as I ran the following program, I got the same pid but the different address of errno.
extern errno;
void*
f(void *arg)
{
printf("%u,%p\n", getpid(), &errno);
fflush(stdin);
return NULL;
}
int
main(int argc, char **argv)
{
pthread_t tid;
pthread_create(&tid, NULL, f, NULL);
printf("%u,%p\n", getpid(), &errno);
fflush(stdin);
pthread_join(tid, NULL);
return 0;
}
Then, why?
I'm not sure exactly how clone() is used when pthread_create() is called. That said, looking at the clone() man page, it looks like there is a flag called CLONE_THREAD which:
If CLONE_THREAD is set, the child is
placed in the same thread group as the
calling process. To make the remainder
of the discussion of CLONE_THREAD more
readable, the term "thread" is used to
refer to the processes within a thread
group.
Thread groups were a feature added in
Linux 2.4 to support the POSIX threads
notion of a set of threads that share
a single PID. Internally, this shared
PID is the so-called thread group
identifier (TGID) for the thread
group. Since Linux 2.4, calls to
getpid(2) return the TGID of the
caller.
It then goes on to talk about a gettid() function for getting the unique ID of an individual thread within a process. Modifying your code:
#include <stdio.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>
int errno;
void*
f(void *arg)
{
printf("%u,%p, %u\n", getpid(), &errno, syscall(SYS_gettid));
fflush(stdin);
return NULL;
}
int
main(int argc, char **argv)
{
pthread_t tid;
pthread_create(&tid, NULL, f, NULL);
printf("%u,%p, %u\n", getpid(), &errno, syscall(SYS_gettid));
fflush(stdin);
pthread_join(tid, NULL);
return 0;
}
(make sure to use "-lpthread"!) we can see that the individual thread id is indeed unique, while the pid remains the same.
rascher#coltrane:~$ ./a.out
4109,0x804a034, 4109
4109,0x804a034, 4110
Global variables: your mistake is that errno is not a global variable but a macro that expands to an lvalue of type int. In practice, it expands to (*__errno_location()) or similar.
getpid is a library function that returns the process id in the POSIX sense of process, not the bogus Linux per-clone pid. Nowadays Linux has the minimal kernel-level functionality necessary to make near-POSIX-compliance possible with respect to threads, but most of it still depends on ugly hacks at the userspace libc level.