How to get thread id of a pthread in linux c program? - c

In a Linux C program, how do I print the thread id of a thread created by the pthread library? For example like how we can get pid of a process by getpid().

What? The person asked for Linux specific, and the equivalent of getpid(). Not BSD or Apple. The answer is gettid() and returns an integral type. You will have to call it using syscall(), like this:
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>
....
pid_t x = syscall(__NR_gettid);
While this may not be portable to non-linux systems, the threadid is directly comparable and very fast to acquire. It can be printed (such as for LOGs) like a normal integer.

pthread_self() function will give the thread id of current thread.
pthread_t pthread_self(void);
The pthread_self() function returns the Pthread handle of the calling thread. The pthread_self() function does NOT return the integral thread ID of the calling thread. Where the platform provides it (glibc on Linux does not), use pthread_getthreadid_np() to get an integral identifier for the thread.
NOTE:
pthread_id_np_t tid;
tid = pthread_getthreadid_np();
is significantly faster than these calls, but provides the same behavior.
pthread_id_np_t tid;
pthread_t self;
self = pthread_self();
pthread_getunique_np(&self, &tid);

As noted in other answers, pthreads does not define a platform-independent way to retrieve an integral thread ID.
On Linux systems, you can get thread ID thus:
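#define _GNU_SOURCE   /* gettid() is a GNU extension; the wrapper needs glibc 2.30 or newer */
#include <unistd.h>
#include <sys/types.h>
pid_t tid = gettid();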
On many BSD-based platforms, this answer https://stackoverflow.com/a/21206357/316487 gives a non-portable way.
However, if the reason you think you need a thread ID is to know whether you're running on the same or different thread to another thread you control, you might find some utility in this approach
static pthread_t threadA;
// On thread A...
threadA = pthread_self();
// On thread B...
pthread_t threadB = pthread_self();
if (pthread_equal(threadA, threadB)) printf("Thread B is same as thread A.\n");
else printf("Thread B is NOT same as thread A.\n");
If you just need to know if you're on the main thread, there are additional ways, documented in answers to this question how can I tell if pthread_self is the main (first) thread in the process?.

pid_t tid = syscall(SYS_gettid);
Linux provides this system call so that you can get the ID of a thread.

You can use pthread_self()
The parent learns the thread ID once pthread_create() has executed successfully, but to access the thread ID from inside the running thread we have to use the function pthread_self().

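This single line prints the pid, the pthread ID, and the kernel thread ID (the cast of pthread_t assumes glibc, where it is an unsigned long):
printf("before calling pthread_create getpid: %d pthread_self: %lu tid: %ld\n", (int)getpid(), (unsigned long)pthread_self(), (long)syscall(SYS_gettid));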

I think not only is the question unclear, but most people are also not aware of the difference. Consider the following passage:
POSIX thread IDs are not the same as the thread IDs returned by the
Linux specific gettid() system call. POSIX thread IDs are assigned
and maintained by the threading implementation. The thread ID returned
by gettid() is a number (similar to a process ID) that is assigned by
the kernel. Although each POSIX thread has a unique kernel thread ID
in the Linux NPTL threading implementation, an application generally
doesn’t need to know about the kernel IDs (and won’t be portable if it
depends on knowing them).
Excerpted from: The Linux Programming Interface: A Linux and UNIX System Programming Handbook, Michael Kerrisk
IMHO, the only portable way is to pass each thread a structure containing a number that you assign yourself in ascending order, e.g. 1, 2, 3... That way you can keep track of each thread's ID. To compare two pthread_t values, however, the function int pthread_equal(tid1, tid2) should be used.
if (pthread_equal(tid1, tid2)) printf("Thread 2 is same as thread 1.\n");
else printf("Thread 2 is NOT same as thread 1.\n");

pthread_getthreadid_np wasn't available on my Mac OS X system. pthread_t is an opaque type. Don't beat your head over it. On implementations where it is a pointer, just cast it to void* and call it good. If you need to printf it, use %p.

There is also another way of getting thread id. While creating threads with
int pthread_create(pthread_t * thread, const pthread_attr_t * attr, void * (*start_routine)(void *), void *arg);
function call, the first parameter, pthread_t *thread, receives the thread ID (on glibc, pthread_t is an unsigned long int defined in bits/pthreadtypes.h). The last argument, void *arg, is the argument passed to start_routine, the function the new thread runs.
You can create a structure to pass multiple arguments and send a pointer to a structure.
typedef struct thread_info {
pthread_t thread;
//...
} thread_info;
//...
tinfo = malloc(sizeof(thread_info) * NUMBER_OF_THREADS);
//...
pthread_create (&tinfo[i].thread, NULL, handler, (void*)&tinfo[i]);
//...
void *handler(void *targs) {
thread_info *tinfo = targs;
// here you get the thread id with tinfo->thread
// (pthread_self() is the safer choice inside the thread itself)
return NULL;
}

The answer differs from one OS to another. I found a helper here.
You can try this:
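#include <sys/types.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>   /* SYS_gettid */
#elif defined(__FreeBSD__)
#include <sys/thr.h>       /* thr_self() */
#elif defined(__NetBSD__)
#include <lwp.h>           /* _lwp_self() */
#endif
int get_thread_id(void) {
#if defined(__linux__)
return (int)syscall(SYS_gettid);
#elif defined(__FreeBSD__)
long tid;
thr_self(&tid);
return (int)tid;
#elif defined(__NetBSD__)
return _lwp_self();
#elif defined(__OpenBSD__)
return getthrid();
#else
return getpid();   /* fallback: no per-thread ID known, use the process ID */
#endif
}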

A platform-independent way (since C++11) is:
#include <thread>
std::this_thread::get_id();

You can also write it this way, and it does the same thing. For example:
for(int i=0;i < total; i++)
{
pthread_join(pth[i],NULL);
cout << "SUM of thread id " << pth[i] << " is " << args[i].sum << endl;
}
This program sets up an array of pthread_t and calculates a sum on each thread. It then prints each thread's sum along with its thread id.

Related

Get thread identifier C

I am trying to get the identifier of a thread, but it always returns what look like random numbers.
What is wrong here?
// C program to demonstrate working of pthread_self()
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void* calls(void* ptr)
{
// using pthread_self() get current thread id
printf("In function \nthread id = %d\n", pthread_self());
pthread_exit(NULL);
return NULL;
}
int main()
{
pthread_t thread; // declare thread
pthread_create(&thread, NULL, calls, NULL);
printf("In main \nthread id = %d\n", thread);
pthread_join(thread, NULL);
return 0;
}
Maybe it would be a good time to read the documentation. For Linux, it says:
NOTES
POSIX.1 allows an implementation wide freedom in choosing the type used to represent a thread ID; for example, representation using either an arithmetic type or a structure is permitted. Therefore, variables of type pthread_t can't portably be compared using the C equality operator (==); use pthread_equal(3) instead.
Thread identifiers should be considered opaque: any attempt to use a thread ID other than in pthreads calls is nonportable and can lead to unspecified results.
Thread IDs are guaranteed to be unique only within a process. A thread ID may be reused after a terminated thread has been joined, or a detached thread has terminated.
Which means that for example printing the value returned by pthread_self is not sensible.
In Linux/GLIBC the pthread_t type is actually the address of the "Thread Control Block" (TCB) of the thread (located at the bottom of its stack). This is unique. But as pointed out in a previous answer from @AnttiHaapala, no assumption can be made about it, because the type should be considered opaque. So, printing it with a "%d" specifier is definitely not portable.
In Linux/GLIBC environment, you also have another unique identifier for a thread. This is its task identifier in the kernel obtained with a call to gettid(). It returns a pid_t typed value (signed integer suitable for "%d" format specifier) as discussed in this post.
You can use gettid(). Under Linux, it's available with glibc 2.30 or higher. If you have an older version, you can write your own trivial syscall wrapper.
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
static pid_t my_gettid(void)
{
return syscall(SYS_gettid);
}
#define gettid() my_gettid()

Why does printing the pointer to structure of type pthread, gives us thread ID?

The structure of pthread is as follows. It is taken from https://stuff.mit.edu/afs/sipb/project/pthreads/include/pthread.h
struct pthread {
struct machdep_pthread machdep_data;
enum pthread_state state;
pthread_attr_t attr;
/* Signal interface */
sigset_t sigmask;
sigset_t sigpending;
/* Time until timeout */
struct timespec wakeup_time;
/* Cleanup handlers Link List */
struct pthread_cleanup *cleanup;
/* Join queue for waiting threads */
struct pthread_queue join_queue;
/* Queue thread is waiting on, (mutexes, cond. etc.) */
struct pthread_queue *queue;
/*
* Thread implementations are just multiple queue type implemenations,
* Below are the various link lists currently necessary
* It is possible for a thread to be on multiple, or even all the
* queues at once, much care must be taken during queue manipulation.
*
* The pthread structure must be locked before you can even look at
* the link lists.
*/
struct pthread *pll; /* ALL threads, in any state */
/* struct pthread *rll; Current run queue, before resced */
struct pthread *sll; /* For sleeping threads */
struct pthread *next; /* Standard for mutexes, etc ... */
/* struct pthread *fd_next; For kernel fd operations */
int fd; /* Used when thread waiting on fd */
semaphore lock;
/* Data that doesn't need to be locked */
void *ret;
int error;
const void **specific_data;
};
typedef struct pthread * pthread_t;
Now let's see the following code to print the ID of the thread:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void* calls(void* ptr)
{
// using pthread_self() get current thread id
printf("In function \nthread id = %ld\n", pthread_self());
pthread_exit(NULL);
return NULL;
}
int main()
{
pthread_t thread; // declare thread
pthread_create(&thread, NULL, calls, NULL);
printf("In main \nthread id = %ld\n", thread);
pthread_join(thread, NULL);
return 0;
}
Output in my system is:
In main
thread id = 140289852200704
In function
thread id = 140289852200704
From the pthread.h file (above), pthread is a structure, and thread in the code is a pointer to the structure pthread (since pthread_t is typedef struct pthread*). Why does printing this pointer give us the thread ID?
From the pthread.h file (above), pthread is a structure, and thread in the
code is a pointer to the structure pthread (since pthread_t is typedef
struct pthread*).
To be clear: in that implementation, pthread_t is a pointer-to-structure type. I imagine that's very common for pthreads implementations, but do be careful to avoid mistaking the details of a particular implementation for a general characteristic of the specifications or of all implementations. For example, it could just as well be an integer index in some other implementation, among various other possibilities.
Why does printing this pointer give us the thread
ID?
Because it is the thread ID. And because you're lucky that the undefined behavior arising from printing it with a %ld formatting directive manifested the same way in both places.
You probably have done yourself a disservice by looking under the covers at the definition of your implementation's pthread_t. You don't need to know those details to use pthreads, and in fact they do not help you in the slightest. The type is meant to be treated as opaque.
All you really need to understand to answer the question is that the value written into variable thread by pthread_create() is the created thread's ID, and the value returned by pthread_self() is the calling thread's thread ID. Naturally, each mechanism for obtaining a thread ID yields the same ID for the same thread.

process id of threads in the same process

The following code prints the process id from two threads on Linux (Ubuntu 14.04):
#include<pthread.h>
#include<stdio.h>
#include <unistd.h>
void* thread_function (void* arg)
{
fprintf (stderr, "child thread pid is %d\n", (int) getpid ());
/* Spin forever. */
while (1);
return NULL;
}
int main ()
{
pthread_t thread;
fprintf (stderr, "main thread pid is %d\n", (int) getpid ());
pthread_create (&thread, NULL, &thread_function, NULL);
/* Spin forever. */
while (1);
return 0;
}
And the output is
main thread pid is 3614
child thread pid is 3614
But shouldn't the process ids be different, since on GNU/Linux threads are implemented as processes?
There's a terminology conflict here. It's true that each thread is a separate process as far as the Linux kernel is concerned. And therefore Linux assigns a new PID to each thread.
But that's not how POSIX works: according to POSIX all threads in a process should share the same PID. The Linux kernel calls this "Thread Group IDs" (TGID), and the getpid() function actually returns the TGID, in order to be POSIX-compliant.
Three separate concepts: the process id (getpid), the pthreads thread id (pthread_self) and the underlying linux thread id (gettid).
Before glibc 2.30 there was no glibc wrapper for gettid, so it amounts to this
pid_t gettid(void)
{
return(syscall(SYS_gettid));
}
On the common pthreads programming level you shouldn't be concerned how the threads are implemented (though pdw explained it well enough). All this stuff is made deliberately opaque. There is no use for gettid in any pthreads function. They all require the pthreads thread id.
There are two main reasons why people ask about linux thread ids. One, they want to understand the relationship of linux thread ids in relation to some system utility e.g. ps, htop. Two, there are literally a handful of linux specific system calls in which the linux tid is useful.

pthread_create timing of writeback

In the call pthread_create(&id, NULL, &start_routine, arg), is the thread id guaranteed to be written to id before start_routine starts running? The manpages are clear that the start_routine may but will not necessarily begin executing before the call to pthread_create returns, but they are silent on when the thread id gets written back to the passed thread argument.
My specific case is that I have a wrapper around pthread_create:
int mk_thread(pthread_t *id) {
pthread_t tid;
pthread_create(&tid,NULL,ThreadStart,NULL);
if (id == NULL) {
pthread_detach(tid);
} else {
*id=tid;
}
}
which can obviously run the start routine before writing back. I changed it to
int mk_thread(pthread_t *id) {
pthread_t tid,*tidPtr=id?id:&tid;
pthread_create(tidPtr,NULL,ThreadStart,NULL);
if (id == NULL) {
pthread_detach(tid);
}
}
This rewrite is much more stable in practice, but is it actually a fix or just a smaller window for the race condition?
The thread id is definitely written before pthread_create returns. If you think about it, it would be impossible for pthread_create to work any other way. It could not delegate writing the thread id to the new thread, because the pthread_t variable might be out of scope by the time the new thread runs.
The relevant text is:
Upon successful completion, pthread_create() shall store the ID of the created thread in the location referenced by thread.
(From http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html) Note that it says "on successful completion" of the function, not "at an indeterminate time after successful completion".
The more interesting question, and I'm unclear on this one, is whether pthread_create must have finished writing the thread id to its destination before the new thread start function begins, i.e. whether the new thread can immediately see its own thread id, e.g. if it's to be stored in a global variable. I suspect the answer is no.
Edit: Upon rereading your question, it seems like you might really have been asking about this latter, more interesting question. In any case, there's no reason for the new thread's start function to use the thread-id written out by pthread_create. Your new thread can (and should) just use pthread_self to get its own thread id.
I believe that nothing in the spec requires pthread_create to assign its output parameter pthread_t *thread before code in start_routine begins to execute.
As a matter of practicality, the following program succeeds on many pthreads implementations (freebsd8 i386 and debian gnu/linux amd64) but fails on one of interest to me (debian/kfreebsd9 amd64):
#include <pthread.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
pthread_t th;
void *asserter(void* unused) {
pthread_t self = pthread_self(), th_=th;
/* the casts assume pthread_t is an integer type, which is not portable */
printf("th=%jd self=%jd\n", (intmax_t)th_, (intmax_t)self);
assert(pthread_equal(th_, self));
return NULL;
}
int main() {
int i;
for(i=0; i<1000; i++) {
pthread_create(&th, NULL, asserter, NULL);
pthread_join(th, NULL);
}
return 0;
}
That said, I am not sure I understand how this detail of behavior is relevant to the two code alternatives you offer in the original question. It does occur to me, though, that if pthread_create writes other values to *thread during its execution, and you're using the value of *id in the other thread, it could be relevant. The standard does not specify that no other 'intermediate' values are written to *thread during successful execution of pthread_create.

How does POSIX Threads work in linux?

I thought pthread uses clone to spawn a new thread on Linux. But if so, each thread should have its own separate pid. Otherwise, if they have the same pid, the global variables in libc would seem to be shared. However, when I ran the following program, I got the same pid but a different address for errno.
extern errno;
void*
f(void *arg)
{
printf("%u,%p\n", getpid(), &errno);
fflush(stdin);
return NULL;
}
int
main(int argc, char **argv)
{
pthread_t tid;
pthread_create(&tid, NULL, f, NULL);
printf("%u,%p\n", getpid(), &errno);
fflush(stdin);
pthread_join(tid, NULL);
return 0;
}
Then, why?
I'm not sure exactly how clone() is used when pthread_create() is called. That said, looking at the clone() man page, it looks like there is a flag called CLONE_THREAD which:
If CLONE_THREAD is set, the child is
placed in the same thread group as the
calling process. To make the remainder
of the discussion of CLONE_THREAD more
readable, the term "thread" is used to
refer to the processes within a thread
group.
Thread groups were a feature added in
Linux 2.4 to support the POSIX threads
notion of a set of threads that share
a single PID. Internally, this shared
PID is the so-called thread group
identifier (TGID) for the thread
group. Since Linux 2.4, calls to
getpid(2) return the TGID of the
caller.
It then goes on to talk about a gettid() function for getting the unique ID of an individual thread within a process. Modifying your code:
#include <stdio.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>
int errno;
void*
f(void *arg)
{
printf("%d,%p, %ld\n", (int)getpid(), (void *)&errno, (long)syscall(SYS_gettid));
fflush(stdout);
return NULL;
}
int
main(int argc, char **argv)
{
pthread_t tid;
pthread_create(&tid, NULL, f, NULL);
printf("%d,%p, %ld\n", (int)getpid(), (void *)&errno, (long)syscall(SYS_gettid));
fflush(stdout);
pthread_join(tid, NULL);
return 0;
}
(make sure to use "-lpthread"!) we can see that the individual thread id is indeed unique, while the pid remains the same.
rascher@coltrane:~$ ./a.out
4109,0x804a034, 4109
4109,0x804a034, 4110
Global variables: your mistake is that errno is not a global variable but a macro that expands to an lvalue of type int. In practice, it expands to (*__errno_location()) or similar.
getpid is a library function that returns the process id in the POSIX sense of process, not the bogus Linux per-clone pid. Nowadays Linux has the minimal kernel-level functionality necessary to make near-POSIX-compliance possible with respect to threads, but most of it still depends on ugly hacks at the userspace libc level.

Resources