Recently I've been learning about pthreads, and I started wondering how gdb knows that I've created a new thread. I wrote the test code below and started gdb. I stepped into pthread_create(), but instead of letting it return normally, I used gdb's return 0 to force it to return early. gdb still showed that I had only one thread. At first I thought gdb got thread information from the return value of pthread_create(). Then I thought gdb might also use child-process information to get thread info, so I edited my test code to call fork(). But the result wasn't what I expected.
So how does gdb get thread info? What information does it need to know how many threads the process has and which thread I'm on?
Code
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <pthread.h>

void *foo(void *bar) {
    while (1) {
        /* pthread_t is an opaque type; casting it to unsigned long works on Linux */
        printf("hello from thread: %lu\n", (unsigned long)pthread_self());
        sleep(2);
    }
}

int main() {
    /* Newline added so the buffered text is flushed before fork() */
    printf("Before fake pthread_create\n");
    pid_t pid;
    if ((pid = fork()) == -1) {
        perror("fork error");
        exit(errno);
    }
    if (pid == 0) {
        /* Child process: just idle */
        while (1) {
            sleep(3);
        }
    }
    if (pid > 0) {
        /* Parent process: start one extra thread */
        pthread_t thread;
        pthread_create(&thread, NULL, foo, NULL);
        while (1) {
            printf("hello from thread: %lu\n", (unsigned long)pthread_self());
            sleep(2);
        }
        return 0;
    }
}
How does gdb detect pthread?
GDB sets an internal breakpoint on _dl_debug_state, which allows it to track which shared libraries are loaded (this is necessary for debugging shared libraries).
When it observes that libpthread.so is loaded, it loads libthread_db.so.1 into its own process space (into GDB itself, not into the program being debugged), and asks that library to notify GDB when new threads are created and when they are destroyed. Documentation.
The libthread_db has intimate knowledge of the internals of libpthread, and installs appropriate hooks to achieve such notification.
There are 2 main mechanisms that debuggers use on Linux, and neither of them is very pretty. There is so much detail here that I can only point you there and hope.
One is ptrace, which allows the debugger to follow along as the program does things, such as making system calls (pthread_create itself is a library function; on Linux it creates the thread with the clone system call), or as specific events happen, such as new threads starting, and to control the monitored program: http://man7.org/linux/man-pages/man2/ptrace.2.html
The other is the /proc/ file system, which reveals lots of information about a process: http://man7.org/linux/man-pages/man5/proc.5.html
In particular, ls -l /proc/self/task shows you what threads ls has (only 1).
Related
In my destructor I want to destroy a thread cleanly.
My goal is to wait for a thread to finish executing and THEN destroy the thread.
The only thing I found about querying the state of a pthread is pthread_attr_setdetachstate but this only tells you if your thread is:
PTHREAD_CREATE_DETACHED
PTHREAD_CREATE_JOINABLE
Both of those have nothing to do with whether the thread is still running or not.
How do you query a pthread to see if it is still running?
It sounds like you have two questions here:
How can I wait until my thread completes?
Answer: This is directly supported by pthreads -- make your thread-to-be-stopped JOINABLE (when it is first started), and use pthread_join() to block your current thread until the thread-to-be-stopped is no longer running.
How can I tell if my thread is still running?
Answer: You can add a "thread_complete" flag to do the trick:
Scenario: Thread A wants to know if Thread B is still alive.
When Thread B is created, it is given a pointer to the "thread_complete" flag address. The "thread_complete" flag should be initialized to NOT_COMPLETED before the thread is created. Thread B's entry point function should immediately call pthread_cleanup_push() to push a "cleanup handler" which sets the "thread_complete" flag to COMPLETED.
See details about cleanup handlers here: pthread cleanup handlers
You'll want to include a corresponding pthread_cleanup_pop(1) call to ensure that the cleanup handler gets called no matter what (i.e. if the thread exits normally OR due to cancellation, etc.).
Then, Thread A can simply check the "thread_complete" flag to see if Thread B has exited yet.
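NOTE: Your "thread_complete" flag should be declared "volatile" and should be an atomic type; sig_atomic_t from <signal.h> is commonly used for this purpose. The intent is to give the two threads consistent access to the same data without the need for synchronization constructs (mutexes/semaphores). Strictly speaking, neither volatile nor sig_atomic_t guarantees visibility between threads; where C11 is available, an atomic type from <stdatomic.h> is the safer choice.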
pthread_kill(tid, 0);
No signal is sent, but error checking is still performed, so you can use that to check for the existence of tid.
CAUTION: This answer is incorrect. The standard specifically prohibits passing the ID of a thread whose lifetime has ended. That ID might now specify a different thread or, worse, it might refer to memory that has been freed, causing a crash.
I think all you really need is to call pthread_join(). That call won't return until the thread has exited.
If you only want to poll to see whether the thread is still running or not (and note that is usually not what you should be wanting to do!), you could have the thread set a volatile boolean to false just before it exits... then your main-thread could read the boolean and if it's still true, you know the thread is still running. (if it's false, on the other hand, you know the thread is at least almost gone; it may still be running cleanup code that occurs after it sets the boolean to false, though, so even in this case you should still call pthread_join before trying to free any resources the thread might have access to)
There is no fully portable solution. Check whether your platform supports pthread_tryjoin_np or pthread_timedjoin_np; with those you can simply check whether the thread can be joined (provided it was created with PTHREAD_CREATE_JOINABLE).
Let me comment on the "winning" answer, which has a huge hidden flaw that can lead to crashes in some contexts. Unless you use pthread_join, the problem will come up again and again. Assume you have a process and a shared library; call the library lib.so.
1. You dlopen it and start a thread in it. Assume you don't want to join it, so you make it detached.
2. The process and the shared library's logic do their work, etc.
3. You want to unload lib.so because you don't need it any more.
4. You request a shutdown of the thread, planning to read a flag afterwards from lib.so's thread that says it has finished.
5. You continue on another thread with dlclose, because you have seen that the flag now shows the thread as "finished".
6. dlclose unloads all stack- and code-related memory.
Whoops, dlclose does not stop threads. Even when you are on the last line of the cleanup handler that sets the "thread is finished" volatile atomic flag variable, you still have to return from a lot of functions on the stack, hand back values, etc. If the thread running steps 5 and 6 was given a high priority, dlclose can run before the library thread has REALLY stopped. You will get some nice crashes now and then.
Let me point out that this is not a hypothetical problem; I had the same issue on our project.
I believe I've come up with a solution that at least works on Linux. Whenever I create a thread, I have it save its LWP (Light Weight Process) ID and assign itself a unique name, e.g.
int lwp = syscall(SYS_gettid);
prctl(PR_SET_NAME, (long)"unique name", 0, 0, 0);
Then, to check later whether the thread exists, I open /proc/<pid>/task/<lwp>/comm and read it. If the file exists and its contents match the unique name I assigned, the thread exists. Note that this does NOT pass a possibly defunct/reused TID to any library function, so no crashes.
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <pthread.h>
#include <sys/prctl.h>
#include <sys/file.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

pthread_t subthread_tid;
int subthread_lwp;
#define UNIQUE_NAME "unique name"

bool thread_exists (pthread_t thread_id)
{
    char path[100];
    char thread_name[16];
    FILE *fp;
    bool exists = false;

    (void)thread_id;  // unused: the lookup goes through the saved LWP, never the pthread_t

    // If the /proc/<pid>/task/<lwp>/comm file exists and its contents match the "unique name", the
    // thread exists, and it's the original thread (TID has NOT been reused).
    snprintf(path, sizeof path, "/proc/%d/task/%d/comm", (int)getpid(), subthread_lwp);
    fp = fopen(path, "r");
    if( fp != NULL ) {
        if( fgets(thread_name, sizeof thread_name, fp) != NULL ) {
            // Trim off the trailing newline, if there is one
            thread_name[strcspn(thread_name, "\n")] = '\0';
            if( strcmp(UNIQUE_NAME, thread_name) == 0 ) {
                exists = true;
            }
        }
        fclose(fp);
    }
    if( exists ) {
        printf("thread exists\n");
    } else {
        printf("thread does NOT exist\n");
    }
    return exists;
}

void *subthread (void *unused)
{
    // Record the kernel thread ID and name this thread so it can be recognized later
    subthread_lwp = syscall(SYS_gettid);
    prctl(PR_SET_NAME, (long)UNIQUE_NAME, 0, 0, 0);
    sleep(10000);
    return NULL;
}

int main (int argc, char *argv[], char *envp[])
{
    int error_number;

    pthread_create(&subthread_tid, NULL, subthread, NULL);
    printf("pthread_create()\n");
    sleep(1);
    thread_exists(subthread_tid);

    pthread_cancel(subthread_tid);
    printf("pthread_cancel()\n");
    sleep(1);
    thread_exists(subthread_tid);

    error_number = pthread_join(subthread_tid, NULL);
    if( error_number == 0 ) {
        printf("pthread_join() successful\n");
    } else {
        printf("pthread_join() failed, %d\n", error_number);
    }
    thread_exists(subthread_tid);
    exit(0);
}
#include <string.h>
#include <stdio.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

void* thread1 (void* arg);
void* thread2 (void* arg);

int main()
{
    pthread_t thr_id;
    pthread_create(&thr_id, NULL, thread1, NULL);
    sleep(10);
}

void* thread1 (void* arg)
{
    pthread_t thr_id = 0;
    pthread_create(&thr_id, NULL, thread2, NULL);
    sleep(5);
    int ret = 0;
    /* thread2 has not been joined yet, so its ID is still valid here */
    if( (ret = pthread_kill(thr_id, 0)) == 0)
    {
        printf("still running\n");
        pthread_join(thr_id, NULL);
    }
    else
    {
        printf("RIP Thread = %d\n",ret);
    }
    return NULL;
}

void* thread2 (void* arg)
{
    // sleep(5);
    printf("I am done\n");
    return NULL;
}
I have the following C source:
#define _MULTI_THREADED
#include <pthread.h>
#include <stdio.h>

void* threadfunc(void* parm){
    printf("Hello thread.\n");
    pthread_exit(NULL);
}

int main(int argc, char* argv[]){
    pthread_t t;
    int rc;
    rc = pthread_create(&t, NULL, threadfunc, NULL);
    printf("Create return code: %i\n", rc);
    if(!rc){
        pthread_join(t, NULL);
    }
    return 0;
}
Compiled with crtbndc pgm(test) srcfile(myfile) srcmbr(test)
When called with call test, I get the output:
Create return code: 3029
What does this error code mean?
According to IBM i documentation, pthreads doesn't seem to be supported:
Thread creation (pthread_create()) fails with EBUSY or 3029
Because many parts of the operating system are not yet thread safe,
not every job can start threads. The pthread_create() API fails with
the EBUSY error when the process is not allowed to create threads. See
Running threaded programs for information about how to start a job
that can create threads.
And it suggests a few alternatives.
Error return codes can be interpreted most easily by looking at the message description for the related message ID. Use the prefix 'CPE' with the character return code '3029'. So for this one, use this command:
DSPMSGD CPE3029
In this case, the 1st-level text is "Resource busy." This likely refers to the display file/device that is already in active use and is allocated to the job's primary thread (assuming the CALL was made in an interactive job).
In a program, you might review the Checking the Errno Value topic in the ILE C/C++ Programmer's Guide. The ERRNO member in the H source file in library QSYSINC should also be reviewed.
Also, a table of Errno Values for UNIX-Type Functions is in the Knowledge Center.
The child thread in this code blocks the shell, even after the main process exits. How do I make it run in the background and not block the shell? I see it is possible with fork(), but I do not want to create a whole new process.
Thank you.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>   /* for sleep() */

void * myThreadFun (void *vargp)
{
    while (1)
    {
        //Do useful work continuously
        sleep (1);
    }
}

int
main ()
{
    pthread_t tid;
    pthread_create (&tid, NULL, myThreadFun, NULL);
    pthread_detach (tid);
    printf ("After Thread\n");
    pthread_exit (0);
}
In a multi-threaded program, there is no way for the process to exit and leave spawned threads running. Calling pthread_exit() from main() ends only the main thread; the process stays alive until its last thread finishes, so the shell keeps waiting for it. If you need this program to continue running when you execute it from a shell but immediately return to a shell prompt, i.e., run in the background, you will have to use fork().
Calling tzset() after forking appears to be very slow. I only see the slowness if I first call tzset() in the parent process before forking. My TZ environment variable is not set. I dtruss'd my test program and it revealed the child process reads /etc/localtime for every tzset() invocation, while the parent process only reads it once. This file access seems to be the source of the slowness, but I wasn't able to determine why it's accessing it every time in the child process.
Here is my test program foo.c:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>   /* for wait() */
#include <time.h>       /* for tzset() */
#include <unistd.h>

void check(char *msg);

int main(int argc, char **argv) {
    check("before");
    pid_t c = fork();
    if (c == 0) {
        check("fork");
        exit(0);
    }
    wait(NULL);
    check("after");
}

/* Time 10000 calls to tzset() and print the elapsed wall-clock time */
void check(char *msg) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    time_t start = tv.tv_sec;
    suseconds_t mstart = tv.tv_usec;
    for (int i = 0; i < 10000; i++) {
        tzset();
    }
    gettimeofday(&tv, NULL);
    double delta = (double)(tv.tv_sec - start);
    delta += (double)(tv.tv_usec - mstart)/1000000.0;
    printf("%s took: %fs\n", msg, delta);
}
I compiled and executed foo.c like this:
[muir@muir-work-mb scratch]$ clang -o foo foo.c
[muir@muir-work-mb scratch]$ env -i ./foo
before took: 0.002135s
fork took: 1.122254s
after took: 0.001120s
I'm running Mac OS X 10.10.1 (also reproduced on 10.9.5).
I originally noticed the slowness via ruby (Time#localtime slow in child process).
Ken Thomases's response may be correct, but I was curious about a more specific answer because I still find the slowness unexpected behavior for a single-threaded program performing such a simple/common operation after forking. After examining http://opensource.apple.com/source/Libc/Libc-997.1.1/stdtime/FreeBSD/localtime.c (not 100% sure this is the correct source), I think I have an answer.
The code uses passive notifications to determine whether the time zone has changed (as opposed to calling stat() on /etc/localtime every time). It appears that the registered notification token becomes invalid in the child process after forking. Furthermore, the code treats the error from using an invalid token as a positive notification that the time zone has changed, and so it reads /etc/localtime on every call. I guess this is the kind of undefined behavior you can get after forking? It would be nice if the library noticed the error and re-registered for the notification, though.
Here is the snippet of code from localtime.c that mixes the error value with the status value:
nstat = notify_check(p->token, &ncheck);
if (nstat || ncheck) {
I demonstrated that the registration token becomes invalid after fork using this program:
#include <notify.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>   /* for wait() */
#include <unistd.h>     /* for fork() */

void bail(char *msg) {
    printf("Error: %s\n", msg);
    exit(1);
}

int main(int argc, char **argv) {
    int token, something_changed, ret;
    notify_register_check("com.apple.system.timezone", &token);

    /* The first check after registration reports a change */
    ret = notify_check(token, &something_changed);
    if (ret)
        bail("notify_check #1 failed");
    if (!something_changed)
        bail("expected change on first call");

    /* The second check reports no change */
    ret = notify_check(token, &something_changed);
    if (ret)
        bail("notify_check #2 failed");
    if (something_changed)
        bail("expected no change");

    pid_t c = fork();
    if (c == 0) {
        /* In the child, the same token is no longer usable */
        ret = notify_check(token, &something_changed);
        if (ret) {
            if (ret == NOTIFY_STATUS_INVALID_TOKEN)
                printf("ret is invalid token\n");
            if (!notify_is_valid_token(token))
                printf("token is not valid\n");
            bail("notify_check in fork failed");
        }
        if (something_changed)
            bail("expected not changed");
        exit(0);
    }
    wait(NULL);
}
And ran it like this:
muir-mb:projects muir$ clang -o notify_test notify_test.c
muir-mb:projects muir$ ./notify_test
ret is invalid token
token is not valid
Error: notify_check in fork failed
You're lucky you didn't experience nasal demons!
POSIX states that only async-signal-safe functions are legal to call in the child process after the fork() and before a call to an exec*() function. From the standard (emphasis added):
… the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
…
There are two reasons why POSIX programmers call fork(). One reason is
to create a new thread of control within the same program (which was
originally only possible in POSIX by creating a new process); the
other is to create a new process running a different program. In the
latter case, the call to fork() is soon followed by a call to one of
the exec functions.
The general problem with making fork() work in a multi-threaded world
is what to do with all of the threads. There are two alternatives. One
is to copy all of the threads into the new process. This causes the
programmer or implementation to deal with threads that are suspended
on system calls or that might be about to execute system calls that
should not be executed in the new process. The other alternative is to
copy only the thread that calls fork(). This creates the difficulty
that the state of process-local resources is usually held in process
memory. If a thread that is not calling fork() holds a resource, that
resource is never released in the child process because the thread
whose job it is to release the resource does not exist in the child
process.
When a programmer is writing a multi-threaded program, the first
described use of fork(), creating new threads in the same program, is
provided by the pthread_create() function. The fork() function is thus
used only to run new programs, and the effects of calling functions
that require certain resources between the call to fork() and the call
to an exec function are undefined.
There are lists of async-signal-safe functions here and here. For any other function, if it's not specifically documented that the implementations on the platforms to which you're deploying add a non-standard safety guarantee, then you must consider it unsafe and its behavior on the child side of a fork() to be undefined.