pthread_t to gdb thread id - c

Does anyone know a way to go from a pthread_t to what GDB displays with info threads?
So I have:
(gdb) info threads
37 Thread 22887 0xb7704422 in __kernel_vsyscall ()
36 Thread 22926 0xb7704422 in __kernel_vsyscall ()
35 Thread 22925 0xb7704422 in __kernel_vsyscall ()
34 Thread 22924 0xb7704422 in __kernel_vsyscall ()
33 Thread 22922 0xb7704422 in __kernel_vsyscall ()
32 Thread 22921 0xb7704422 in __kernel_vsyscall ()
(gdb) p m_messageQueue->m_creationThread
$3 = 2694822768
(gdb) p/x m_messageQueue->m_creationThread
$4 = 0xa09fbb70
Does anyone know how I figure out which thread this is? It would appear to be 22768, but none of my threads go that low.

Newer versions of GDB print the value of pthread_t in the info threads output, which makes associating a pthread_t with a GDB thread number trivial.
For example, using GDB 7.0:
cat t.c
#include <pthread.h>
#include <unistd.h> /* sleep() */
void *fn(void *p)
{
    sleep(180);
    return 0; /* a function returning void * must return a value */
}
int main()
{
    pthread_t pth1, pth2;
    pthread_create(&pth1, 0, fn, 0);
    pthread_create(&pth2, 0, fn, 0);
    pthread_join(pth1, 0);
    return 0;
}
gcc -g -m32 -pthread t.c && gdb -q ./a.out
(gdb) r
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0xf7e56b90 (LWP 25343)]
[New Thread 0xf7655b90 (LWP 25344)]
Program received signal SIGINT, Interrupt.
0xffffe405 in __kernel_vsyscall ()
(gdb) info thread
3 Thread 0xf7655b90 (LWP 25344) 0xffffe405 in __kernel_vsyscall ()
2 Thread 0xf7e56b90 (LWP 25343) 0xffffe405 in __kernel_vsyscall ()
* 1 Thread 0xf7e576b0 (LWP 25338) 0xffffe405 in __kernel_vsyscall ()
(gdb) up 2
#2 0x080484e2 in main () at t.c:13
13 pthread_join(pth1, 0);
(gdb) p/x pth1
$1 = 0xf7e56b90 ## this is thread #2 above
(gdb) p/x pth2
$2 = 0xf7655b90 ## this is thread #3 above

The value of pthread_t is not the same as that thread's system-dependent thread id (on Linux, the value returned by gettid(2)), which is what you see in GDB.
AFAIK, there isn't any function to convert between the two. You need to keep track of that yourself.

Related

Segmentation Fault due to Address Space Randomization

I have a project in C which I am compiling with gcc. It sometimes crashes with a segmentation fault when run, and sometimes it doesn't. When run under gdb ($ gdb -q ./build/program), it does not show any errors. After doing a bit of research, I found this question on Stack Overflow. Setting (gdb) set disable-randomization off does allow me to see the segmentation fault in gdb. However, I have no clue what address space randomization does or where I should look to find the problem. Is there a particular tool or a particular type of construct that I should look into?
I am also including the backtrace here for more context:
(gdb) set disable-randomization off
(gdb) run -n 10 -s 10
Starting program: /path/to/code/build/exact_diag_simulation -n 10 -s 10
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Exact Diagonalization
---------------------
len: 10 nospin: 0 coupling: 0.00e+00
disorder: 1.00e+01 hopping: 1.00e+00
Starting Simulation for Exact Diagonalization...
Run 1 started...[New Thread 0x7f4e34bff6c0 (LWP 12972)]
[New Thread 0x7f4e343fe6c0 (LWP 12973)]
[New Thread 0x7f4e33bfd6c0 (LWP 12974)]
[New Thread 0x7f4e333fc6c0 (LWP 12975)]
[New Thread 0x7f4e32bfb6c0 (LWP 12976)]
[New Thread 0x7f4e323fa6c0 (LWP 12977)]
[New Thread 0x7f4e31bf96c0 (LWP 12978)]
Thread 1 "exact_diag_simu" received signal SIGSEGV, Segmentation fault.
0x00007f4e75934a02 in zgemv_n_SKYLAKEX () from /usr/lib/libblas.so.3
(gdb) bt
#0 0x00007f4e75934a02 in zgemv_n_SKYLAKEX () from /usr/lib/libblas.so.3
#1 0x00007f4e74ff550c in zgemv_ () from /usr/lib/libblas.so.3
#2 0x00007f4e76772142 in zlatrd_ () from /usr/lib/liblapack.so.3
#3 0x00007f4e766f3ba0 in zhetrd_ () from /usr/lib/liblapack.so.3
#4 0x00007f4e766ea105 in zheev_ () from /usr/lib/liblapack.so.3
#5 0x00007f4e77195e07 in LAPACKE_zheev_work () from /usr/lib/liblapacke.so.3
#6 0x00007f4e77195fba in LAPACKE_zheev () from /usr/lib/liblapacke.so.3
#7 0x00005612184a3287 in utils_get_eigh (matrix=0x7f4e3135c010, size=200, eigvals=0x561218f1a0c0)
at src/utils/utils.c:273
#8 0x00005612184a27c4 in run (params=0x7fffa9f5ffd0, create_neighbours=1, gfunc=0x7f4e74ee1010)
at src/exact_diag_simulation.c:155
#9 0x00005612184a2577 in main (argc=5, argv=0x7fffa9f60348) at src/exact_diag_simulation.c:103
I have compiled with the following flags in my makefile:
CFLAGS=-Wall -Wextra -g -fdiagnostics-color=always -fopenmp -ffast-math -fsanitize=address,undefined
LFLAGS=-llapacke -lm -lgsl -lcblas

How can I pause the execution of a pthread program, so that I can check the pid and tgid of the threads?

I am experimenting with a small program from "Distinction between processes and threads in Linux":
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <pthread.h>
void* threadMethod(void* arg)
{
    int intArg = (int) *((int*) arg);
    int32_t pid = getpid();
    uint64_t pti = pthread_self();
    printf("[Thread %d] getpid() = %d\n", intArg, pid);
    printf("[Thread %d] pthread_self() = %lu\n", intArg, pti);
    return NULL; /* a function returning void* must return a value */
}
int main()
{
    pthread_t threads[2];
    int thread1 = 1;
    if ((pthread_create(&threads[0], NULL, threadMethod, (void*) &thread1))
        != 0)
    {
        fprintf(stderr, "pthread_create: error\n");
        exit(EXIT_FAILURE);
    }
    int thread2 = 2;
    if ((pthread_create(&threads[1], NULL, threadMethod, (void*) &thread2))
        != 0)
    {
        fprintf(stderr, "pthread_create: error\n");
        exit(EXIT_FAILURE);
    }
    int32_t pid = getpid();
    uint64_t pti = pthread_self();
    printf("[Process] getpid() = %d\n", pid);
    printf("[Process] pthread_self() = %lu\n", pti);
    if ((pthread_join(threads[0], NULL)) != 0)
    {
        fprintf(stderr, "Could not join thread 1\n");
        exit(EXIT_FAILURE);
    }
    if ((pthread_join(threads[1], NULL)) != 0)
    {
        fprintf(stderr, "Could not join thread 2\n");
        exit(EXIT_FAILURE);
    }
    return 0;
}
On 64-bit Lubuntu 18.04, I compile it with the same command as in the post:
$ gcc -pthread -o thread_test thread_test.c
I also try to follow what the post says:
By using scheduler locking in gdb, I can keep the program and its threads alive so I can capture what top
but because I am not familiar with gdb, the program runs to completion without pausing (see below). I also tried to set a breakpoint with break 43, but gdb says No line 40 in the current file. What should I do to pause the execution, so that I can use top or ps to examine the threads' pid and tgid?
$ gdb thread_test
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from thread_test...(no debugging symbols found)...done.
(gdb) set scheduler-locking
Requires an argument. Valid arguments are off, on, step, replay.
(gdb) set scheduler-locking on
Target 'exec' cannot support this command.
(gdb) run
Starting program: /tmp/test/pthreads/thread_test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff77c4700 (LWP 4711)]
[New Thread 0x7ffff6fc3700 (LWP 4712)]
[Thread 1] getpid() = 4707
[Thread 1] pthread_self() = 140737345505024
[Process] getpid() = 4707
[Process] pthread_self() = 140737353951040
[Thread 0x7ffff77c4700 (LWP 4711) exited]
[Thread 2] getpid() = 4707
[Thread 2] pthread_self() = 140737337112320
[Thread 0x7ffff6fc3700 (LWP 4712) exited]
[Inferior 1 (process 4707) exited normally]
(gdb)
You have two problems:
you built your program without debugging info (add -g flag), and
you are trying to set scheduler-locking on before the program started (that doesn't work).
This should work:
gcc -g -pthread -o thread_test thread_test.c
gdb -q ./thread_test
(gdb) start
(gdb) set scheduler-locking on
However, you must be extra careful with this setting -- simply continuing from this point will get your program to block in pthread_join, as only the main thread will keep running.
The following is an example of using gdb with the posted code to pause everything.
Note: the code was first compiled, to find and fix compile problems, via:
gcc -ggdb -Wall -Wextra -Wconversion -pedantic -std=gnu11 -c untitled.c
It was then compiled and linked via:
gcc -ggdb -Wall -o untitled untitled.c -lpthread
Then, using the debugger gdb (showing my inputs and the gdb outputs):
$ gdb untitled
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from untitled...done.
(gdb) br main
Breakpoint 1 at 0x9a5: file untitled.c, line 20.
(gdb) br threadMethod
Breakpoint 2 at 0x946: file untitled.c, line 9.
(gdb) r
Starting program: /home/richard/Documents/forum/untitled
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main () at untitled.c:20
20 {
(gdb) c
Continuing.
[New Thread 0x7ffff77c4700 (LWP 8645)]
[New Thread 0x7ffff6fc3700 (LWP 8646)]
[Switching to Thread 0x7ffff77c4700 (LWP 8645)]
Thread 2 "untitled" hit Breakpoint 2, threadMethod (arg=0x7fffffffdf4c)
at untitled.c:9
9 int intArg = (int) *((int*) arg);
(gdb)
Then you can (in another terminal window) use ps etc. to display info. However, the thread function will output (to stdout) the information you might be interested in.
Or you can (in gdb) enter commands like:
(gdb) c
[Process] getpid() = 8641
[Process] pthread_self() = 140737353992000
[Switching to Thread 0x7ffff6fc3700 (LWP 8646)]
Thread 3 "untitled" hit Breakpoint 2, threadMethod (arg=0x7fffffffdf50)
at untitled.c:9
9 int intArg = (int) *((int*) arg);
(gdb) c
....
[Thread 1] getpid() = 8641
[Thread 1] pthread_self() = 140737345505024
....
[Thread 2] getpid() = 8641
[Thread 2] pthread_self() = 140737337112320

Inspecting caller frames with gdb

Suppose I have:
#include <stdlib.h>
int main()
{
int a = 2, b = 3;
if (a!=b)
abort();
}
Compiled with:
gcc -g c.c
Running this, I'll get a coredump (due to the SIGABRT raised by abort()), which I can debug with:
gdb a.out core
How can I get gdb to print the values of a and b from this context?
Here's another way to get the values of a and b specifically: move to the frame of interest, and info locals will then give you the values.
a.out was compiled from your code (frame 2, i.e. main(), is the one you are interested in).
$ gdb ./a.out core
[ removed some not-so-interesting info here ]
Reading symbols from ./a.out...done.
[New LWP 14732]
Core was generated by `./a.out'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007fac16269f5d in __GI_abort () at abort.c:90
#2 0x00005592862f266d in main () at f.c:7
(gdb) frame 2
#2 0x00005592862f266d in main () at f.c:7
7 abort();
(gdb) info locals
a = 2
b = 3
(gdb) q
You can also use print once you are in frame 2:
(gdb) print a
$1 = 2
(gdb) print b
$2 = 3
Did you compile with debug symbols (-g)? The command to use is bt, for backtrace; you can also use bt full for a full backtrace.
More info: https://sourceware.org/gdb/onlinedocs/gdb/Backtrace.html

gdb: thread debugging will not be available?

I'm using gdb-7.11.1 and I get this message on my embedded powerpc system. Some more background: the libpthread I use has been stripped of all non-dynamic symbols, including nptl_version, which libthread_db uses to make sure it is compatible with libpthread.
Coming to my problem, gdb says it won't be able to debug threads, but it seemingly can as evidenced below. Am I simply misunderstanding what 'thread debugging' means? (The ?? you see are naturally due to the missing symbol table in libpthread)
(gdb) break fn2
Breakpoint 1 at 0x1000052c: file test.c, line 7.
(gdb) run
Starting program: /tmp/test
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
[New LWP 21312]
[New LWP 21313]
[New LWP 21314]
[New LWP 21315]
[New LWP 21316]
[New LWP 21317]
[Switching to LWP 21315]
Thread 5 hit Breakpoint 1, fn2 () at test.c:7
7 test.c: No such file or directory.
(gdb) thread apply all bt
Thread 7 (LWP 21317):
#0 0x0fdcf030 in ?? () from /lib/libpthread.so.0
#1 0x0fdc892c in pthread_mutex_lock () from /lib/libpthread.so.0
#2 0x00000000 in ?? ()
Thread 6 (LWP 21316):
#0 0x0fdcf030 in ?? () from /lib/libpthread.so.0
#1 0x0fdc892c in pthread_mutex_lock () from /lib/libpthread.so.0
#2 0x00000000 in ?? ()
Thread 5 (LWP 21315):
#0 fn2 () at test.c:7
#1 0x0fdc6d8c in ?? () from /lib/libpthread.so.0
#2 0x0fd26074 in clone () from /lib/libc.so.6
Thread 4 (LWP 21314):
#0 0x0fdcf030 in ?? () from /lib/libpthread.so.0
#1 0x0fdc892c in pthread_mutex_lock () from /lib/libpthread.so.0
#2 0x00000000 in ?? ()
Thread 3 (LWP 21313):
#0 0x0fdcf030 in ?? () from /lib/libpthread.so.0
#1 0x0fdc892c in pthread_mutex_lock () from /lib/libpthread.so.0
#2 0x00000000 in ?? ()
Thread 2 (LWP 21312):
#0 0x0fdcefdc in ?? () from /lib/libpthread.so.0
#1 0x0fdc892c in pthread_mutex_lock () from /lib/libpthread.so.0
#2 0x00000000 in ?? ()
Thread 1 (LWP 21309):
#0 0x0fd26038 in clone () from /lib/libc.so.6
#1 0x0fdc5f2c in ?? () from /lib/libpthread.so.0
#2 0x0fde6150 in ?? () from /lib/libpthread.so.0
#3 0x0fdc6424 in pthread_create () from /lib/libpthread.so.0
#4 0x100006a4 in main () at test.c:23
(gdb)
On Linux (at least, and on some other systems), an important part of the threading library is implemented in the kernel: the "kernel threads", called LWPs (for light-weight processes).
GDB doesn't need libthread_db's help to track them, as the OS itself can provide the key information about them: their CPU registers (mainly IP, SP, FP).
I'm not sure what libthread_db provides in that context. The only thing I can think of is the LWP <-> Thread id mapping:
* 3 Thread 0x7ffff6d19700 (LWP 21571) "erato" primes_computer_runner2 (param=0x7fffffffca50) at erato.c:46
1 Thread 0x7ffff7fad700 (LWP 21565) "erato" 0x00007ffff7bc568d in pthread_join () from /usr/lib/libpthread.so.0
(gdb) print/x thread_handle
$1 = 0x7ffff6d19700
See, Thread 0x7ffff7fad700 maps to LWP 21565.
In comparison, without libthread_db it just gives the LWP id (in another run):
* 3 LWP 22060 "erato" primes_computer_runner2 (param=0x7fffffffca50) at erato.c:46
1 LWP 22058 "erato" 0x00007ffff76037b1 in clone () from /usr/lib/libc.so.6
If you want further details about the purpose of libthread_db, and why it (or something equivalent) is mandatory for user-level and hybrid threading libraries, you can take a look at this article I wrote several years ago:
User Level DB: a Debugging API for User-Level Thread Libraries
The common cause of this error message:
Unable to find libthread_db matching inferior's thread library, ...
is having a libpthread.so.0 that is fully stripped. Don't do that.
In particular, libthread_db.so needs the (local) nptl_version symbol. You can verify whether your libpthread.so.0 has it with:
nm /path/to/libpthread.so.0 | grep version
which should produce something like:
0000000000012cc6 r nptl_version

libspotify crashes at logout. Possible leak of threads?

I've been messing around with libspotify for a while now but I get a segmentation fault every once in a while when I call the sp_session_logout() API.
I did a test that does the following loop:
Create a session
Login
Logout
Release the session
It usually crashes after 3-4 iterations.
The code and the stdout can be found here:
https://gist.github.com/jonas-lundqvist/4bb759a1f1ed09de46a1
Compiled with:
gcc -std=gnu99 -g -Wall `pkg-config --cflags libspotify` -o test_login_logout login_logout.c appkey.c cred.c -lpthread `pkg-config --libs libspotify`
I noticed, however, that two threads were created at the start of each iteration, but only one exited.
This is the output from gdb:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7fbf700 (LWP 31308)]
[New Thread 0x7ffff7f3e700 (LWP 31309)]
[Thread 0x7ffff7f3e700 (LWP 31309) exited]
[New Thread 0x7ffff7f3e700 (LWP 31310)]
[New Thread 0x7ffff7ebd700 (LWP 31311)]
[Thread 0x7ffff7ebd700 (LWP 31311) exited]
[New Thread 0x7ffff7ebd700 (LWP 31313)]
[New Thread 0x7ffff6638700 (LWP 31314)]
[Thread 0x7ffff6638700 (LWP 31314) exited]
[New Thread 0x7ffff6638700 (LWP 31315)]
[New Thread 0x7ffff65b7700 (LWP 31316)]
Program received signal SIGSEGV, Segmentation fault.
Inspecting the threads I can see:
Id Target Id Frame
9 Thread 0x7ffff65b7700 (LWP 31316) "Network Thread" 0x00007ffff744b50d in poll () at ../sysdeps/unix/syscall-template.S:81
8 Thread 0x7ffff6638700 (LWP 31315) "Dns Thread" 0x00007ffff744b50d in poll () at ../sysdeps/unix/syscall-template.S:81
6 Thread 0x7ffff7ebd700 (LWP 31313) "Dns Thread" sem_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
4 Thread 0x7ffff7f3e700 (LWP 31310) "Dns Thread" sem_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
2 Thread 0x7ffff7fbf700 (LWP 31308) "Dns Thread" sem_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
* 1 Thread 0x7ffff7fc1700 (LWP 31304) "test_login_logo" 0x00007ffff77eb5ac in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
And if I pick one of the threads created in one of the iterations, I can see this:
[Switching to thread 4 (Thread 0x7ffff7f3e700 (LWP 31310))]
#0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
85 ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S: No such file or directory.
#0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1 0x00007ffff780c1f8 in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#2 0x00007ffff77c608d in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#3 0x00007ffff7bc70a4 in start_thread (arg=0x7ffff7f3e700) at pthread_create.c:309
#4 0x00007ffff745404d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Unfortunately, libspotify development seems quite dead, but does anybody know if there is a workaround?
I need to be able to do multiple sp_session_create()/sp_session_release() in my application...
Edit: bt of the crashing thread (from another, yet similar, run)
#0 0x00007ffff77eb5ac in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#1 0x00007ffff7860b5a in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#2 0x00007ffff781683f in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#3 0x00007ffff7817141 in ?? () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#4 0x00007ffff78add7d in sp_session_logout () from /home/jonas/lib/libspotify/lib/libspotify.so.12
#5 0x0000000000400d51 in main (argc=1, argv=0x7fffffffe3a8) at login_logout.c:99
