I want to debug a C function in an R package using R -d gdb. After setting a breakpoint at the C function C_MIM(), I get the output below, along with "Cannot find bounds of current function", so I cannot print any variable values. Am I doing something wrong, or is it simply not possible to debug some R packages this way?
Breakpoint 1, 0x00007fffdee0035f in C_MIM ()
from /home/sunxd/R/x86_64-pc-linux-gnu-library/3.4/praznik/libs/praznik.so
(gdb) list
76 in ../sysdeps/unix/syscall-template.S
(gdb) n
Single stepping until exit from function C_MIM,
which has no line number information.
^C
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7fffdddfa700 (LWP 21179)]
---Type <return> to continue, or q <return> to quit---
0x00007ffff45c707e in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
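It turns out that one must have the source code and compile the R package with compiler options that emit debug information (for example -g, ideally with optimization turned off via -O0). Without them, gdb has no line number information for the function, as the session above shows.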
Related
I'm trying to debug a hard fault in a C++ firmware project for the microbit v1.5.
The issue is that after a hard fault I would like to reset the microcontroller and start anew, but issuing the dreaded monitor reset halt does not work: execution never restarts properly after a hard fault.
I'm using pyocd (v0.33.1) as my gdb debug server and a custom-built gdb (v8.2.1) with proper support for the nrf51 series.
This is an example interaction with gdb. I set a breakpoint on HardFault_Handler and start execution. The firmware correctly spawns tasks, but eventually one of the tasks faults and the HardFault handler gets called. After this I would like to reset the microcontroller and start anew.
I expect the microcontroller to spawn the same set of tasks, but this never happens, and it never goes back to main either, so I think there must be a specific way to reset it correctly.
What command should I issue to reset the flow of execution to start with main or one of the routines from gcc_startup?
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x000290e2 ../support/libs/nrfx/mdk/gcc_startup_nrf51.S:234
(gdb) c
Continuing.
[New Thread 2]
[New Thread 536884080]
[New Thread 536880760]
[New Thread 536884152]
Thread 2 "Handler mode" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2]
0x000006b0 in ?? ()
(gdb) info threads
Id Target Id Frame
* 2 Thread 2 "Handler mode" (HardFault) 0x000006b0 in ?? ()
3 Thread 536884080 "IDL" (Ready; Priority 0) prvIdleTask (pvParameters=0x0)
at ../support/freertos/tasks.c:3225
4 Thread 536880760 "KNL" (Ready; Priority 1) starlight::sys::Task::<lambda(void*)>::_FUN(void *)
() at ../include/starlight/sys/task.hpp:154
5 Thread 536884152 "Tmr" (Running; Priority 2) __DSB ()
at ../support/libs/CMSIS-Core/Include/cmsis_gcc.h:946
(gdb) monitor reset halt
Resetting target with halt
Successfully halted device on reset
(gdb) c
Continuing.
[New Thread 1]
Thread 6 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1]
0x000006b0 in ?? ()
(gdb) info threads
Id Target Id Frame
* 6 Thread 1 (HardFault) 0x000006b0 in ?? ()
(gdb) monitor reset halt
Resetting target with halt
Successfully halted device on reset
(gdb) c
Continuing.
Thread 6 received signal SIGSEGV, Segmentation fault.
0x000006b0 in ?? ()
(gdb) backtrace
#0 0x000006b0 in ?? ()
#1 <signal handler called>
Backtrace stopped: Cannot access memory at address 0x4b0547f8
I have attached gdb to a program with over 300 threads. I need to identify one particular thread: its call stack has a frame containing a variable whose value I want to match against. Can I script this in gdb?
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f16c1eeb700 (LWP 18833))]
#4 0x00007f17f3a3bdd5 in start_thread () from /lib64/libpthread.so.0
(gdb) backtrace
#0 0x00007f17f3a3fd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f17e72838be in __afr_shd_healer_wait (healer=healer@entry=0x7f17e05203d0) at afr-self-heald.c:101
#2 0x00007f17e728392d in afr_shd_healer_wait (healer=healer@entry=0x7f17e05203d0) at afr-self-heald.c:125
#3 0x00007f17e72848e8 in afr_shd_index_healer (data=0x7f17e05203d0) at afr-self-heald.c:572
#4 0x00007f17f3a3bdd5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f17f3302ead in clone () from /lib64/libc.so.6
(gdb) frame 3
#3 0x00007f17e72848e8 in afr_shd_index_healer (data=0x7f17e05203d0) at afr-self-heald.c:572
572 afr_shd_healer_wait (healer);
(gdb) p this->name
$6 = 0x7f17e031b910 "testvol-replicate-0"
For example, can I run a macro that loops over each thread, goes to frame 3 in each of them, inspects the variable this->name, and prints the thread number only if the value matches testvol-replicate-0?
It's possible to script GDB with Python. With the GDB Python API, you can loop over the threads and search for a match. Below are two articles on debugging threads with GDB and Python:
https://www.linuxjournal.com/article/11027
https://fy.blackhats.net.au/blog/html/2017/08/04/so_you_want_to_script_gdb_with_python.html
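As a rough sketch of what such a script might look like, the following loops over the threads of the selected inferior, walks to frame 3, evaluates this->name there, and collects the thread numbers that match. The file name find_thread.py, the helper name matching_threads, and the three constants are made up for illustration; the sketch also assumes that this->name is a NUL-terminated C string and that frame 3 is the interesting frame in every thread, as in the question.

```python
# find_thread.py (hypothetical name): load inside GDB with `source find_thread.py`
try:
    import gdb  # only available inside GDB's embedded Python interpreter
except ImportError:
    gdb = None  # allows importing this file outside GDB, e.g. for testing

FRAME_LEVEL = 3                 # frame number taken from the question
EXPRESSION = "this->name"       # expression evaluated in that frame
WANTED = "testvol-replicate-0"  # value to match


def matching_threads(api, level, expression, wanted):
    """Return the GDB thread numbers whose frame `level` evaluates
    `expression` to the C string `wanted`. `api` is the gdb module."""
    matches = []
    original = api.selected_thread()  # remember the user's current thread
    try:
        for thread in api.selected_inferior().threads():
            thread.switch()
            try:
                frame = api.newest_frame()
                for _ in range(level):
                    if frame is None:
                        break
                    frame = frame.older()
                if frame is None:
                    continue  # this stack is shallower than `level`
                frame.select()
                value = api.parse_and_eval(expression).string()
            except (api.error, UnicodeDecodeError):
                continue  # no such symbol in this frame, unreadable memory, ...
            if value == wanted:
                matches.append(thread.num)
    finally:
        if original is not None:
            original.switch()  # restore the originally selected thread
    return matches


if gdb is not None:
    for num in matching_threads(gdb, FRAME_LEVEL, EXPRESSION, WANTED):
        print("match: thread %d" % num)
```

Passing the gdb module in as a parameter is only a design convenience: it keeps the search logic importable outside GDB. Threads in which the expression cannot be evaluated are silently skipped, which may or may not be what you want.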
Consider the following program.
#include <unistd.h>
int main(){
    sleep(1000);
}
If we run strace on this program, the last line that appears before the long sleep is the following.
nanosleep({1000, 0},
While the program is asleep, the code is executing (likely blocked) inside the OS kernel.
When I run the program under gdb, if I send SIGINT in the middle of the sleep, I can collect various information about the main thread, such as its backtrace and various register values.
Is there some expression in gdb that evaluates to true iff the thread must cross a syscall boundary before executing code in userspace again?
Ideally, there would be a cross-platform solution, but platform-specific solutions are also useful.
Clarification: I do not care whether the thread is actually executing; only whether its most recent program counter value was in kernel code or user code.
Put another way, can gdb tell us whether a particular thread has entered the kernel but not yet exited the kernel?
Is there some expression in gdb that evaluates to true if the
thread must cross a syscall boundary before executing code in
userspace again?
You can try catch syscall nanosleep; see the GDB documentation.
catch syscall nanosleep stops on two events: the call to the system call and the return from it. You can use info breakpoints to see how many times this catchpoint has been hit. If the count is even, you should be in user space; if it is odd, you should be in kernel space:
$ gdb -q a.out
Reading symbols from a.out...done.
(gdb) catch syscall nanosleep
Catchpoint 1 (syscall 'nanosleep' [35])
(gdb) i b
Num Type Disp Enb Address What
1 catchpoint keep y syscall "nanosleep"
(gdb) r
Starting program: /home/ks1322/a.out
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.27-8.fc28.x86_64
Catchpoint 1 (call to syscall nanosleep), 0x00007ffff7adeb54 in nanosleep () from /lib64/libc.so.6
(gdb) i b
Num Type Disp Enb Address What
1 catchpoint keep y syscall "nanosleep"
catchpoint already hit 1 time
(gdb) c
Continuing.
Catchpoint 1 (returned from syscall nanosleep), 0x00007ffff7adeb54 in nanosleep () from /lib64/libc.so.6
(gdb) i b
Num Type Disp Enb Address What
1 catchpoint keep y syscall "nanosleep"
catchpoint already hit 2 times
(gdb) c
Continuing.
[Inferior 1 (process 19515) exited normally]
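The parity check in this answer could be automated along the following lines. This is only a sketch: the helper names catchpoint_hits and probably_in_syscall are invented here, and the parsing assumes that info breakpoints prints a line of the form "catchpoint already hit N time(s)" after the catchpoint's row, as in the session above. Like the manual method, it only covers the syscall being caught and only works if the catchpoint was installed before the first call.

```python
import re


def catchpoint_hits(info_text, number):
    """Parse the text printed by `info breakpoints` and return the hit
    count of entry `number` (0 if no hit-count line is shown for it)."""
    lines = info_text.splitlines()
    for index, line in enumerate(lines):
        fields = line.split()
        if not fields or fields[0] != str(number):
            continue
        # The hit-count line, when present, follows the entry's own row.
        for follow in lines[index + 1:]:
            if re.match(r"\s*\d+(\.\d+)?\s", follow):
                break  # the next numbered entry starts here
            found = re.search(r"already hit (\d+) times?", follow)
            if found:
                return int(found.group(1))
        return 0
    raise ValueError("no breakpoint numbered %s" % number)


def probably_in_syscall(hits):
    """Odd count: the last stop was the syscall entry, so the thread has
    not returned to user space yet. Even count: it has returned."""
    return hits % 2 == 1
```

Inside GDB's Python interpreter the text could be obtained with gdb.execute("info breakpoints", to_string=True) and passed to catchpoint_hits together with the catchpoint number (1 in the session above).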
I'm experimenting with GDB and playing around with registers, but I ran into a problem when combining a call to gettimeofday() with a watchpoint on a register.
First, let me show a small example of what I am doing.
Here is the code I am using (very simple):
#include <stdio.h>
main()
{
    int num;
    getchar();
    num=190320;
    printf("value: %d\n", num);
}
I run the program (which stops at the getchar() function until I press Enter) and then attach gdb to it from another shell:
gdb -p <pid>
Now I add a conditional watchpoint on the "rdi" register so that I can check the state of the program when the variable "num" is assigned:
(gdb) watch $rdi == 190320
Watchpoint 1: $rdi == 190320
Then I continue execution in gdb and press Enter in the other shell, where the program is running. As you can see, gdb stops the program at the watchpoint, just as I wanted:
(gdb) c
Continuing.
Watchpoint 1: $rdi == 190320
This version works as I expect: a simple application that runs fine and a watchpoint that stops at the right moment.
Now to the problem itself.
This is the same program as before, except that it calls gettimeofday() before the variable assignment:
#include <stdio.h>
#include <sys/time.h>
main()
{
    int num;
    struct timeval tim;
    getchar();
    gettimeofday(&tim, NULL); /* <---- Here is !!!*/
    num=190320;
    printf("value: %d\n", num);
}
Now I repeat the same steps as before:
-run the program in a shell
-attach gdb to the program from another shell
-set the conditional watchpoint on the "rdi" register
But now, when I continue execution in gdb and press Enter in the shell where the program is running, the program gets stuck in the gettimeofday() function.
If I press Ctrl+C in gdb, I can see that the program is stuck in this function:
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007ffc88b85e3c in gettimeofday ()
Now, if I disable the watchpoint and continue execution again, everything goes fine and the program ends without problems (obviously, with the watchpoint disabled, gdb no longer stops the program at the moment I want):
(gdb) info breakpoint
Num Type Disp Enb Address What
1 watchpoint keep y $rdi == 190320
(gdb) disable 1
(gdb) c
Continuing.
[Inferior 1 (process 4151) exited with code 016]
So I can verify that the watchpoint set on the register is what causes the program to get stuck...
So the question is: can someone explain why this is happening? And is there any way to solve it, so that the program doesn't get stuck in the gettimeofday() function and reaches the watchpoint?
P.S.: I know that I can stop the program at the variable assignment using other methods, but this is just an experiment and I only want an explanation of why this happens.
P.S. 2: Sorry for my bad English; it's not my native language.
I'd like to know if my program is accessing NULL pointers or stale memory.
The backtrace looks like this:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2b0fa4c8 (LWP 1333)]
0x299a6ad4 in pthread_mutex_lock () from /lib/libpthread.so.0
(gdb) bt
#0 0x299a6ad4 in pthread_mutex_lock () from /lib/libpthread.so.0
#1 0x0058e900 in ?? ()
With GDB 7 and higher, you can examine the $_siginfo structure that is filled out when the signal occurs, and determine the faulting address:
(gdb) p $_siginfo._sifields._sigfault.si_addr
If it shows (void *) 0x0 (or a small number) then you have a NULL pointer dereference.
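The rule of thumb above could be written down as a small helper for GDB's Python interpreter. This is a sketch under assumptions: the function names and the three labels are invented here, and the 4096-byte cutoff for "a small number" is a guess based on a common page size rather than anything GDB defines.

```python
SI_ADDR_EXPR = "$_siginfo._sifields._sigfault.si_addr"


def describe_fault_address(addr, small=4096):
    """Classify a faulting address. `small` is an assumed cutoff for
    'a small number' (one typical page); adjust it for your target."""
    if addr == 0:
        return "null"        # plain NULL pointer dereference
    if 0 < addr < small:
        return "near-null"   # likely NULL plus an offset, e.g. a struct member
    return "other"           # not near NULL; possibly a stale or corrupt pointer


def fault_address(api):
    """Read the faulting address as an integer; `api` is the gdb module."""
    return int(api.parse_and_eval(SI_ADDR_EXPR))
```

Inside GDB, after the SIGSEGV has been reported, describe_fault_address(fault_address(gdb)) would give the label. A result of "other" does not prove the memory is stale; it only rules out the NULL case.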
Run your program under GDB. When the segfault occurs, GDB will report where it happened: the function and its argument values and, if the program was built with debug information, the source line and statement.
You can use the "print" (p) command in GDB to inspect variables. If the crash occurred in a library call, you can use the "frame" series of commands to see the stack frame in question.