Is there an equivalent of dds on lldb - lldb

I am trying to debug an issue on OSX and lldb is getting in my way. I think my program has a corrupted stack, and I would like to be able to manually walk the stack.
In WinDBG, there is a command called dds that I can use to dump all the pointers on the stack (basically, walking from rsp, walking towards higher addresses) and resolve all pointers to symbols (and print nothing if it does not correspond to code), I am looking for a similar command on lldb. I know I could memory read --format x manually one by one and then look them up using image lookup, but that would be too time consuming.

There isn't a built-in command to do the walk itself, so you will have to page through the memory up from rsp by hand.
But you might find the "A" format helpful for this task. That will print the memory as a list of address-sized words, and for any values that point into TEXT or DATA it will print the symbol's name. Like:
(lldb) mem read -fA `$rsp - 16 * 8` `$rsp`
0x7ffeefbff660: 0x0000000000000000
0x7ffeefbff668: 0x00007ffeefbff660
0x7ffeefbff670: 0x0000003002000000
0x7ffeefbff678: 0x00007fff6e2ee568 libsystem_platform.dylib`__platform_sigaction + 103
0x7ffeefbff680: 0x0000000000000000
0x7ffeefbff688: 0x0000000000000000
0x7ffeefbff690: 0x0000000000013dc9
0x7ffeefbff698: 0x0000000000000000
0x7ffeefbff6a0: 0x00007fff6e238fe2 libsystem_kernel.dylib`__sigaction + 10
0x7ffeefbff6a8: 0x0000000000000000
0x7ffeefbff6b0: 0x000000000000001e
0x7ffeefbff6b8: 0x0000000000013dc9
0x7ffeefbff6c0: 0x00007ffeefbff700
0x7ffeefbff6c8: 0x0000000100002020 _dyld_private
0x7ffeefbff6d0: 0x000000000000000e
0x7ffeefbff6d8: 0x0000000100000f45 signals`main + 53 at signals.c:13:3
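
If you find yourself doing this walk a lot, you can script it with lldb's Python API instead of paging by hand. The following is a rough, untested sketch (the file name dds_walk.py and the default word count are my own choices, not anything shipped with lldb): it reads pointer-sized words upward from $sp and symbolicates any value that resolves into a loaded module, much like dds does.

# dds_walk.py -- load with: (lldb) command script import /path/to/dds_walk.py
import lldb

def dds(debugger, command, result, internal_dict):
    """dds [N] -- dump N pointer-sized words starting at $sp, resolving values to symbols."""
    target = debugger.GetSelectedTarget()
    process = target.GetProcess()
    frame = process.GetSelectedThread().GetSelectedFrame()
    nwords = int(command) if command.strip() else 64          # default: 64 words
    ptr_size = target.GetAddressByteSize()
    err = lldb.SBError()
    sp = frame.GetSP()
    for i in range(nwords):
        addr = sp + i * ptr_size
        value = process.ReadPointerFromMemory(addr, err)       # one stack slot
        if not err.Success():
            break
        symbol = target.ResolveLoadAddress(value).GetSymbol()
        name = symbol.GetName() if symbol.IsValid() else ""
        result.AppendMessage("0x%016x: 0x%016x %s" % (addr, value, name))

def __lldb_init_module(debugger, internal_dict):
    debugger.HandleCommand("command script add -f dds_walk.dds dds")

After importing it, (lldb) dds 16 prints 16 words up from the stack pointer in roughly the same shape as the -fA output above.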

Related

printing stack with variable names with gdb?

I was just reading this article.
In the article, the author uses gdb to look around in a C executable.
At one point, when a breakpoint is hit, the author says to have a look at the stack, and shows this output:
STACK:
0x00007fffffffdf40│+0x0000: 0x00007fffffffe058 → 0x00007fffffffe380
0x00007fffffffdf48│+0x0008: 0x0000000100401050
0x00007fffffffdf50│+0x0010: 0x00007fffffffe050 → 0x0000000000000001
0x00007fffffffdf58│+0x0018: 0x0000000000402004 → “p#ssw0rD”
0x00007fffffffdf60│+0x0020: 0x0000000000000000 ← $rbp
0x00007fffffffdf68│+0x0028: 0x00007ffff7ded0b3 → <__libc_start_main+243> mov edi, eax
0x00007fffffffdf70│+0x0030: 0x00007ffff7ffc620 → 0x0005043700000000
0x00007fffffffdf78│+0x0038: 0x00007fffffffe058 → 0x00007fffffffe380 →
This is nice, but how do I generate this output in gdb?
I've been googling for a while with no luck.
Also, in this output there are two different columns of hex addresses. I'm guessing one points to the stack; what is the other one, and which is which?
The author doesn't state it explicitly, but in their gdb output you can see the prompt gef>. This indicates they are likely making use of the gef addon for gdb.
I have never used this addon myself, but you can see in some of the example output on the gef site that the addon has a stack view identical to the output you gave above.
The gef addon makes use of gdb's Python API to provide additional features for gdb, one of which appears to be the alternative stack view.
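
If you don't want to pull in all of gef, a stripped-down version of that view is easy to build on gdb's Python API yourself. The sketch below is my own approximation of the format, not gef's actual code (it assumes a little-endian target): it dumps pointer-sized words from $sp and follows each value one level, like the arrows in the output above.

# stackview.py -- load with: (gdb) source stackview.py, then run: stackview [N]
import gdb

class StackView(gdb.Command):
    """Dump N pointer-sized words from $sp, dereferencing each value once."""
    def __init__(self):
        super(StackView, self).__init__("stackview", gdb.COMMAND_STACK)

    def invoke(self, arg, from_tty):
        n = int(arg) if arg else 8
        sp = int(gdb.parse_and_eval("$sp"))
        inferior = gdb.selected_inferior()
        ptr_size = gdb.lookup_type("void").pointer().sizeof
        for i in range(n):
            addr = sp + i * ptr_size
            word = bytes(inferior.read_memory(addr, ptr_size))
            val = int.from_bytes(word, "little")               # little-endian assumed
            line = "0x%016x|+0x%04x: 0x%016x" % (addr, i * ptr_size, val)
            try:                                               # follow the pointer once, like gef's arrows
                deref = bytes(inferior.read_memory(val, ptr_size))
                line += " -> 0x%016x" % int.from_bytes(deref, "little")
            except gdb.MemoryError:
                pass
            gdb.write(line + "\n")

StackView()

As a side effect this also answers the two-column question: the left-hand addresses are the stack slots themselves, and the value after the colon is whatever that slot contains (the arrows then dereference that value when it is itself a readable address).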

Why do syslog and gdb show different load address for the same shared library?

I'm facing a segmentation fault. Syslog reports the following:
segfault at 0 ip 00000000f71ff256 sp 00000000f44fee50 error 4 in libprotobuf-c.so.0.0.0[f71f8000+f000]
So, libprotobuf-c.so is loaded at 0xf71f8000. When I loaded the respective core file in gdb and tried info sharedlibrary, it shows FROM address as 0xf71f9f70 which is different from what syslog showed. I'm not able to understand this mismatch. Could someone please help?
0xf71f9f70 0xf7204028 Yes (*) /usr/lib/libprotobuf-c.so.0
So, libprotobuf-c.so is loaded at 0xf71f8000. When I loaded the respective core file in gdb and tried info sharedlibrary, it shows FROM address as 0xf71f9f70 which is different from what syslog showed.
Actually they are the same. GDB shows start of .text as the From address.
If you do readelf -WS /usr/lib/libprotobuf-c.so.0 | grep '\.text', you'll discover that .text starts at offset 0x1f70, which is exactly 0xf71f9f70 - 0xf71f8000.

Translate Instruction Pointer Address (in shared library) to Source Instruction

Are there any tools or libraries one can use on Linux to get the original (source) instruction only from the PID and the current instruction pointer address, even if the IP currently points into a shared library?
AFAIK it should be possible, since the location of the library mapping is available through /proc/[PID]/maps, though I haven't found any applications or examples doing so.
Any suggestions?
EDIT: an assembly instruction or the nearest symbol would suffice (the source code line is not strictly needed)
I found a way to do this with GDB:
Interactive:
$ gdb --pid 1566
(gdb) info symbol 0x7fe28b8a2b79
pselect + 89 in section .text of /lib/x86_64-linux-gnu/libc.so.6
(gdb) info symbol 0x5612550f14a4
copy_word_list + 20 in section .text of /usr/bin/bash
(gdb) info symbol 0x7fe28b878947
execve + 7 in section .text of /lib/x86_64-linux-gnu/libc.so.6
Shows exactly what I wanted!
It can also be scripted:
gdb -q --pid PID --batch -ex 'info symbol HEX_SYMBOL_ADDR'
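
Since you mention /proc/[PID]/maps, the same lookup can also be done without attaching a debugger at all. A rough sketch (the resolve name is just a placeholder; the PID and address are the ones from the session above): it finds the mapping containing the address and converts it into a file plus file offset, which is usually enough to feed to addr2line -e or nm.

# maps_resolve.py -- translate a runtime address into "file + file offset"
def resolve(pid, addr):
    with open("/proc/%d/maps" % pid) as maps:
        for line in maps:
            fields = line.split()
            start, end = (int(x, 16) for x in fields[0].split("-"))
            # fields: address range, perms, file offset, dev, inode, pathname (if present)
            if start <= addr < end and len(fields) >= 6:
                return fields[5], addr - start + int(fields[2], 16)
    return None, None

print(resolve(1566, 0x7fe28b8a2b79))   # e.g. ('/lib/x86_64-linux-gnu/libc.so.6', 0x...)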

Why is _init from glibc's csu/init-first.c called before _start even if _start is the ELF entry point?

I first noticed it while playing with GDB's rbreak ., and then made a minimal example:
(gdb) file hello_world.out
Reading symbols from hello_world.out...done.
(gdb) b _init
Breakpoint 1 at 0x4003e0
(gdb) b _start
Breakpoint 2 at 0x400440
(gdb) run
Starting program: /home/ciro/bak/git/cpp/cheat/gdb/hello_world.out
Breakpoint 1, _init (argc=1, argv=0x7fffffffd698, envp=0x7fffffffd6a8) at ../csu/init-first.c:52
52 ../csu/init-first.c: No such file or directory.
(gdb) continue
Continuing.
Breakpoint 2, 0x0000000000400440 in _start ()
(gdb) continue
Continuing.
Breakpoint 1, 0x00000000004003e0 in _init ()
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y <MULTIPLE>
breakpoint already hit 2 times
1.1 y 0x00000000004003e0 <_init>
1.2 y 0x00007ffff7a36c20 in _init at ../csu/init-first.c:52
2 breakpoint keep y 0x0000000000400440 <_start>
breakpoint already hit 1 time
Note that there are two _init symbols: one in csu/init-first.c, and the other seems to come from sysdeps/x86_64/crti.S. I'm talking about the csu one.
Isn't _start supposed to be the entry point set by the linker, and stored in the ELF header? What mechanism makes _init run first? What is its purpose?
Tested on GCC 4.8, glibc 2.19, GDB 7.7.1 and Ubuntu 14.04.
Where the debugger halts first in your example isn't the real beginning of the process.
In the ELF header there is an entry for the program interpreter (the dynamic linker). On 64-bit Linux its value is /lib64/ld-linux-x86-64.so.2. The kernel sets the initial instruction pointer to the entry point of this program interpreter, whose entry symbol is also named _start, just like the program's _start.
After the dynamic linker has done its work, which includes running initializers of the loaded shared objects such as glibc's _init, it jumps to the program's entry point.
Your breakpoint at _start doesn't fire for the dynamic linker because gdb resolved it only to the address of the program's _start.
You can find the entry point address with readelf -h /lib64/ld-linux-x86-64.so.2.
You could also set a breakpoint at _dl_start and print a backtrace to see that this function is called from dynamic linker's _start.
If you download glibc's current source code you can find the entry point of the dynamic loader at glibc-2.21/sysdeps/x86_64/dl-machine.h starting on line 121.
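
If you want to read that entry point programmatically rather than with readelf, the ELF64 header is simple enough to parse by hand. A minimal sketch, assuming a little-endian ELF64 file such as the x86-64 dynamic linker (e_entry sits at offset 24 of the header):

# elf_entry.py -- print the e_entry field of an ELF64 file
import struct

path = "/lib64/ld-linux-x86-64.so.2"
with open(path, "rb") as f:
    header = f.read(64)                      # the ELF64 header is 64 bytes long
assert header[:4] == b"\x7fELF"              # ELF magic
entry = struct.unpack_from("<Q", header, 24)[0]
print(hex(entry))                            # matches the "Entry point address" readelf -h prints

For a shared object like ld-linux, this value is relative to wherever the kernel ends up mapping it.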

How to make good use of stack trace (from kernel or core dump)?

If you are lucky, when your kernel module crashes you get an oops: a log with a lot of information, such as the register values. One piece of that information is the stack trace (the same is true for core dumps, but I originally asked this about kernel modules). Take this example:
[<f97ade02>] ? skink_free_devices+0x32/0xb0 [skin_kernel]
[<f97aba45>] ? cleanup_module+0x1e5/0x550 [skin_kernel]
[<c017d0e7>] ? __stop_machine+0x57/0x70
[<c016dec0>] ? __try_stop_module+0x0/0x30
[<c016f069>] ? sys_delete_module+0x149/0x210
[<c0102f24>] ? sysenter_do_call+0x12/0x16
My guess is that the +<number1>/<number2> has something to do with the offset from the start of the function in which the error occurred. That is, by inspecting this number, perhaps together with the assembly output, I should be able to find the line (better yet, the instruction) at which the error occurred. Is that correct?
My question is, what are these two numbers exactly? How do you make use of them?
skink_free_devices+0x32/0xb0
This means the offending instruction is 0x32 bytes from the start of the function skink_free_devices() which is 0xB0 bytes long in total.
If you compile your kernel with -g enabled, then you can get the line number inside the function where control jumped by using the tool addr2line or good old gdb.
Something like this
$ addr2line -e ./vmlinux 0xc01cf0d1
/mnt/linux-2.5.26/include/asm/bitops.h:244
or
$ gdb ./vmlinux
...
(gdb) l *0xc01cf0d1
0xc01cf0d1 is in read_chan (include/asm/bitops.h:244).
(...)
244 return ((1UL << (nr & 31)) & (((const volatile unsigned int *) addr)[nr >> 5])) != 0;
(...)
So just give the address you want to inspect to addr2line or gdb, and they will tell you the source file and line number corresponding to that address.
See this article for full details
EDIT: vmlinux is the uncompressed version of the kernel used for debugging and is generally found at /lib/modules/$(uname -r)/build/vmlinux, provided you have built your kernel from source. The vmlinuz you find at /boot is the compressed kernel and is not much use for debugging.
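
One step glossed over above is turning a trace entry like sys_delete_module+0x149/0x210 into the absolute address that addr2line or gdb's l * want. On a kernel without KASLR you can do that with /proc/kallsyms (you need root to see real addresses); here is a rough sketch, using one of the entries from the trace at the top of this question:

# trace_to_addr.py -- convert "func+0xOFF/0xLEN" into an absolute address via /proc/kallsyms
import re

def trace_to_addr(entry):
    name, off = re.match(r"(\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+", entry).groups()
    with open("/proc/kallsyms") as syms:
        for line in syms:
            addr, _kind, sym = line.split()[:3]
            if sym == name:
                return int(addr, 16) + int(off, 16)
    return None

addr = trace_to_addr("sys_delete_module+0x149/0x210")
print(hex(addr) if addr is not None else "symbol not found")

For module symbols, or a kernel with KASLR, the arithmetic is less direct; the faddr2line script mentioned in the answer below does this translation for you and is the more robust option.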
For Emacs users, here is a major mode to easily jump around within the stack trace (it uses addr2line internally).
Disclaimer: I wrote it :)
Regurgitating this answer: you need to use faddr2line.
In my case I had the following truncated call trace:
[ 246.790938][ T35] Call trace:
[ 246.794075][ T35] __switch_to+0x10c/0x180
[ 246.798348][ T35] __schedule+0x278/0x6e0
[ 246.802531][ T35] schedule+0x44/0xd0
[ 246.806368][ T35] rpm_resume+0xf4/0x628
[ 246.810463][ T35] __pm_runtime_resume+0x94/0xc0
[ 246.815257][ T35] macb_open+0x30/0x2b8
[ 246.819265][ T35] __dev_open+0x10c/0x188
and ran the following from the mainline Linux kernel source tree:
./scripts/faddr2line vmlinux macb_open+0x30/0x2b8
giving the output
macb_open+0x30/0x2b8:
pm_runtime_get_sync at include/linux/pm_runtime.h:386
(inlined by) macb_open at drivers/net/ethernet/cadence/macb_main.c:2726
