I'm trying to debug a kernel module that I suspect has memory leaks. To check, I prepared a build with memory leak debugging enabled for the kernel and modules, and it produced this warning:
[11839.429168] slab error in verify_redzone_free(): cache `size-64': memory outside object was overwritten
[11839.438659] [<c005575c>] (unwind_backtrace+0x0/0x164) from [<c0116ca0>] (kfree+0x278/0x4d8)
[11839.447357] [<c0116ca0>] (kfree+0x278/0x4d8) from [<bf083f48>] (some_function+0x18/0x1c [my_module])
[11839.457214] [<bf083f48>] (some_function+0x18/0x1c [my_module]) from [<bf08762c>] (some_function+0x174/0x718 [my_module])
[11839.470184] [<bf08762c>] (some_function+0x174/0x718 [my_module]) from [<bf0a56b8>] (some_function+0x12c/0x16c [my_module])
[11839.483917] [<bf0a56b8>] (some_function+0x12c/0x16c [my_module]) from [<bf085790>] (some_function+0x8/0x10 [my_module])
[11839.496368] [<bf085790>] (some_function+0x8/0x10 [my_module]) from [<bf07b74c>] (some_function+0x358/0x6d4 [my_module])
[11839.507476] [<bf07b74c>] (some_function+0x358/0x6d4 [my_module]) from [<c00a60f8>] (worker_thread+0x1e8/0x284)
[11839.517211] [<c00a60f8>] (worker_thread+0x1e8/0x284) from [<c00a9edc>] (kthread+0x78/0x80)
[11839.525543] [<c00a9edc>] (kthread+0x78/0x80) from [<c004f8fc>] (kernel_thread_exit+0x0/0x8)
Translating addresses that point into the kernel is no problem:
$ addr2line -f -e vmlinux.kmeml c0116ca0
verify_redzone_free
/[...]/kernel/mm/slab.c:2922
But I can't do that for addresses that belong to my_module:
$ addr2line -f -e vmlinux.kmeml bf0a56b8
??
??:0
I also tried with the module file:
$ addr2line -f -e my_module.ko bf0a56b8
??
??:0
How can I translate these addresses to files and line numbers?
I suppose the module is built with debug info included. If so, you can use gdb or objdump to find out which source file and line each address belongs to. Something like this:
$ gdb "$(modinfo -n my_module)"
(gdb) list *(some_function+0x12c)
gdb will then print the name of the source file and the line number.
You can also do a similar thing with objdump but it is a bit more difficult. First, disassemble the module:
objdump -dSlr my_module.ko > my_module.disasm
When called with -S option, objdump will include the source lines in the resulting listing where appropriate.
You can now scroll the listing down to the code of some_function, find the instruction at offset 0x12c from the beginning of the function. The source line will be indicated above it.
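The lookup described above (function start plus offset) can be automated. This is a minimal sketch in Python, assuming the listing uses the usual GNU objdump function headers of the form `<address> <name>:`. The helper name `find_instruction_address` and the sample listing are hypothetical, not taken from the real module.

```python
import re

# Matches objdump function headers such as "00000128 <some_function>:".
HEADER = re.compile(r"^([0-9a-fA-F]+) <([^>]+)>:\s*$")


def find_instruction_address(listing, function, offset):
    """Return the listing address of function+offset.

    Returns None if no header for `function` is found. If several
    functions share a name, the first header in the listing wins.
    """
    for line in listing.splitlines():
        match = HEADER.match(line)
        if match and match.group(2) == function:
            return int(match.group(1), 16) + offset
    return None


# Hypothetical excerpt of my_module.disasm (made-up addresses).
SAMPLE = """\
00000128 <some_function>:
     128:\te92d4070 \tpush\t{r4, r5, r6, lr}
"""

address = find_instruction_address(SAMPLE, "some_function", 0x12C)
# Search the listing for this value followed by ":" to find the instruction.
print(format(address, "x"))
```

The printed value is the address label to search for in my_module.disasm; the source line shown just above that instruction is the one of interest.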
EDIT:
After many experiments, I found that although addr2line can indeed be used for kernel modules, eu-addr2line (a similar tool from elfutils) seems to be more reliable. That is, sometimes addr2line printed incorrect source lines where eu-addr2line got them right.
To use eu-addr2line, one may need to install libdw and libebl libraries if they are not already installed along with elfutils.
The usage is similar to that of addr2line:
eu-addr2line -f -e <path_to_the_module> -j <section_name> <offset_in_section>
If the debug information for a kernel module is stored in a separate file (this is often the case for the kernels provided by the major Linux distros), the path to that file should be used as <path_to_the_module>.
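One way to obtain <offset_in_section> from a backtrace entry such as some_function+0x12c is to add the offset to the symbol's value as printed by nm. The sketch below assumes that, in a relocatable .ko file, nm reports symbol values relative to the start of their section, and that the function lives in .text (module code can also sit in other sections such as .init.text). The helper names and the sample nm output are hypothetical.

```python
def section_offset(nm_output, symbol, offset):
    """Return symbol value + offset, or None if the symbol is not defined.

    Expects `nm my_module.ko` style lines: "<hex value> <type> <name>".
    Undefined symbols have only two fields and are skipped.
    """
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2] == symbol:
            return int(fields[0], 16) + offset
    return None


def eu_addr2line_command(module_path, section, offset):
    """Build the argument list for the eu-addr2line call shown above."""
    return ["eu-addr2line", "-f", "-e", module_path, "-j", section, hex(offset)]


# Hypothetical nm output (made-up values).
SAMPLE_NM = """\
0002b58c t some_function
         U kfree
"""

off = section_offset(SAMPLE_NM, "some_function", 0x12C)
print(" ".join(eu_addr2line_command("my_module.ko", ".text", off)))
```

The resulting list can be passed to subprocess.run on a machine where elfutils is installed.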
You do need to run addr2line on your kernel module rather than on the kernel, but there is a twist: the kernel module file uses relative addresses, while the crash address you have is actually composed of:
offset inside module + module load address in memory.
So you first need to find the module's load address in memory with cat /proc/modules (which also tells you which module the address belongs to, in case you don't know), then subtract the module load address from the crash address and feed the result to addr2line.
good luck
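The subtraction described in this answer can be sketched in Python as follows. It assumes the usual /proc/modules layout (name, size, reference count, dependencies, state, load address); the sample lines and the load address 0xbf07a000 are made up for illustration, and the helper names are hypothetical.

```python
def parse_proc_modules(text):
    """Return a list of (name, size, load_address) from /proc/modules text."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        modules.append((fields[0], int(fields[1]), int(fields[5], 16)))
    return modules


def locate(address, modules):
    """Return (module_name, offset_inside_module), or None if no module matches."""
    for name, size, base in modules:
        if base <= address < base + size:
            return name, address - base
    return None


# Hypothetical /proc/modules content (made-up sizes and addresses).
SAMPLE = """\
my_module 200704 0 - Live 0xbf07a000 (O)
other_module 16384 0 - Live 0xbf000000
"""

name, offset = locate(0xBF0A56B8, parse_proc_modules(SAMPLE))
print(name, hex(offset))
```

On a real system the text would come from reading /proc/modules; the offset is relative to the module's load address, so it may still need adjusting if the tool expects an offset inside a particular section.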
Maybe you should use -g parameter to compile the module.
Related
I'm trying to get the line of source code from addr2line for the Raspbian 5.4.y kernel.
My host environment is ubuntu18.04.2 on virtualbox, and I'm compiling the kernel with arm-linux-gnueabihf- cross-compiler.
I compiled the kernel with bcm2711_defconfig configuration since the target machine is Raspberry Pi 4, following the official guide, https://www.raspberrypi.org/documentation/linux/kernel/building.md, with 32-bit arm arch configuration. I didn't modify any kernel configuration at all.
I obtained the address of a function (_local_bh_enable here, for instance) from vmlinux by using objdump as below,
$ arm-linux-gnueabihf-objdump -x linux/vmlinux | grep _local_bh_enable
...
c0227a28 g F .text 00000098 _local_bh_enable
As you can see above, I got the address for _local_bh_enable as 0xc0227a28.
Then I ran addr2line to get the source line for that address, but I got a strange result, shown below.
$ arm-linux-gnueabihf-addr2line -fe linux/vmlinux -a 0xc0227a28
0xc0227a28
_local_bh_enable
.tmp_vmlinux.kallsyms2.o:?
I don't get what that means. Isn't it supposed to give the source file name with the line number on it?
I've also tried many other functions, but they all ended up with the same ".tmp_vmlinux.kallsyms2.o:?" instead of the source line I'm expecting.
Am I missing something here? Please give me any help for this.
Thanks in advance.
I checked the kernel configuration and found that CONFIG_DEBUG_INFO was not set. That seems to be the cause.
I'm compiling the kernel with CONFIG_DEBUG_INFO set, let me see whether it's working.
I am trying to use addr2line with an archive file, libdpdk.a
I have a backtrace:
backtrace returned: 7
0: 0x46fd05 ./build/ip_pipeline(bt+0x25) [0x46fd05]
1: 0x42a163 ./build/ip_pipeline() [0x42a163]
2: 0x46ff21 ./build/ip_pipeline(rte_eal_init+0x171) [0x46ff21]
3: 0x439629 ./build/ip_pipeline(app_init+0x709) [0x439629]
4: 0x42b3ff ./build/ip_pipeline(main+0x5f) [0x42b3ff]
5: 0x7f101166b830 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f101166b830]
6: 0x42d009 ./build/ip_pipeline(_start+0x29) [0x42d009]
I tried the following command:
addr2line 0x46fd05 -f -e ../../build/lib/librte_eal.a
addr2line: ../../build/lib/librte_eal.a: cannot get addresses from archive
The expected output should be a name of the function in the backtrace at address 0x46fd05 or 0x46fd05 depending on which address I pass. Currently there is no symbol name associated with this address.
Any suggestions?
I have compiled the code using -rdynamic
Putting aside the reason for choosing .a/.so, 'addr2line' should be used with the binary that was executed, because the backtrace addresses are specific to that binary.
The same static (.a) library will usually end up at different addresses in different binaries. This is also true of '.so' files (especially position-independent code), but in many cases Linux will attempt to reuse already mapped '.so' files, so that the actual addresses are the same.
Bottom line - from the man page - use the executable name.
--exe=filename
Specify the name of the executable for which addresses should be translated.
The default file is a.out.
A practical note: when using '.so' files, run addr2line on a system that has the same executable, shared objects, and LD_LIBRARY_PATH. If the '.so' files differ between development and production, the addresses may not match.
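As an illustration of the advice above, here is a minimal Python sketch that splits backtrace lines of the form shown in the question into (binary, symbol, address) and builds addr2line commands only for frames that belong to the executable itself. The function names are hypothetical. Frames inside shared objects are deliberately left out, since their addresses depend on where the library was mapped.

```python
import re

# Matches "<binary>(<symbol+offset or empty>) [0x<address>]".
FRAME = re.compile(r"(\S+)\((.*?)\)\s*\[(0x[0-9a-fA-F]+)\]")


def parse_frames(backtrace):
    """Return a list of (binary, symbol, address) for each parsable line."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append((match.group(1), match.group(2), int(match.group(3), 16)))
    return frames


def addr2line_commands(frames, executable):
    """Build one addr2line argument list per frame located in `executable`."""
    return [
        ["addr2line", "-f", "-e", executable, hex(address)]
        for binary, _symbol, address in frames
        if binary == executable
    ]


# Two lines copied from the backtrace in the question.
SAMPLE = """\
2: 0x46ff21 ./build/ip_pipeline(rte_eal_init+0x171) [0x46ff21]
5: 0x7f101166b830 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f101166b830]
"""

for command in addr2line_commands(parse_frames(SAMPLE), "./build/ip_pipeline"):
    print(" ".join(command))
```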
Consider the following code in mono/domain.c:
static MonoDomain *mono_root_domain = NULL;
...
MonoDomain* mono_get_root_domain (void)
{
return mono_root_domain;
}
My task is to read the struct data pointed by the mono_root_domain pointer in runtime from another process. (Attaching, reading, locating dylibs, etc. from this other process is solved already)
Looking into the generated libmono dylib I can find the corresponding symbol:
This symbol points to the address 0x2621A8, which is in the local relocation section (__DATA, __bss):
This points to the address of 0x1A7690 (__TEXT, __symbol_stub):
The target is
so 0x1A7DF8 (__TEXT, __stub_helper):
At this point I am completely lost as to how to retrieve the actual pointer to the MonoDomain struct. Any help is appreciated.
For security reasons and to prevent buffer overflow attacks and other exploits, you can't know that, because of a security measure called PIE or ASLR (address space layout randomization). However, this can be disabled for debugging purposes. LLDB and GDB do/did it in order to debug executables. The way this can be done with a CLI app is as follows:
Copy or download this python script from GitHub
https://github.com/thlorenz/chromium-build/blob/master/mac/change_mach_o_flags.py
Save the python script, for example, next to your executable
Open Terminal and cd to the directory that contains your executable
enter chmod +x ./change_mach_o_flags.py to make the script executable
enter ./change_mach_o_flags.py --no-pie ./YourExecutable
Now the addresses of your executable should not be randomized anymore. Because of that, it is possible to calculate the addresses of your static / global variables. To do that, do the following in Terminal (I am assuming you are using a 64-bit machine):
otool -v -l ./YourExecutable | open -f (this will generate a text file with the load commands inside your executable that describe how to lay out DATA, TEXT, etc. in memory)
Look for the section you are interested in. Look at the addr field. If it contains let's say 0x0000000100001020 then the variable will be placed exactly there with ASLR disabled.
I am not sure if this works with dylibs but you can try it. Now I ran out of time, but I can try at home and see if this is doable with dylibs.
Consider the following Linux kernel dump stack trace; e.g., you can trigger a panic from the kernel source code by calling panic("debugging a Linux kernel panic");:
[<001360ac>] (unwind_backtrace+0x0/0xf8) from [<00147b7c>] (warn_slowpath_common+0x50/0x60)
[<00147b7c>] (warn_slowpath_common+0x50/0x60) from [<00147c40>] (warn_slowpath_null+0x1c/0x24)
[<00147c40>] (warn_slowpath_null+0x1c/0x24) from [<0014de44>] (local_bh_enable_ip+0xa0/0xac)
[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)
In unwind_backtrace+0x0/0xf8 what does +0x0/0xf8 stand for?
How can I see the C code of unwind_backtrace+0x0/0xf8?
How to interpret the panic's content?
It's just an ordinary backtrace. The functions are listed innermost first, so each function was called by the one listed after it:
unwind_backtrace+0x0/0xf8
warn_slowpath_common+0x50/0x60
warn_slowpath_null+0x1c/0x24
local_bh_enable_ip+0xa0/0xac
bdi_register+0xec/0x150
The bdi_register+0xec/0x150 is the symbol + the offset/length; there's more information about that in Understanding a Kernel Oops and in how you can debug a kernel oops. There's also this excellent tutorial on Debugging the Kernel.
Note: as suggested below by Eugene, you may want to try addr2line first. It still needs an image with debugging symbols, though, for example
addr2line -e vmlinux_with_debug_info 0019594c(+offset)
Here are two alternatives for addr2line. Assuming you have the proper target's toolchain, you can do one of the following:
Use objdump:
locate your vmlinux or the .ko file under the kernel root directory, then disassemble the object file :
objdump -dS vmlinux > /tmp/kernel.s
Open the generated assembly file, /tmp/kernel.s, with a text editor such as vim. Go to unwind_backtrace+0x0/0xf8, i.e. search for the address of unwind_backtrace plus the offset. You have now located the problematic part of your source code.
Use gdb:
IMO, an even more elegant option is to use the one and only gdb. Assuming you have the suitable toolchain on your host machine:
Run gdb <path-to-vmlinux>.
Execute in gdb's prompt: list *(unwind_backtrace+0x10).
For additional information, you may check out the following resources:
Kernel Debugging Tricks.
Debugging The Linux Kernel Using Gdb
In unwind_backtrace+0x0/0xf8 what the +0x0/0xf8 stands for?
The first number (+0x0) is the offset from the beginning of the function (unwind_backtrace in this case). The second number (0xf8) is the total length of the function. Given these two pieces of information, if you already have a hunch about where the fault occurred this might be enough to confirm your suspicion (you can tell (roughly) how far along in the function you were).
To get the exact source line of the corresponding instruction (generally better than hunches), use addr2line or the other methods in other answers.
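The symbol+offset/length notation explained above can be pulled apart mechanically. The following Python sketch parses the frames of one backtrace line and produces the gdb expression used in the other answers; the names `parse_kernel_frames` and `gdb_list_command` are hypothetical helpers, not part of any tool.

```python
import re

# Matches "(symbol+0xOFFSET/0xLENGTH)" with an optional " [module]" suffix.
FRAME = re.compile(r"\(([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)(?: \[(\w+)\])?\)")


def parse_kernel_frames(line):
    """Return a list of (symbol, offset, length, module_or_None) found in `line`."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16), m.group(4))
        for m in FRAME.finditer(line)
    ]


def gdb_list_command(frame):
    """Return the gdb command that lists the source around symbol+offset."""
    symbol, offset, _length, _module = frame
    return "list *({}+{})".format(symbol, hex(offset))


# One line copied from the stack trace in the question.
LINE = "[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)"

for frame in parse_kernel_frames(LINE):
    symbol, offset, length, _module = frame
    # offset/length gives a rough idea of how far into the function we are.
    print(gdb_list_command(frame), "({} of {} bytes)".format(offset, length))
```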
This is somewhat similar to Where are static variables stored (data segment or heap or BSS)?, but not the same.
I have the address of a variable in another process, for example 0x10fb90. Where is this variable stored (data segment, heap or BSS)? Can I determine the location from just the process's pid and the variable's address?
I am working on osx using obj-c and c.
You have 2 options.
1. Use objdump
Something like
objdump -x a.out | grep YOUR_VARIABLE_ADDRESS
2. Use gcc's map option to generate a map file
Compile something like this in gcc
$ gcc -o foo.exe -Wl,-Map,foo.map foo.c
and now
$ grep YOUR_VARIABLE_ADDRESS foo.map
Both of these methods will show your variable's location, provided the address you supplied exists.
PS: The link I've added for the map file shows an example map file generated by Visual Studio linkers, but the format is typically similar in most of the map file formats generated by various linkers
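A plain grep only matches when the address is exactly the start of a symbol. As a complement, here is a rough Python sketch that checks which section's address range contains a given address, assuming a section table in the style printed by GNU objdump -h (index, name, size, VMA, ...). The sample table is made up, the function names are hypothetical, and addresses on the heap or stack will not match any section of the file.

```python
import re

# Matches section rows such as "  1 .data  00000800  0010f800  ...".
ROW = re.compile(r"^\s*\d+\s+(\S+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s")


def parse_sections(table):
    """Return a list of (name, start, size) from objdump -h style text."""
    sections = []
    for line in table.splitlines():
        match = ROW.match(line)
        if match:
            name, size, vma = match.group(1), match.group(2), match.group(3)
            sections.append((name, int(vma, 16), int(size, 16)))
    return sections


def section_for(table, address):
    """Return the name of the section containing `address`, or None."""
    for name, start, size in parse_sections(table):
        if start <= address < start + size:
            return name
    return None


# Hypothetical section table (made-up sizes and addresses).
SAMPLE = """\
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001000  00100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000800  0010f800  0010f800  00002000  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000400  00110000  00110000  00002800  2**3
                  ALLOC
"""

print(section_for(SAMPLE, 0x10FB90))
```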