GDB is complaining that my source file is more recent than the executable, and it appears the debugging information is indeed related to an older version of the source file, because gdb is stopping on a blank line:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) up
#1 0x00007ffff7ba2d88 in CBKeyPairGenerate (keyPair=0x602010) at library/src/CBHDKeys.c:246
warning: Source file is more recent than executable.
246
(gdb) list
241 if (versionBytes == CB_HD_KEY_VERSION_TEST_PUBLIC
242 || versionBytes == CB_HD_KEY_VERSION_TEST_PRIVATE)
243 return CB_NETWORK_TEST;
244
245 return CB_NETWORK_UNKNOWN;
246
247 }
248
249 uint8_t * CBHDKeyGetPrivateKey(CBHDKey * key) {
250
But the executable is more recent than the source file, see here:
$ ls -l library/src/CBHDKeys.c
-rw-r--r-- 1 matt matt 9249 Apr 29 22:40 library/src/CBHDKeys.c
$ ls -l bin/noLowerAddressGenerator
-rwxr-xr-x 1 matt matt 17845 Apr 30 15:52 bin/noLowerAddressGenerator
I tried rebuilding after make clean and ccache -C but the same problem occurs. When I updated the source file I only added whitespace, so the program logic remains equal.I feel that has something to do with it, but since I cleared the ccache and cleaned the build and bin directory with make clean I'm not sure what is going on.
Versions:
GNU Make 3.81
gcc (Debian 4.8.2-16) 4.8.2
GNU gdb (GDB) 7.6.2 (Debian 7.6.2-1)
ccache version 3.1.9
SolydXK - SMP Debian 3.13.5-1 (2014-03-04)
Perhaps you're not using the most recent compiled version of the code, if it's in a shared library. You could use ldd noLowerAddressGenerator to see the library dependencies of your program; I don't know if it's possible from within GDB to locate the relevant library, but there ought to be a way (please comment or edit if you know how).
If this is indeed the case, you might want to set environment LD_LIBRARY_PATH in GDB prior to running the program, to place your newly-built library ahead of any installed ones. You could look into setting the RPATH ELF variable when linking, but that's likely to be less help.
Another possibility is to run your debugger on a system where you know the library isn't installed. I've had good results using schroot to keep build/debug/install environments separated.
Related
Why does gdb not show the error context when I run gdb /bin/ls ?
I implemented my malloc using the malloc-tutorial (https://danluu.com/malloc-tutorial/), compiled it in a shared library and I want to replace the current malloc on system with my malloc using LD_PRELOAD to practice debugging.
But when I run it through gdb, I can't see the detailed error context like in the tutorial. Why?
My:
$ gdb /usr/bin/ls
Reading symbols from /usr/bin/ls...
Reading symbols from /usr/lib/debug/.build-id/2f/15ad836be3339dec0e2e6a3c637e08e48aacbd.debug...
(gdb) set environment LD_PRELOAD=/home/ays/work/malloc-tutorial/malloc.so
(gdb) run
Starting program: /usr/bin/ls
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) list
1435 src/ls.c: No such file or directory.
(gdb)
In tutorial:
$ gdb /bin/ls
(gdb) set environment LD_PRELOAD=./malloc.so
(gdb) run
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bd7dbd in free (ptr=0x0) at malloc.c:113
113 assert(block_ptr->free == 0);
I noticed that when I run gdb /bin/ls, the symbols are read from usr/lib/debug/.... Is this ok?
gdb /bin/ls
Reading symbols from /bin/ls...
Reading symbols from /usr/lib/debug/.build-id/2f/15ad836be3339dec0e2e6a3c637e08e48aacbd.debug...
If I need source code besides debug symbols, how do I install it properly?
By the way, it is not mentioned here (https://wiki.ubuntu.com/DebuggingProgramCrash) that I need to install the source code for debug.
Here some info about my coreutils and coreutils-dbgsym:
$ sudo apt-get install coreutils-dbgsym
Reading package lists... Done
Building dependency tree
Reading state information... Done
coreutils-dbgsym is already the newest version (8.30-3ubuntu2).
0 upgraded, 0 newly installed, 0 to remove and 64 not upgraded.
$ apt-cache policy coreutils
coreutils:
Installed: 8.30-3ubuntu2
Candidate: 8.30-3ubuntu2
Version table:
*** 8.30-3ubuntu2 500
500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
100 /var/lib/dpkg/status
I'm trying to compile the same lib on two x86 separate machines.
Both use the same toolchain (exactly same set of files) but have different Glibc versions.
When I run command LD_DEBUG=libs /lib64/ld-linux-x86-64.so.2 --list ./libl2ps.so I notice the following discrepancy between the 2 Linux loaders:
Machine 1 (with Glibc 2.12):
19943: find library=libm.so.6 [0]; searching
19943: search path=/ebs/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64:...:/ebs/frperies/repo/gnb/uplane/build_bbp/l2_ps/build/. (RPATH from file ./libl2ps.so)
19943: trying file=/ebs/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm.so.6
19943:
19943: find library=libgcc_s.so.1 [0]; searching
...
In this case the Linux loader selects lib libm.so.6 from the toolchain path based on RPATH of lib libl2ps.so.
Machine 2 (with Glibc 2.17):
10699: find library=libm.so.6 [0]; searching
10699: search path=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64:/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/lib64:/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib:/home/frperies/repo/gnb/uplane/build_bbp/l2_ps/build/. (RPATH from file ./libl2ps.so)
10699: trying file=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm.so.6
10699: trying file=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/lib64/libm.so.6
10699: trying file=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib/libm.so.6
10699: trying file=/home/frperies/repo/gnb/uplane/build_bbp/l2_ps/build/./libm.so.6
10699: search cache=/etc/ld.so.cache
10699: trying file=/lib64/libm.so.6
As for Machine 1, the loader attempts from RPATH of libl2ps.so to select lib libm.so.6 from toolchain path but skip it for some reason and try further other paths. Finally it selects libm.so.6from the system path /lib64/.
The RPATH of the 2 libs lib2ps.so are exactly the same. The two files libm.so.6 are also exactly the same on both machines (checked with md5sum).
I don't understand this differences of behavior between the 2 Linux loaders.
Do you see any reason what would explain this discrepancy ?
Thank you very much for your answers.
Update:
Thank you yugr for your answer.
Output of readelf -h gives only differences on fields "Entry point address" and "Start of section headers" and there is no other differences so I think it will not help.
Regarding using dlopen()/dlerror(), I've done a little executable with the following statement:
dlopen("/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so", RTLD_LAZY);
On machine 1 it works as expected:
C++ dlopen demo
Opening libm-2.28.so...
Closing library...
On machine 2 it fails and dlerror() gives the following output:
Cannot open library: /home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so: cannot open shared object file: No such file or directory
but the file libm-2-28.so really exists on my file system:
$ ls -l /home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so
-rwxr-xr-x 1 frperies linseeusers_lte_espoo 1682944 Oct 5 13:50 /home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic- linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so
This is very weird, what could lead to this situation ???
Thanks
Update 2:
That is true that I haven't pointed out that machine 1 is a RHEL6.8 distro while machine 2 is RHEL7.4 distro. I (naively?) didn't think this really matters...
On machine 1:
$ cat /proc/sys/kernel/osrelease
4.4.115-1.NSN.el6.x86_64
$ uname -a
Linux sq24-3 4.4.115-1.NSN.el6.x86_64 #1 SMP Mon Feb 12 12:35:46 CET 2018 x86_64 x86_64 x86_64 GNU/Linux
$ readelf -n libl2ps.so
Notes at offset 0x00000270 with length 0x00000024:
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: b598468830fdf2f61eda25553b9a367c4d28cdc9
On machine 2:
$ cat /proc/sys/kernel/osrelease
3.10.0-693.el7.x86_64
$ uname -a
Linux localhost.localdomain 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
$ readelf -n libl2ps.so
Displaying notes found at file offset 0x00000270 with length 0x00000024:
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 5829181bc0502233748149369108915ea7b10e8f
Does it help ?
Thanks
Update 3:
$ readelf -n libm.so.6
Notes at offset 0x00000238 with length 0x00000024:
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 0d84c7247dd76008c096719043e5592735a1c4bd
Notes at offset 0x0000025c with length 0x00000020:
Owner Data size Description
GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag)
OS: Linux, ABI: 4.4.0
So, how to interpret this ABI version number set to 4.4.0 ??
Thanks
Thank you yugr and Employed Russian for your answers!!
I will give it a try by upgrading my Kernel version on Machine 2.
Thanks
Regards
The error message that you see is the infamously confusing ENOENT errno. I see two instances of it in dl-load.c:
checking OS compatibility
loading non-setuid to setuid process
I suspect the first one fails in your case which would mean that OS kernel is incompatible between two machines. ld.so manpage indeed says that
Each shared object can inform the dynamic linker of the
minimum kernel ABI version that it requires. (This
requirement is encoded in an ELF note section that is viewable
via readelf -n as a section labeled NT_GNU_ABI_TAG.) At run
time, the dynamic linker determines the ABI version of the
running kernel and will reject loading shared objects that
specify minimum ABI versions that exceed that ABI version.
NT_GNU_ABI_TAG is 4.4.0 which means that you run a program expecting a minimum 4.4 kernel on a 3.10 kernel. Theoretically newer Glibc should run on older kernels as well but in your case Glibc was probly built with explicit --enable-kernel flag which prevents it's usage on kernels before 4.4 (see e.g. this explanation of --enable-kernel).
As a workaround, you may try to fool Glibc by overriding kernel version on machine 2 via
export LD_ASSUME_KERNEL=4.4.0
but it may not work if libm makes 4.4-specific syscalls that are not really present on 3.10.
I downloaded the source for libc6 and completed the build process successfully. (Though I did not performed a make install deliberately).
With the new linker built in buil-dir/elf/ld.so I ran a program supplying it as the argument to the newly built linker.
The test code prints some string and then malloc(sizeof(char)*1024).
On running the test binary as an argument to the newly built linker I get a Seg Fault at elf/dl-addr.c:132 which is:
131 /* Protect against concurrent loads and unloads. */
132 __rtld_lock_lock_recursive (GL(dl_load_lock));
This is the last frame before the seg fault and is called through malloc() call from the test program.
Stack Trace at that point :
#0 0x0000000000000000 in ?? ()
#1 0x00007f11a6a94928 in __GI__dl_addr (address=0x7f11a69e67a0 <ptmalloc_init>, info=0x7fffe9393be0, mapp=0x7fffe9393c00, symbolp=0x0) at dl-addr.c:132
#2 0x00007f11a69e64d7 in ptmalloc_init () at arena.c:381
#3 0x00007f11a69e72b8 in ptmalloc_init () at arena.c:371
#4 malloc_hook_ini (sz=<optimized out>, caller=<optimized out>) at hooks.c:32
#5 0x00000000004005b3 in main () at test.c:20
On running the same program with the default installed linker on the machine the program runs fine.
I am not able to understand what can be the issue behind this? (Is it faulting because I am using the newly built linker without installing it first)
-Any suggestions or pointers are highly appreciated.
Thanks
(System details GCC 4.8.22, eglibc-2.15 Ubuntu 12.10 64bit
With the new linker built in buil-dir/elf/ld.so I ran a program supplying it as the argument to the newly built linker.
It that's all you did, then the crash is expected, because you are mixing newly-built loader with system libraries (which doesn't work: all parts of glibc must come from the same build of it).
What you need to do is:
buil-dir/elf/ld.so \
--library-path buil-dir:buil-dir/dlfcn:buil-dir/nptl:... \
/path/to/a.out
The list of directories to search must include all the libraries (parts of glibc) that your program uses.
For some reason i am not able to attach to my very own processes?! Works fine if i try strace as root.
$ ./list8 &
[1] 3141
$ child4 starts...
$ strace -p 3141
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
$ cat /proc/sys/kernel/yama/ptrace_scope
1
Running on lubuntu 13.10
Linux goal 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:19:42 UTC 2013 i686 i686 i686 GNU/Linux
So then how does gdb attach to user's own processes without having to muck around with kernel settings (ptrace_scope)??
Looks like you answered your own question -- you have ptrace_scope set to 1, so you can only trace direct children. To allow tracing any process belonging to the same user, set it to 0. This is also required to use the gdb attach command.
READ the /etc/sysctl.d/10-ptrace.conf file as your error message suggested...
If strace fails as root, try checking whether... gdb or strace is not running in the background (that was my case).
Command: ps aux | grep "gdb\|strace"
If this fails as root, I had a problem stracing enlightenment (e17) and the reason was that you can't strace a process already being straced or run under gdb, which some programs will do so that they can get their own debugging info.
My program loads a dynamic library, but after it tries to load it (it doesn't seem to, or at least something's amiss with the loading. A free() throws an error, and I commented out that line.)
I get the following in gdb.
Program received signal SIGSEGV, Segmentation fault.
__strlen_ia32 () at ../sysdeps/i386/i686/multiarch/../../i586/strlen.S:99
99 ../sysdeps/i386/i686/multiarch/../../i586/strlen.S: No such file or directory.
in ../sysdeps/i386/i686/multiarch/../../i586/strlen.S
How would I go about addressing this?
EDIT1:
The above issue was due to me not having an xml file where it should have been.
Here's the first error that I covered up to get to the initial error I showed.
(gdb) s
__dlopen (file=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so", mode=1)
at dlopen.c:76
76 dlopen.c: No such file or directory.
in dlopen.c
(gdb) bt
#0 __dlopen (file=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so",
mode=1) at dlopen.c:76
#1 0xb7f8680d in visual_plugin_get_references (
pluginpath=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so",
count=0xbfffd020) at lv_plugin.c:834
#2 0xb7f86168 in plugin_add_dir_to_list (list=0x804e428,
dir=0x804e288 "/usr/lib/libvisual-0.5/actor") at lv_plugin.c:609
#3 0xb7f86b2b in visual_plugin_get_list (paths=0x804e3d8,
ignore_non_existing=1) at lv_plugin.c:943
#4 0xb7f9c5db in visual_init (argc=0xbffff170, argv=0xbffff174)
at lv_libvisual.c:370
#5 0x080494b7 in main (argc=2, argv=0xbffff204) at client.c:32
(gdb) quit
A debugging session is active.
Inferior 1 [process 3704] will be killed.
Quit anyway? (y or n) y
starlon#lyrical:client$ ls /usr/lib/libvisual-0.5/actor/actor_AVS.so
/usr/lib/libvisual-0.5/actor/actor_AVS.so
starlon#lyrical:client$
The file exists. Not sure what's up. Not sure what code to provide either.
Edit2: More info on the file. Permissions are ok.
816K -rwxr-xr-x 1 root root 814K 2011-11-08 15:06 /usr/lib/libvisual-0.5/actor/actor_AVS.so
You didn't tell what dynamic library it is.
If it is a free dynamic library -or a library whose source is accessible to you- you can compile it and use it with debugging enabled.
Several Linux distributions -notably Debian & Ubuntu- provide debugging variant of many libraries (e.g. GLibc, GTK, Qt, etc...), so you don't need to rebuild them. For example, Debian has libgtk-3-0 package (the binary libraries mostly), libgtk-3-dev the development files for it (headers, etc...) and libgtk-3-0-dbg (the debugging variant of the library). You need to set LD_LIBRARY_PATH appropriately to use it (since it is in /usr/lib/debug/usr/lib/libgdk-3.so.0.200.1).
Sometimes, using the debugging variants of system libraries help you to find bugs in your own code. (Of course, you also need to compile with -g -Wall your own code)
Turned out this was due to a faulty hard drive. Looks like I need a new one.