linker issue or other? dynamically loaded lib - c

My program loads a dynamic library, but something goes wrong when it tries to: either the library is not loaded at all, or the load is incomplete. A free() call raised an error, so I commented that line out.
Now I get the following in gdb:
Program received signal SIGSEGV, Segmentation fault.
__strlen_ia32 () at ../sysdeps/i386/i686/multiarch/../../i586/strlen.S:99
99 ../sysdeps/i386/i686/multiarch/../../i586/strlen.S: No such file or directory.
in ../sysdeps/i386/i686/multiarch/../../i586/strlen.S
How would I go about addressing this?
EDIT1:
The above issue was due to me not having an xml file where it should have been.
Here's the first error that I covered up to get to the initial error I showed.
(gdb) s
__dlopen (file=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so", mode=1)
at dlopen.c:76
76 dlopen.c: No such file or directory.
in dlopen.c
(gdb) bt
#0 __dlopen (file=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so",
mode=1) at dlopen.c:76
#1 0xb7f8680d in visual_plugin_get_references (
pluginpath=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so",
count=0xbfffd020) at lv_plugin.c:834
#2 0xb7f86168 in plugin_add_dir_to_list (list=0x804e428,
dir=0x804e288 "/usr/lib/libvisual-0.5/actor") at lv_plugin.c:609
#3 0xb7f86b2b in visual_plugin_get_list (paths=0x804e3d8,
ignore_non_existing=1) at lv_plugin.c:943
#4 0xb7f9c5db in visual_init (argc=0xbffff170, argv=0xbffff174)
at lv_libvisual.c:370
#5 0x080494b7 in main (argc=2, argv=0xbffff204) at client.c:32
(gdb) quit
A debugging session is active.
Inferior 1 [process 3704] will be killed.
Quit anyway? (y or n) y
starlon@lyrical:client$ ls /usr/lib/libvisual-0.5/actor/actor_AVS.so
/usr/lib/libvisual-0.5/actor/actor_AVS.so
starlon@lyrical:client$
The file exists. Not sure what's up. Not sure what code to provide either.
Edit2: More info on the file. Permissions are ok.
816K -rwxr-xr-x 1 root root 814K 2011-11-08 15:06 /usr/lib/libvisual-0.5/actor/actor_AVS.so

You didn't say which dynamic library it is.
If it is a free library, or one whose source is accessible to you, you can compile it with debugging enabled and use that build.
Several Linux distributions, notably Debian and Ubuntu, provide debugging variants of many libraries (e.g. GLibc, GTK, Qt, etc.), so you don't need to rebuild them. For example, Debian has the libgtk-3-0 package (mostly the binary libraries), libgtk-3-dev (the development files: headers, etc.) and libgtk-3-0-dbg (the debugging variant of the library). You need to set LD_LIBRARY_PATH appropriately to use it (since it is in /usr/lib/debug/usr/lib/libgdk-3.so.0.200.1).
Sometimes, using the debugging variants of system libraries helps you find bugs in your own code. (Of course, you also need to compile your own code with -g -Wall.)
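Independently of debug symbols, it can help to ask the loader itself why a plugin fails to load. The following is a minimal sketch in C, assuming a glibc-style dlopen()/dlerror(); the helper name try_load_plugin is hypothetical and not part of libvisual. The mode=1 seen in the backtrace corresponds to RTLD_LAZY on glibc.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: try to dlopen() `path` and report the loader's
 * own error message on failure. Returns 0 on success, -1 on failure.
 * On failure, the dlerror() text is copied into `err` (if non-NULL). */
int try_load_plugin(const char *path, char *err, size_t errlen)
{
    dlerror();  /* clear any stale error state */
    void *handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL) {
        const char *msg = dlerror();
        if (err != NULL && errlen > 0) {
            snprintf(err, errlen, "%s", msg ? msg : "unknown dlopen error");
        }
        return -1;
    }
    dlclose(handle);
    return 0;
}
```

Calling try_load_plugin() with the plugin path and printing the buffer should show whatever reason the loader gives (an unresolved symbol, an unreadable file, and so on). On older glibc versions the program may need to be linked with -ldl.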

Turned out this was due to a faulty hard drive. Looks like I need a new one.

Related

Attaching to a process and call `dup2` on aarch64?

I tried attaching to a running process with gdb to redirect its stdout to an external file with these commands:
#Attaching
gdb -p 123456
#Redirecting (within GDB)
(gdb) p dup2(open("/tmp/my_stdout", 1089, 0777), 1)
I used the number 1089 because it represents O_WRONLY | O_CREAT | O_APPEND.
At first, GDB just complained about a missing return type:
'open64' has unknown return type; cast the call to its declared return type
So I modified my command to
#Redirecting (within GDB)
(gdb) p (int)dup2((int)open("/tmp/my_stdout", 1089, 0777), 1)
This was successfully executed, and also works.
I'm trying to figure out how to write a small utility that does exactly the same thing as the above:
1. attaches to a process by PID
2. calls (int)dup2((int)open("/tmp/my_stdout", 1089, 0777), 1) in it
Part 2 seems easy, but I can't get part 1 to work on aarch64, although I did manage to get it working on arm.
There are quite a few tools that try to solve this problem:
reptyr (doesn't work on process started by systemctl)
reredirect (doesn't support aarch64 at all)
injcode (doesn't support 64bit at all)
neercs (for sure no support for aarch64)
retty (for sure no support for aarch64)
If GDB can do it, this is surely possible, but GDB is too large to analyze quickly, and I hope there is a better solution than spending weeks or months digging through GDB's source.
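For reference, the expression evaluated through GDB can be written as an ordinary C function, which is the part such a utility would need to have executed inside the target. This is only a sketch of the redirect itself, assuming Linux with the generic flag values (as on x86 and aarch64); it does not cover attaching to another process, and the name redirect_fd_to_file is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make `target_fd` (e.g. 1 for stdout) refer to
 * `path`, opened for appending. Returns target_fd on success, -1 on error. */
int redirect_fd_to_file(int target_fd, const char *path)
{
    /* Symbolic flags instead of the literal 1089; on Linux x86 and aarch64
     * O_WRONLY | O_CREAT | O_APPEND is 1 + 64 + 1024 = 1089. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0777);
    if (fd < 0) {
        return -1;
    }
    int rc = dup2(fd, target_fd);
    if (fd != target_fd) {
        close(fd);  /* the duplicate keeps the file open */
    }
    return rc;
}
```

Using the symbolic constants avoids depending on the numeric value 1089, which is not guaranteed to be the same on every platform.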

How to fix GDB not finding file: "../sysdeps/unix/sysv/linux/raise.c:50"

We're learning to use GDB in my Computer Architecture class. Most of our work is done over SSH on a Raspberry Pi. When I run GDB on some code our instructor gave us to debug, it ends with an error message saying it can't find raise.c.
I've tried:
installing libc6, libc6-dbg (says they're already up-to-date)
apt-get source glibc (gives me: "You must put some 'source' URIs in your sources.list")
https://stackoverflow.com/a/48287761/12015458 (apt source returns same thing as the apt-get source above, the "find $PWD" command the user gave returns nothing)
looking for it manually where I was told it might be (/lib/libc doesn't exist for me)
This is the code he gave us to try debugging on GDB:
#include <stdio.h>
main()
{
int x,y;
y=54389;
for (x=10; x>=0; x--)
y=y/x;
printf("%d\n",y);
}
However, whenever I run the code in GDB I get the following error:
Program received signal SIGFPE, Arithmetic exception.
__GI_raise (sig=8) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
I asked him about it and he didn't really have any ideas on how to fix it.
It does not really matter that the source for raise() is not found. It would only show you the line where the exception is finally raised, but not the place where the error is triggered.
Run the erroneous program again in GDB. When the exception is raised, investigate the call stack and the stack frames with GDB's commands. This is the point of your task, so I won't give you more than this hint.
If you're clever you can see the error in the given source just by looking at it. ;-)
When GDB does not know any symbol, you need to compile with the option -g to get debugger support.
EDIT
Now on a Windows system this is my log (please excuse the colouring, I didn't find a language selector for plain text):
D:\tmp\StackOverflow\so_027 > type crash1.c
#include <stdio.h>
main()
{
int x,y;
y=54389;
for (x=10; x>=0; x--)
y=y/x;
printf("%d\n",y);
}
D:\tmp\StackOverflow\so_027 > gcc crash1.c -g -o crash1.out
crash1.c:2:1: warning: return type defaults to 'int' [-Wimplicit-int]
main()
^~~~
D:\tmp\StackOverflow\so_027 > dir
[...cut...]
04.09.2019 08:33 144 crash1.c
04.09.2019 08:40 54.716 crash1.out
D:\tmp\StackOverflow\so_027 > gdb crash1.out
GNU gdb (GDB) 8.1
[...cut...]
This GDB was configured as "x86_64-w64-mingw32".
[...cut...]
Reading symbols from crash1.out...done.
(gdb) run
Starting program: D:\tmp\StackOverflow\so_027\crash1.out
[New Thread 4520.0x28b8]
[New Thread 4520.0x33f0]
Thread 1 received signal SIGFPE, Arithmetic exception.
0x0000000000401571 in main () at crash1.c:7
7 y=y/x;
(gdb) backtrace
#0 0x0000000000401571 in main () at crash1.c:7
(gdb) help stack
Examining the stack.
The stack is made up of stack frames. Gdb assigns numbers to stack frames
counting from zero for the innermost (currently executing) frame.
At any time gdb identifies one frame as the "selected" frame.
Variable lookups are done with respect to the selected frame.
When the program being debugged stops, gdb selects the innermost frame.
The commands below can be used to select other frames by number or address.
List of commands:
backtrace -- Print backtrace of all stack frames
bt -- Print backtrace of all stack frames
down -- Select and print stack frame called by this one
frame -- Select and print a stack frame
return -- Make selected stack frame return to its caller
select-frame -- Select a stack frame without printing anything
up -- Select and print stack frame that called this one
Type "help" followed by command name for full documentation.
Type "apropos word" to search for commands related to "word".
Command name abbreviations are allowed if unambiguous.
(gdb) next
Thread 1 received signal SIGFPE, Arithmetic exception.
0x0000000000401571 in main () at crash1.c:7
7 y=y/x;
(gdb) next
[Inferior 1 (process 4520) exited with code 030000000224]
(gdb) next
The program is not being run.
(gdb) quit
D:\tmp\StackOverflow\so_027 >
Well, it directly marks the erroneous source line. That differs from your environment, since you use a Raspberry Pi. However, it shows you some GDB commands to try.
Concerning your video:
It is clear that inside raise() you can't access x. That's why GDB moans about it.
If an exception is raised usually the program is about to quit. So there is no value in stepping forward.
Instead, as shown in my log, use GDB commands to investigate the stack frames. I think this is what you are supposed to learn.
BTW, do you know that you should be able to copy the screen content? This will make reading so much easier for us.
From a practical standpoint the other answer is correct, but if you do want the libc sources:
apt-get source is the right way to get the sources of libc, but yes, you do need to have source repositories configured in /etc/apt/sources.list.
If you're using Ubuntu, see the deb-src lines in https://help.ubuntu.com/community/Repositories/CommandLine
For Debian, see https://wiki.debian.org/SourcesList#Example_sources.list
Then apt-get source should work. Remember to tell GDB where those sources are using the "directory" command.

Debugging functions in __libc_start_main

I'm writing a library that hooks some CUDA functions to add some functionality. The "constructor" hooks the CUDA functions and sets up a message queue and shared memory to communicate with other hooked CUDA binaries. When launching several hooked CUDA binaries (via Python's subprocess.Popen('<path-to-binary>', shell=True)), some processes hang. So I used gdb -p <pid> to attach to one suspended process, hoping to figure out what's going wrong. Here's the result:
Attaching to process 7445
Reading symbols from /bin/dash...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.27.so...done.
done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.27.so...done.
done.
0x00007f9cefe8b76a in wait4 () at ../sysdeps/unix/syscall-template.S:78
78 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007f9cefe8b76a in wait4 () at ../sysdeps/unix/syscall-template.S:78
#1 0x000055fff93be8a0 in ?? ()
#2 0x000055fff93c009d in ?? ()
#3 0x000055fff93ba6d8 in ?? ()
#4 0x000055fff93b949e in ?? ()
#5 0x000055fff93b9eda in ?? ()
#6 0x000055fff93b7944 in ?? ()
#7 0x00007f9cefdc8b97 in __libc_start_main (main=0x55fff93b7850, argc=3, argv=0x7ffca7c7beb8, init=<optimized out>,
fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffca7c7bea8) at ../csu/libc-start.c:310
#8 0x000055fff93b7a4a in ?? ()
I've added -g flag but it seems that the program hangs on wait4 before entering main.
Thanks for any insights on:
How can I load these debug symbols to get rid of ??
Where is ../csu/libc-start.c:310 located?
What else can I do to locate the bug?
System Info: gcc 6.5.0, Ubuntu 18.04 with 4.15.0-54-generic.
How can I load these debug symbols to get rid of ??
You appear to need the debug symbols for /bin/dash, which are probably going to be in a package called dash-dbg or dash-dbgsym or something like that.
Also, I suspect your stack trace would make more sense if you compiled your library with -fno-optimize-sibling-calls.
Where is ../csu/libc-start.c:310 located?
See this answer.
What else can I do to locate the bug?
You said that you are writing a library that uses __attribute__((constructor)), but you showed a stack trace for /bin/dash (which I presume is DASH and not a program you wrote) that does not appear to involve symbols from your library. I infer from this, that your library is loaded with LD_PRELOAD into programs that are not expecting it to be there.
Both of those things -- LD_PRELOAD and __attribute__((constructor)) -- break the normal expectations of both whatever unsuspecting program is involved, and the C library. You should only do those things if you have no other choice, and you should try to do as little as possible within the injected code. (In particular, I do not think any design that involves spawning processes from a constructor function will be workable, period.) If you tell us about your larger goals we may be able to suggest alternative means that are less troublesome.
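To illustrate "do as little as possible within the injected code", here is a minimal sketch of a constructor that only records that it ran and defers the real setup to the first explicit use. All names (hook_on_load, hook_ensure_ready and the two flags) are hypothetical, and the GCC/Clang constructor attribute is assumed; this is not the asker's library.

```c
static int hook_loaded = 0;  /* set by the constructor */
static int hook_ready  = 0;  /* set by the lazy initializer */

/* Runs when the library (or program) is loaded, before main().
 * Kept deliberately trivial: no process spawning, no IPC setup. */
__attribute__((constructor))
static void hook_on_load(void)
{
    hook_loaded = 1;
}

/* Hypothetical lazy initializer: heavier work (message queue, shared
 * memory) would go here, triggered by the first hooked call.
 * Note: not thread-safe as written. */
int hook_ensure_ready(void)
{
    if (!hook_ready) {
        /* placeholder for the real setup */
        hook_ready = 1;
    }
    return hook_loaded && hook_ready;
}
```

A hooked function would call hook_ensure_ready() before doing its own work, so a process that never reaches a hooked call (such as an intermediate shell) never pays for the setup.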
EDIT:
subprocess.Popen('<path-to-binary>', shell=True)
With shell=True, Python doesn't invoke the program directly, it runs a command of the form /bin/sh -c 'string passed to Popen'. In many cases this will naturally produce a /bin/dash process sleeping (not hung) in a wait syscall for the entire lifetime of the actual binary. Unless you actually need to evaluate some shell code before running the program, try the default shell=False instead and see if that makes your problem go away. (If you do need to evaluate shell code, try Popen('<shell code>; exec <binary>', shell=True).)
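What shell=True amounts to at the process level can be sketched in C: the caller starts /bin/sh -c '<command>', and that shell is the extra process that may be seen sleeping in a wait syscall while the real binary runs. The helper names run_via_shell and run_direct are hypothetical, and unlike Popen these helpers wait for the child to finish, purely to keep the sketch short.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for `pid`; return its exit status, or -1 if it did not exit normally. */
static int wait_for_exit(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Roughly the shell=True case: an extra /bin/sh process sits between
 * the caller and the program named in `cmd`. */
int run_via_shell(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);  /* only reached if exec failed */
    }
    return wait_for_exit(pid);
}

/* Roughly the shell=False case: exec the binary directly.
 * argv[0] must be a path to the executable in this sketch. */
int run_direct(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }
    return wait_for_exit(pid);
}
```

Attaching gdb to the child PID in the first case lands in the shell, not in the binary, which matches the /bin/dash backtrace in the question.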

Many core files (e.g. core.1678): how do I find exactly where the error is using gdb?

I have core.1678, core.1689 and so on. How can I resolve this problem using gdb? I have tried gdb's bt option, but it does not resolve the error.
gdb -bt core.1678
(gdb) core
No core file now.
(gdb) n
The program is not being run.
(gdb) r
Starting program:
No executable file specified.
Use the "file" or "exec-file" command.
(gdb) core.1678
/home/deepak/deepak/mss/.1678: No such file or directory.
(gdb) /home/deepak/deepak/mss/core.1678
help me out
many core files( e.g core.1678 etc )...
This indicates that the same program, or different programs in that directory, keep crashing. When your machine is configured to generate dump files, it creates them with names of the form core.(PID). There are many useful articles about core dump files. You may also refer to my blog, which explains core dump analysis and its internals.
http://mantoshopensource.blogspot.sg/2011/02/core-dump-analysis-part-ii.html
The basic command to load and analyze the core dump file using GDB is as follows:
mantosh@ubuntu:~$ gdb
// This is how you would open the core dump file.
(gdb) core core.23515
(no debugging symbols found)
Core was generated by `./otest LinuxWorldRocks 10'.
Program terminated with signal 11, Segmentation fault.
[New process 23515]
==> Signal 11(SIGSEGV) was the reason for this core-dump file
==> pid of a program is 23515
#0 0x080485f8 in ?? ()
// Load the debug symbol of your program(build with -g option)
(gdb) symbol ./otest
Reading symbols from /home/mantosh/Desktop/otest...done.
// Now you can execute any normal command which you perform while debugging(except breakpoints).
(gdb) bt
#0 0x080485f8 in printf_info (info=0x8ec5008 "LinuxWorld") at test.c:58
#1 0x080485c2 in my_memcpy (dest=0x8ec5012 "", source=0xbfb9c6fe "Rocks",
length=10) at test.c:47
#2 0x0804855d in main (argc=3, argv=0xbfb9b3f4) at test.c:33
EDIT
A core dump file is a snapshot of that particular program at the time of the exception/segmentation fault. So once you load a core dump in GDB, you can only execute commands that read the memory information. You cannot use debugging commands like breakpoints, continue, run, etc.
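As noted above, core.(PID) files only appear when the system is configured to write them, and one part of that is the process's core-size limit. A minimal sketch, assuming a POSIX system with getrlimit/setrlimit; the helper name enable_core_dumps is hypothetical, and where the file ends up still depends on system settings not covered here.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_CORE limit to the hard limit
 * so that a crash can produce a core file.
 * Returns 0 on success, -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return -1;
    }
    lim.rlim_cur = lim.rlim_max;  /* soft limit may be raised up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling this early in main() of a program built with -g is one way to make sure a later crash leaves a core file that can be loaded as shown above.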

Seg fault on running a program through linker?

I downloaded the source for libc6 and completed the build process successfully. (I deliberately did not run make install.)
With the new linker built in buil-dir/elf/ld.so I ran a program supplying it as the argument to the newly built linker.
The test code prints some string and then malloc(sizeof(char)*1024).
On running the test binary as an argument to the newly built linker I get a Seg Fault at elf/dl-addr.c:132 which is:
131 /* Protect against concurrent loads and unloads. */
132 __rtld_lock_lock_recursive (GL(dl_load_lock));
This is the last frame before the seg fault and is called through malloc() call from the test program.
Stack Trace at that point :
#0 0x0000000000000000 in ?? ()
#1 0x00007f11a6a94928 in __GI__dl_addr (address=0x7f11a69e67a0 <ptmalloc_init>, info=0x7fffe9393be0, mapp=0x7fffe9393c00, symbolp=0x0) at dl-addr.c:132
#2 0x00007f11a69e64d7 in ptmalloc_init () at arena.c:381
#3 0x00007f11a69e72b8 in ptmalloc_init () at arena.c:371
#4 malloc_hook_ini (sz=<optimized out>, caller=<optimized out>) at hooks.c:32
#5 0x00000000004005b3 in main () at test.c:20
On running the same program with the default installed linker on the machine the program runs fine.
I am not able to understand what the issue behind this could be. (Is it faulting because I am using the newly built linker without installing it first?)
Any suggestions or pointers are highly appreciated.
Thanks
(System details: GCC 4.8.22, eglibc-2.15, Ubuntu 12.10 64bit)
With the new linker built in buil-dir/elf/ld.so I ran a program supplying it as the argument to the newly built linker.
If that's all you did, then the crash is expected, because you are mixing the newly built loader with system libraries (which doesn't work: all parts of glibc must come from the same build).
What you need to do is:
buil-dir/elf/ld.so \
--library-path buil-dir:buil-dir/dlfcn:buil-dir/nptl:... \
/path/to/a.out
The list of directories to search must include all the libraries (parts of glibc) that your program uses.
