I am currently working on a small project that captures packets and, when a specific limit is exceeded, triggers an external application or script to handle the follow-up steps (for example alerting or null routing).
I tried to create a really simple trigger with this code ("trigger" contains the path to binary or script):
char * trigger_complete;
sprintf(trigger_complete, "%s %u %u %s %s %Lf", trigger, data[II].count, data[II].proto, inet_ntoa(data[II].src_ip), inet_ntoa(data[II].dst_ip), rate);
system (trigger_complete);
On my Ubuntu 12.04.1 LTS system it seems to work without issues; I tested it with the "echo" application.
linux-gate.so.1 => (0xb779f000)
libpcap.so.0.8 => /usr/lib/i386-linux-gnu/libpcap.so.0.8 (0xb7752000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb75a8000)
/lib/ld-linux.so.2 (0xb77a0000)
On Debian Wheezy and on an instance of "grml" I get a segmentation fault when executing the binary. (I verified that the piece of code above causes it by commenting it out and retrying.)
linux-vdso.so.1 => (0x00007fff875ff000)
libpcap.so.0.8 => /usr/lib/x86_64-linux-gnu/libpcap.so.0.8 (0x00007fa9c6048000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa9c5cbe000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa9c608d000)
The only difference I see is the architecture: the Ubuntu system is a 32-bit OS while the Debian system is a 64-bit OS.
I am not sure if this is the issue, but it seems like it is.
Can anyone help me out with this?
Thank you in advance!
Since feature requests to mark a comment as an answer remain declined, I copy the above solution here.
... might want to allocate some memory for trigger_complete! – Thomas
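Following the comment above, here is a minimal sketch of one way to give trigger_complete real storage: ask snprintf for the required length, allocate that many bytes, then format again. The helper name build_trigger_command and its parameters are hypothetical and not from the original code; the two addresses are assumed to be passed in as already-converted strings. (As a side note, inet_ntoa returns a pointer to a static buffer, so two calls inside one argument list can yield the same text twice.)

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds "<trigger> <count> <proto> <src> <dst> <rate>"
 * in a freshly allocated buffer. Returns NULL on failure; the caller frees. */
char *build_trigger_command(const char *trigger, unsigned count,
                            unsigned proto, const char *src,
                            const char *dst, long double rate)
{
    assert(trigger != NULL && src != NULL && dst != NULL);

    /* First pass: with size 0 nothing is written, but the return value
     * is the number of characters the full string needs. */
    int needed = snprintf(NULL, 0, "%s %u %u %s %s %Lf",
                          trigger, count, proto, src, dst, rate);
    if (needed < 0)
        return NULL;

    char *cmd = malloc((size_t)needed + 1);   /* +1 for the terminating NUL */
    if (cmd == NULL)
        return NULL;

    /* Second pass: format into the allocated buffer. */
    snprintf(cmd, (size_t)needed + 1, "%s %u %u %s %s %Lf",
             trigger, count, proto, src, dst, rate);
    return cmd;
}
```

The returned string could then be handed to system() and released with free() afterwards. The uninitialized pointer in the question most likely only appeared to work on the 32-bit machine by accident.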
I tried attaching to a running process with gdb to redirect its stdout to an external file with these commands:
#Attaching
gdb -p 123456
#Redirecting (within GDB)
(gdb) p dup2(open("/tmp/my_stdout", 1089, 0777), 1)
I used the number 1089 because it represents O_WRONLY | O_CREAT | O_APPEND.
First, GDB just complained about a missing return type:
'open64' has unknown return type; cast the call to its declared return type
So I modified my command to
#Redirecting (within GDB)
(gdb) p (int)dup2((int)open("/tmp/my_stdout", 1089, 0777), 1)
This was successfully executed, and also works.
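For reference, the expression that GDB evaluates corresponds roughly to the following C, shown here as a sketch that runs inside the calling process itself. It does not cover attaching to another process by PID. The helper name redirect_fd_to_file is hypothetical, and the mode 0644 is my choice in place of the 0777 used above.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make target_fd (1 for stdout) append to path.
 * Returns 0 on success, -1 on failure. */
int redirect_fd_to_file(const char *path, int target_fd)
{
    assert(path != NULL);

    /* Symbolic flags instead of the literal 1089; the numeric values
     * of these constants are platform-specific. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    if (dup2(fd, target_fd) < 0) {
        close(fd);
        return -1;
    }
    if (fd != target_fd)
        close(fd);   /* target_fd now refers to the file; drop the spare */
    return 0;
}
```

Calling redirect_fd_to_file("/tmp/my_stdout", 1) would have the same effect as the GDB command, but only for the process that makes the call.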
I'm trying to figure out how I can write a small utility that does exactly the same thing as the above:
attaches to a process by PID
calls this (int)dup2((int)open("/tmp/my_stdout", 1089, 0777), 1)
Part 2 seems easy; however, part 1 doesn't seem to work on aarch64. I did manage to get it working on arm, though.
There are quite a few tools that try to solve this problem:
reptyr (doesn't work on process started by systemctl)
reredirect (doesn't support aarch64 at all)
injcode (doesn't support 64bit at all)
neercs (for sure no support for aarch64)
retty (for sure no support for aarch64)
If GDB can do it, this is surely possible, but GDB is huge to analyze, and I hope there is a better solution that would not take weeks or months of digging through GDB's source.
I have the same exact versions of numpy (1.9.3) and scipy (0.17.0.dev0+7dd2b91) installed on my laptop and on a computing cluster.
When I run scipy.test on my laptop, it completes without any failures. But when I run scipy.test on the computing cluster, it completes with a single failure, reported in this question.
I've traced the cause of this failure to the file scipy/linalg/_decomp_update.so, which is a C file (I believe . . . or C++?), not a Python file.
Hence, I've concluded that the C software on my laptop differs from that on the cluster.
My question is, what is the relevant C software? Which compiler does scipy use by default? How do I check what version of C I have installed?
Update #1
Note that the .so file is compiled. The original files in the Git repo from which I installed scipy are _decomp_update.c, _decomp_update.pyx, and _decomp_update.pyx.in.
Perhaps the relevant difference between my laptop and the cluster isn't in the C code, but in the Python package that translates between C and Python (which in this case appears to be cython)?
My laptop has cython version 0.23.2.
The cluster has cython version 0.22.1.
I am currently updating the cluster's version and re-running the tests.
Update #2
I now have cython version 0.23.3 installed on both my laptop and the cluster.
The failure persists on the cluster; it continues not to occur on my laptop.
Hence, the difference seems to be in the C implementation itself, not in Python or cython.
Because the cython docs mention gcc as the standard C compiler it uses, it makes sense to me to check this.
On the cluster, I have gcc version 4.4.7.
On my laptop, I have 4.8.4.
In the future I may want to update the cluster's version and re-run the tests.
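On the command line, gcc --version prints the installed version. As a small sketch of the same check from C, the predefined compiler macros can be printed. Note that this reports the compiler that built this snippet, which is only the one scipy used if both were built with the same compiler command. The helper name describe_compiler is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes a description of the compiler that built
 * this translation unit into buf and returns snprintf's result. */
int describe_compiler(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
#if defined(__clang__)
    /* clang also defines __GNUC__, so it has to be checked first. */
    return snprintf(buf, size, "clang %d.%d.%d",
                    __clang_major__, __clang_minor__, __clang_patchlevel__);
#elif defined(__GNUC__)
    return snprintf(buf, size, "gcc %d.%d.%d",
                    __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#else
    return snprintf(buf, size, "unknown compiler");
#endif
}
```

Building and running this on both the laptop and the cluster would show whether the two machines really compile with 4.8.4 and 4.4.7.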
Update #3
I aborted the update to gcc in order to investigate whether the cluster's versions of LAPACK and BLAS differ from those on my laptop (see the comment below). I followed this answer.
This is what I see for my laptop:
$ cd /usr/local/lib/python2.7/dist-packages/scipy/linalg/
$ ldd cython_lapack.so
linux-gate.so.1 => (0xb7792000)
liblapack.so.3 => /usr/lib/liblapack.so.3 (0xb7159000)
libblas.so.3 => /usr/lib/libblas.so.3 (0xb6a7e000)
libgfortran.so.3 => /usr/lib/i386-linux-gnu/libgfortran.so.3 (0xb697f000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb6939000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb691c000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb676e000)
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xb6752000)
libquadmath.so.0 => /usr/lib/i386-linux-gnu/libquadmath.so.0 (0xb66d6000)
/lib/ld-linux.so.2 (0xb7793000)
This is what I see on the cluster:
dbliss#nx3[~]> cd lib/python2.7/site-packages/scipy/linalg/
dbliss#nx3[linalg]> ldd cython_lapack.so
linux-vdso.so.1 => (0x00007ffe6bbec000)
liblapack.so.3 => /usr/lib64/atlas/liblapack.so.3 (0x00007fceb3f35000)
libblas.so.3 => /usr/lib64/libblas.so.3 (0x00007fceb3cdd000)
libpython2.7.so.1.0 => not found
libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007fceb39eb000)
libm.so.6 => /lib64/libm.so.6 (0x00007fceb3766000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fceb3550000)
libc.so.6 => /lib64/libc.so.6 (0x00007fceb31bc000)
libf77blas.so.3 => /usr/lib64/atlas/libf77blas.so.3 (0x00007fceb2f9c000)
libcblas.so.3 => /usr/lib64/atlas/libcblas.so.3 (0x00007fceb2d7c000)
/lib64/ld-linux-x86-64.so.2 (0x000000377cc00000)
libatlas.so.3 => /usr/lib64/atlas/libatlas.so.3 (0x00007fceb2720000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fceb2502000)
There are differences here for sure, but are there any important differences?
I have implemented SNMP subagent functionality in my application using the net-snmp library (http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/).
The application crashes in the init_agent() call.
GDB backtrace:
#0 0x00002b123483aaa1 in init_traps () from /usr/lib64/libnetsnmpagent.so.10
#1 0x00002b1234835cd0 in init_agent () from /usr/lib64/libnetsnmpagent.so.10
...
The error message at "/var/log/messages":
sample_app.exe[6642]: segfault at 0000000000659de0 rip 00002ac2749c2aa1 rsp 00007fff38c6ec48 error 7
I am using "NET-SNMP version: 5.3.2.2" on CentOS 5.5(elf5) 64Bit. The sample_app code is the same as the one provided in the tutorial (http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/example-demon.c).
init_agent() is supposed to take the config file name as its argument. I have tried passing the name of a config file with a correct configuration, one with an incorrect configuration, and one that does not exist. In each case the application crashes with the same error.
Please suggest any tools/links which will help me identify the actual cause of the crash. Any link for resolution of similar issue will also be helpful.
Thanks
Edit-
The issue has been resolved. The variable 'snmptrap_oid_len' was declared and used in the application's MIB C code, but it is already defined in the net-snmp library (agent_trap.c). This caused the conflict and hence the crash.
PS: If you face a similar issue, ensure that the variables 'snmptrap_oid' and 'snmptrap_oid_len' are not redeclared and are used correctly in your MIB C code.
I downloaded the source for libc6 and completed the build process successfully. (I deliberately did not run make install.)
With the new linker built in buil-dir/elf/ld.so I ran a program supplying it as the argument to the newly built linker.
The test code prints some string and then malloc(sizeof(char)*1024).
On running the test binary as an argument to the newly built linker I get a Seg Fault at elf/dl-addr.c:132 which is:
131 /* Protect against concurrent loads and unloads. */
132 __rtld_lock_lock_recursive (GL(dl_load_lock));
This is the last frame before the seg fault and is called through malloc() call from the test program.
Stack Trace at that point :
#0 0x0000000000000000 in ?? ()
#1 0x00007f11a6a94928 in __GI__dl_addr (address=0x7f11a69e67a0 <ptmalloc_init>, info=0x7fffe9393be0, mapp=0x7fffe9393c00, symbolp=0x0) at dl-addr.c:132
#2 0x00007f11a69e64d7 in ptmalloc_init () at arena.c:381
#3 0x00007f11a69e72b8 in ptmalloc_init () at arena.c:371
#4 malloc_hook_ini (sz=<optimized out>, caller=<optimized out>) at hooks.c:32
#5 0x00000000004005b3 in main () at test.c:20
On running the same program with the default installed linker on the machine the program runs fine.
I cannot work out what the issue is. Is it faulting because I am using the newly built linker without installing it first?
Any suggestions or pointers are highly appreciated.
Thanks
(System details: GCC 4.8.22, eglibc-2.15, Ubuntu 12.10 64-bit)
With the new linker built in buil-dir/elf/ld.so I ran a program supplying it as the argument to the newly built linker.
If that's all you did, then the crash is expected, because you are mixing the newly built loader with the system libraries. That doesn't work: all parts of glibc must come from the same build.
What you need to do is:
buil-dir/elf/ld.so \
--library-path buil-dir:buil-dir/dlfcn:buil-dir/nptl:... \
/path/to/a.out
The list of directories to search must include all the libraries (parts of glibc) that your program uses.
My program loads a dynamic library, but something goes wrong when it tries to load it (it doesn't seem to load, or at least something's amiss with the loading; a free() threw an error, so I commented out that line).
I get the following in gdb.
Program received signal SIGSEGV, Segmentation fault.
__strlen_ia32 () at ../sysdeps/i386/i686/multiarch/../../i586/strlen.S:99
99 ../sysdeps/i386/i686/multiarch/../../i586/strlen.S: No such file or directory.
in ../sysdeps/i386/i686/multiarch/../../i586/strlen.S
How would I go about addressing this?
EDIT1:
The above issue was due to me not having an xml file where it should have been.
Here's the first error that I covered up to get to the initial error I showed.
(gdb) s
__dlopen (file=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so", mode=1)
at dlopen.c:76
76 dlopen.c: No such file or directory.
in dlopen.c
(gdb) bt
#0 __dlopen (file=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so",
mode=1) at dlopen.c:76
#1 0xb7f8680d in visual_plugin_get_references (
pluginpath=0xbfffd03c "/usr/lib/libvisual-0.5/actor/actor_AVS.so",
count=0xbfffd020) at lv_plugin.c:834
#2 0xb7f86168 in plugin_add_dir_to_list (list=0x804e428,
dir=0x804e288 "/usr/lib/libvisual-0.5/actor") at lv_plugin.c:609
#3 0xb7f86b2b in visual_plugin_get_list (paths=0x804e3d8,
ignore_non_existing=1) at lv_plugin.c:943
#4 0xb7f9c5db in visual_init (argc=0xbffff170, argv=0xbffff174)
at lv_libvisual.c:370
#5 0x080494b7 in main (argc=2, argv=0xbffff204) at client.c:32
(gdb) quit
A debugging session is active.
Inferior 1 [process 3704] will be killed.
Quit anyway? (y or n) y
starlon#lyrical:client$ ls /usr/lib/libvisual-0.5/actor/actor_AVS.so
/usr/lib/libvisual-0.5/actor/actor_AVS.so
starlon#lyrical:client$
The file exists. Not sure what's up. Not sure what code to provide either.
Edit2: More info on the file. Permissions are ok.
816K -rwxr-xr-x 1 root root 814K 2011-11-08 15:06 /usr/lib/libvisual-0.5/actor/actor_AVS.so
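If the dlopen call itself is what fails, the loader's own explanation is usually more informative than a file listing. A minimal sketch, assuming glibc (where the mode=1 seen in the backtrace corresponds to RTLD_LAZY); the helper name try_load is hypothetical and not part of libvisual:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: try to load path and, on failure, print the
 * dynamic loader's explanation. Returns 0 on success, -1 on failure. */
int try_load(const char *path)
{
    assert(path != NULL);

    void *handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL) {
        /* dlerror() describes the most recent dl* failure, for example
         * a missing dependency or an undefined symbol. */
        const char *why = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path, why ? why : "(no message)");
        return -1;
    }
    dlclose(handle);
    return 0;
}
```

Calling try_load("/usr/lib/libvisual-0.5/actor/actor_AVS.so") would print the reason if the plugin cannot be loaded. On older glibc versions the program needs to be linked with -ldl.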
You didn't say which dynamic library it is.
If it is a free dynamic library -or a library whose source is accessible to you- you can compile it and use it with debugging enabled.
Several Linux distributions -notably Debian & Ubuntu- provide debugging variants of many libraries (e.g. GLibc, GTK, Qt, etc...), so you don't need to rebuild them. For example, Debian has the libgtk-3-0 package (mostly the binary libraries), libgtk-3-dev (the development files for it: headers, etc...) and libgtk-3-0-dbg (the debugging variant of the library). You need to set LD_LIBRARY_PATH appropriately to use it (since it is in /usr/lib/debug/usr/lib/libgdk-3.so.0.200.1).
Sometimes, using the debugging variants of system libraries helps you find bugs in your own code. (Of course, you also need to compile your own code with -g -Wall.)
Turned out this was due to a faulty hard drive. Looks like I need a new one.