Unable to find stack smashing function using GDB - c

I have the following C application:
#include <stdio.h>
void smash()
{
int i;
char buffer[16];
for(i = 0; i < 17; i++) // <-- exceeds the limit of the buffer
{
buffer[i] = i;
}
}
int main()
{
printf("Starting\n");
smash();
return 0;
}
I cross-compiled using the following version of gcc:
armv5l-linux-gnueabi-gcc -v
Using built-in specs.
Target: armv5l-linux-gnueabi
Configured with: /home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/gcc-4.4.1/gcc-4.4.1/configure --target=armv5l-linux-gnueabi --host=i486-linux-gnu --build=i486-linux-gnu --prefix=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/toolchain --with-sysroot=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/toolchain --with-headers=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/toolchain/include --enable-languages=c,c++ --with-gmp=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/gmp-5.0.0/gmp-host-install --with-mpfr=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/mpfr-2.4.2/mpfr-host-install --disable-nls --disable-libgcj --disable-libmudflap --disable-libssp --disable-libgomp --enable-checking=release --with-system-zlib --with-arch=armv5t --with-gnu-as --with-gnu-ld --enable-shared --enable-symvers=gnu --enable-__cxa_atexit --disable-nls --without-fp --enable-threads
Thread model: posix
gcc version 4.4.1 (GCC)
Invoked like this:
armv5l-linux-gnueabi-gcc -ggdb3 -fstack-protector-all -O0 test.c
When run on target, it outputs:
Starting
*** stack smashing detected ***: ./a.out terminated
Aborted (core dumped)
I load the resulting core dump in gdb yielding the following backtrace:
GNU gdb (GDB) 7.0.1
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i486-linux-gnu --target=armv5l-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/andersn/workspace/stacktest/a.out...done.
Reading symbols from /home/andersn/workspace/stacktest/linux/toolchain/lib/libc.so.6...done.
Loaded symbols for /home/andersn/workspace/stacktest/linux/toolchain/lib/libc.so.6
Reading symbols from /home/andersn/workspace/stacktest/linux/toolchain/lib/ld-linux.so.3...done.
Loaded symbols for /home/andersn/workspace/stacktest/linux/toolchain/lib/ld-linux.so.3
Reading symbols from /home/andersn/workspace/stacktest/linux/toolchain/lib/libgcc_s.so.1...done.
Loaded symbols for /home/andersn/workspace/stacktest/linux/toolchain/lib/libgcc_s.so.1
Core was generated by `./a.out'.
Program terminated with signal 6, Aborted.
#0 0x40052d4c in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
67 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0 0x40052d4c in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
#1 0x40054244 in *__GI_abort () at abort.c:92
#2 0x40054244 in *__GI_abort () at abort.c:92
#3 0x40054244 in *__GI_abort () at abort.c:92
#4 0x40054244 in *__GI_abort () at abort.c:92
#5 0x40054244 in *__GI_abort () at abort.c:92
#6 0x40054244 in *__GI_abort () at abort.c:92
... and so on ...
Now, the question:
I'm unable to find the function causing the stack smashing from GDB, even though the smash() function doesn't overwrite any structural data on the stack, only the stack protector itself. What should I do?

The problem is that the version of GCC which compiled your target libc.so.6 is buggy and did not emit correct unwind descriptors for __GI_raise. With incorrect unwind descriptors, GDB gets into a loop while unwinding the stack.
You can examine the unwind descriptors with
readelf -wf /home/andersn/workspace/stacktest/linux/toolchain/lib/libc.so.6
I expect you'll get the exact same result in GDB from any program calling abort, e.g.
#include <stdlib.h>
void foo() { abort(); }
int main() { foo(); return 0; }
Unfortunately, there isn't much you can do, other than building a newer version of GCC and then rebuilding the whole "world" with it.

GDB cannot always work out what happened to a smashed stack, even with -fstack-protector-all (and even with -Wstack-protector to warn about functions whose frames weren't protected).
In these cases the stack protector has done its job (killed a misbehaving app) but hasn't done the debugger any favors. (The classic example is a stack smash where a write has occurred with a large enough stride that it jumps the canary.) It may then become necessary to binary search through the code with breakpoints to narrow down which region is causing the smash, then single-step through the smash to see how it happened.

Have you tried resolving this complaint: "../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory." to see if actually being able to resolve symbols would help?

Related

GDB symbols missing - libc claimed to be wrong library or version mismatch

I am having trouble showing proper debug symbols in the backtrace in GDB in an ARM cross-compiled system, built using Yocto.
abc.c is a simple printf("Hello world\n"); program in C (nothing tricky). On the build machine:
> yocto-dir/build/tmp-angstrom-glibc/sysroots/x86_64-linux/usr/bin/arm-angstrom-linux-gnueabi/arm-angstrom-linux-gnueabi-gcc abc.c --sysroot=yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm -g -O0 -o abc
> scp abc root@DEVICE-IP:~
On the ARM target:
> gdbserver :2345 abc
Start GDB on the build machine (from installed Yocto SDK):
> /usr/local/oecore-x86_64/sysroots/x86_64-angstromsdk-linux/usr/bin/arm-angstrom-linux-gnueabi/arm-angstrom-linux-gnueabi-gdb abc
GNU gdb (Linaro GDB) 7.8-2014.09
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-angstromsdk-linux --target=arm-angstrom-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.linaro.org>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from abc...done.
(gdb) target remote DEVICE-IP:2345
Remote debugging using DEVICE-IP:2345
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
Cannot access memory at address 0x0
0x4ae90a20 in ?? ()
(gdb) bt
#0 0x4ae90a20 in ?? ()
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) set sysroot yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm
Reading symbols from yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/lib/ld-linux.so.3...done.
Loaded symbols for yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/lib/ld-linux.so.3
Cannot access memory at address 0x0
After setting the sysroot, it still does not give symbols.
(gdb) bt
#0 0x4ae90a20 in ?? ()
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) b main
Breakpoint 1 at 0x84a8: file abc.c, line 5.
(gdb) c
Continuing.
Breakpoint 1, main () at abc.c:5
5 printf("Hello world\n");
Okay, when it hits a breakpoint, it does display symbols.
(gdb) bt
Cannot access memory at address 0x0
#0 main () at abc.c:5
However, stepping beyond that point behaves strangely.
(gdb) n
Cannot access memory at address 0x1
0x4aea6ea0 in ?? ()
(gdb) bt
#0 0x4aea6ea0 in ?? ()
#1 0x0000a014 in do_lookup_unique (Cannot access memory at address 0x1
undef_map=0x1, ref=0x0, strtab=0x56ebb27 <error: Cannot access memory at address 0x56ebb27>, sym=0x84a0 <main>, type_class=-1224757248, result=0x1, map=<optimized out>,
new_hash=<optimized out>, undef_name=<optimized out>) at /usr/src/debug/glibc/2.24-r0/git/elf/dl-lookup.c:332
#2 do_lookup_x (undef_name=<optimized out>, new_hash=<optimized out>, old_hash=<optimized out>, ref=0x0, result=<optimized out>, scope=0x177ff8e, i=<optimized out>, version=<optimized out>,
flags=-1224757248, skip=0x1, type_class=100, undef_map=0x1) at /usr/src/debug/glibc/2.24-r0/git/elf/dl-lookup.c:544
#3 0x4aec0b10 in ?? ()
Cannot access memory at address 0x1
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
It can't find the proper version of libc.so.6.
(gdb) info sharedlibrary
warning: .dynamic section for "yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/lib/libc.so.6" is not at the expected address (wrong library or version mismatch?)
From To Syms Read Shared Object Library
0x000007d0 0x0001bee0 Yes yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/lib/ld-linux.so.3
0x4aee73c0 0x4afe2018 No yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/lib/libc.so.6
(gdb) n
Cannot find bounds of current function
It does not give an ideal debugging experience.
There is a gcc inside yocto-dir sysroot (as used above), as well as in /usr/local/oecore-x86_64. They both behave the same. The /usr/local/oecore-x86_64 SDK is freshly built and installed.
Similarly, there is an imx28scm sysroot inside yocto-dir (as used above), as well as in /usr/local/oecore-x86_64, and they both behave the same. However, they clearly do have different versions of libc.so.6 - yocto-dir's is 14.8MB, and /usr/local/oecore-x86_64's is 1.3MB. This is a concern, however setting either of these locations as the sysroot does not fix the problem.
One workaround is to link with -static. GDB does give symbols in this case:
(gdb) target remote DEVICE-IP:2345
Remote debugging using DEVICE-IP:2345
_start () at ../sysdeps/arm/start.S:79
79 ../sysdeps/arm/start.S: No such file or directory.
(gdb) set sysroot yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm
(gdb) bt
#0 _start () at ../sysdeps/arm/start.S:79
(gdb) b main
Breakpoint 1 at 0x8480: file abc.c, line 5.
(gdb) c
Continuing.
Breakpoint 1, main () at abc.c:5
5 printf("Hello world\n");
(gdb) n
6 return 0;
(gdb) n
7 }
Linking with -Wl,--verbose seems to show it is linking with the library in the expected sysroot:
yocto-dir/build/tmp-angstrom-glibc/sysroots/x86_64-linux/usr/libexec/arm-angstrom-linux-gnueabi/gcc/arm-angstrom-linux-gnueabi/6.2.1/ld: Attempt to open yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/lib/libc.so.6 succeeded
The linker also finds this one, but it isn't referred to as libc.so.6, so presumably this is not interfering.
yocto-dir/build/tmp-angstrom-glibc/sysroots/x86_64-linux/usr/libexec/arm-angstrom-linux-gnueabi/gcc/arm-angstrom-linux-gnueabi/6.2.1/ld: Attempt to open yocto-dir/build/tmp-angstrom-glibc/sysroots/imx28scm/usr/lib/libc.so succeeded
Why is there a library version mismatch in this case? How can I get GDB to display symbols from the library which it expects? I do not wish to link statically.
Please make sure the libc on the target device is the same as the one in the sysroot on your build server.
(Sorry, this should be a comment, but I don't have enough reputation yet.)
Apparently, GDB for ARM targets has trouble loading symbols before main() is reached (see "Debugging shared libraries with gdbserver"):
The problem I had was that gdbserver stops at the dynamic loader, before main, and the dynamic libraries are not yet loaded at that point, and so GDB does not know where the symbols will go in memory yet.
GDB appears to have some mechanisms to automatically load shared library symbols, and if I compile for host, and run gdbserver locally, running to main is not needed. But on the ARM target, that is the most reliable thing to do.
Therefore, set it to load shared symbols after main has been hit:
> b main
> c
<breakpoint hit>
> set sysroot <sysroot>
Or reload the symbols after you hit main.
> set sysroot <sysroot>
...
> b main
> c
<breakpoint hit>
> nosharedlibrary
> sharedlibrary
Or, when interfacing with IDE debuggers, it might be useful to turn off automatic loading of shared library symbols at GDB startup:
> set auto-solib-add off

GDB debugging arguments passed through newlib

I am trying to use newlib on a TI CC2538 ARM Cortex M3 part. The objective is to use printf for debugging messages and I've actually got that working. However the system will segfault after a number of messages (ARM calls that a HardFault) and I have no idea why.
I used GDB to get the following stack trace:
(gdb) bt full
#0 FaultISR () at src/startup_gcc.c:307
fault_stat = 0x8200
hfault_stat = 0x40000000
mmfault_stat = 0xfffffff8
busfault_stat = 0xfffffff8
buf = "\000\000\000\000\371\377\377\377\026\000\000\000\000\000\000\000\n\000\000\000\000\000\000\000\b\f\000 \002\000\000\000\060\r\000 \320\f\000 \002\000\000\000\303w \000\000\000\000\000w?\032\000\360\v\000 \360\v\000 \002\000\000\000\027\000\000\000\003\000\000\000\277\022\035\000\b\f\000 \b\f\000 \r\000 \027\000\000\000\f\000\000\000\371\377\377\377T\025\000 a\217 \000\r\000\000\000\001\000\000\000\210\r\000 s\263\""
#1 <signal handler called>
No symbol table info available.
#2 0x002076dc in strlen ()
No symbol table info available.
#3 0x0020494a in _svfprintf_r ()
No symbol table info available.
#4 0x002045aa in _vsnprintf_r ()
No symbol table info available.
#5 0x002045fe in vsnprintf ()
No symbol table info available.
#6 0x00200fa8 in ws_debug_vfprintf (f=STDERR, fmt=0x208f24 "Received a packet of length %u\n", args=...) at src/os/ws_debug.c:220
len = 0x27
buf = 0x20001538 "DBG src/net/mac/packet_scheduler.c:123 "
#7 0x00200f4e in ws_debug_fprintf (f=STDERR, fmt=0x208f24 "Received a packet of length %u\n") at src/os/ws_debug.c:201
args = {__ap = 0x20000e50 <pui32Stack+440>}
#8 0x00201dea in packet_scheduler_timer () at src/net/mac/packet_scheduler.c:123
phy_len = 0xd
pkt = 0x8ef6
fcf = 0x0
buf = 0x20001738 "\200\220"
#9 0x00200c32 in ws_timer_update () at src/os/ws_timer.c:156
ptr = 0x20001728
new = 0x20001728
#10 0x002005c2 in main () at src/main.c:143
No locals.
As you can see, the newlib parts (#2 - #5) don't have any information with them making it hard to debug. I assume this is because newlib was stripped of debugging symbols, however I recompiled newlib and get the same result.
I am running Arch Linux using the toolchain available in the official repos.
arm-none-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-none-eabi/5.1.0/lto-wrapper
Target: arm-none-eabi
Configured with: /build/arm-none-eabi-gcc/src/gcc-5-20150519/configure --target=arm-none-eabi --prefix=/usr --with-sysroot=/usr/arm-none-eabi --with-native-system-header-dir=/include --libexecdir=/usr/lib --enable-languages=c,c++ --enable-plugins --disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls --disable-shared --disable-threads --disable-tls --with-gnu-as --with-gnu-ld --with-system-zlib --with-newlib --with-headers=/usr/arm-none-eabi/include --with-python-dir=share/gcc-arm-none-eabi --with-gmp --with-mpfr --with-mpc --with-isl --with-libelf --enable-gnu-indirect-function --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-pkgversion='Arch Repository' --with-bugurl=https://bugs.archlinux.org/ --with-multilib-list=armv6-m,armv7-m,armv7e-m,armv7-r
Thread model: single
gcc version 5.1.0 (Arch Repository)
-
arm-none-eabi-gdb -v
GNU gdb (GDB) 7.9.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=arm-none-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
I manually built the newlib package using the Arch Build System (ABS), and I think the debugging symbols should be included (the function names show up in GDB; is that enough?).
Any ideas on what I can try next?
Edit
I have revised the output so the fault status registers are actually readable.
Here's the decoded output:
fault_stat: 0x8200 (BFARV, PRECISE)
hfault_stat: 0x40000000 (FORCED)
busfault_stat: 0xfffffff8 (valid FAULTADDR)
Interpreting that, we've had a precise bus fault, with the faulting address stored in FAULTADDR. I'm not sure what caused the CPU to try to access 0xfffffff8, but I'm pretty sure that is what caused the problem.

call to ffi_call fails even though arguments look right

Consider this gist. I have checked and double checked this piece of code for defects and can't find any apparent flaws in the code. It also compiles fine when I use g++ -g -std=c++11 -Wall dynlibtest.cc -ldl -lffi -lstdc++ -odynlibtest && ./dynlibtest (the -ldl and -lffi switches are for the dynamic loading and FFI libraries, respectively).
However, when the highlighted line (l.96) executes it segfaults.
I have also tried pulling it through gdb, and after installing the libc debugging symbols it spits this message out when the ./dynlibtest bin segfaults:
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
157 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
Can anyone help me understand why this segfaults? Is it a bug of some kind, or am I using one of the APIs wrong?
For reference: the first part of the code calls gettimeofday directly to show that the code can indeed find it, and that even the args are correct when it is called directly.
EDIT: I have added the gdb output when the code segfaults with the output of bt also attached:
$ gdb ./dynlibtest
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./dynlibtest...done.
(gdb) break 96
Breakpoint 1 at 0x401032: file dynlibtest.cc, line 96.
(gdb) run
Starting program: /home/j/dev/elisp-ffi/dynlibtest
Test started...
Got main program handle
pre-alloc: tv.tv_sec = 140737340592552
Sleeping for 1 second
post-alloc: tv.tv_sec = 1432058412
Sleeping for 1 second
Fn ptr call: tv.tv_sec = 1432058413
FFI CIF preparation is OK
Sleeping for 1 second
Breakpoint 1, main () at dynlibtest.cc:96
96 ffi_call(&cif, FFI_FN(gettimeofday), &result, args);
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
157 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
#1 0x00007ffff79d34c2 in memcpy (__len=8, __src=0x0, __dest=0x7fffffffda48) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#2 ffi_call (cif=0x7fffffffdca0, fn=0x400ab0 , rvalue=0x7fffffffdc40, avalue=0x7fffffffdc00) at ../src/x86/ffi64.c:504
#3 0x000000000040104e in main () at dynlibtest.cc:96
(gdb)

MPI debugging with GDB - No symbol "i" in current context

I need to debug my MPI application written in C. I wanted to use the system with GDB attached manually to processes, as it's recommended here (paragraph 6).
The problem is, when I try to print the value of the variable "i", I get this error:
No symbol "i" in current context.
The same problem occurs with set var i=5. When I try to run info locals, it simply states "no locals".
System: Ubuntu 14.04
MPICC: cc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
GDB: GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
I compile my code with the command
mpicc -o hello hello.c
and execute it with
mpiexec -n 2 ./hello
I've searched for this problem, and the usual solution is to avoid optimization (-O) options in GCC, but that doesn't help me because I don't use any of them here and I'm compiling with mpicc. I've already tried declaring the variable i as volatile and running mpicc with -g and -O0, but nothing helps.
GDB output
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 3778
Reading symbols from /home/martin/Dokumenty/Programovani/mpi_trenink/hello...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libmpich.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/x86_64-linux-gnu/libmpich.so.10
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /usr/lib/x86_64-linux-gnu/libmpl.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/x86_64-linux-gnu/libmpl.so.1
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/librt-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /usr/lib/libcr.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libcr.so.0
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libpthread-2.19.so...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libgcc_s.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
0x00007f493e53c9a0 in __nanosleep_nocancel ()
at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
My code
#include <stdio.h>
#include <mpi.h>
#include <unistd.h> // sleep()
int main(){
MPI_Init(NULL, NULL);
/* DEBUGGING STOP */
int i = 0;
while(i == 0){
sleep(30);
}
int world_size;
MPI_Comm_size( MPI_COMM_WORLD, &world_size );
int process_id; // often called 'world_rank'
MPI_Comm_rank( MPI_COMM_WORLD, &process_id );
char processor_name[ MPI_MAX_PROCESSOR_NAME ];
int name_len;
MPI_Get_processor_name( processor_name, &name_len );
printf("Hello! - sent from process %d running on processor %s.\n\
Number of processors is %d.\n\
Length of proc name is %d.\n\
***********************\n",
process_id, processor_name, world_size, name_len);
MPI_Finalize();
return 0;
}
Most likely, GDB interrupted the process while it was deep inside the implementation of the sleep(3) function. You can check that by first issuing the bt (backtrace) command:
(gdb) bt
#0 0x00000030e0caca3d in nanosleep () from /lib64/libc.so.6
#1 0x00000030e0cac8b0 in sleep () from /lib64/libc.so.6
#2 0x0000000000400795 in main (argc=1, argv=0x7fff64ae4688) at sleeper.c:9
i is not present in the frame of nanosleep:
(gdb) info locals
No symbol table info available.
Select the stack frame of the main function by issuing the frame x command (where x is the frame number, 2 in the example shown).
(gdb) f 2
#2 0x0000000000400795 in main (argc=1, argv=0x7fff64ae4688) at sleeper.c:9
9 while(i == 0) { sleep(30); }
i should be there now:
(gdb) info locals
i = 0
You might also need to change the active thread if GDB happens to attach to the wrong one. Many MPI libraries spawn additional threads, e.g. with Intel MPI:
(gdb) info threads
3 Thread 0x7f8b9fada700 (LWP 39085) 0x00000030e0cdf1b3 in poll () from /lib64/libc.so.6
2 Thread 0x7f8b9f0d9700 (LWP 39087) 0x00000030e0cdf1b3 in poll () from /lib64/libc.so.6
* 1 Thread 0x7f8ba1b51700 (LWP 39066) 0x00000030e0caca3d in nanosleep () from /lib64/libc.so.6
The thread marked with * is the one being examined. If some other thread is active, switch to the main one with the thread 1 command.
I've finally solved this. The key point is that I had to select the right stack frame with the up command before trying to print the variable i or change its value.
Step-by-step solution
Compile the code with mpicc -o hello hello.c -g -O0.
Launch the program with mpiexec -n 2 ./hello.
Find the process ID (PID).
I use the command ps -e | grep hello.
Another option is simply to use pstree.
Finally, you can have the program print its own PID with the POSIX function getpid().
Next, open a new terminal and launch GDB with the command gdb --pid debugged_process_id.
Now, in the debugger, type bt.
The output will be similar to this one:
#0 0x00007f63667e09a0 in __nanosleep_nocancel ()
at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f63667e0854 in __sleep (seconds=0)
at ../sysdeps/unix/sysv/linux/sleep.c:137
#2 0x00000000004009ec in main () at hello.c:20
As we can see, frame #2 points to the code in hello.c, so we can look at it in more detail. Type up 2.
The output will be similar to this one:
#2 0x00000000004009ec in main () at hello.c:20
warning: Source file is more recent than executable.
20 sleep(30);
And finally, now we can print all the local variables in this block out. Type info local.
The output:
i = 0
world_size = 0
process_id = 0
processor_name = "\000\000\000\000\000\000\000\000 5\026gc\177\000\000\200\306Η\377\177\000\000p\306Η\377\177\000\000.N=\366\000\000\000\000\272\005#\000\000\000\000\000\377\377\377\377\000\000\000\000%0`\236\060\000\000\000\250\361rfc\177\000\000x\n\026gc\177\000\000\320\067`\236\060\000\000\000\377\377\377\177\376\377\377\377\001\000\000\000\000\000\000\000\335\n#\000\000\000\000\000\377\377\377\377\377\377\377\377\000\000\000\000\000\000\000"
name_len = 1718986550
Now we can break out of the stopper loop with set var i=1 and continue debugging.

SDL - Segmentation Fault (core dumped), any thoughts?

I've had this problem since I installed SDL. First I tried installing it from the tar.gz file, which failed at compile time (the terminal couldn't find the directory for the SDL lib), so I installed the Synaptic package manager and successfully downloaded the "libsdl1.2-dev" package.
I am following Lazy Foo's SDL tutorial. Whenever I try to build and run a simple program that creates a screen and blits an image, I get the following message in the terminal:
(gcc -Wall -o teste teste.c -lSDL -lSDL_image)
"Segmentation fault (core dumped)"
Here it is my code in C:
#include <stdio.h>
#include <stdlib.h>
#include "SDL/SDL.h"
int main( int argc, char* args[] )
{
SDL_Surface* hello = NULL;
SDL_Surface* screen = NULL;
SDL_Init(SDL_INIT_EVERYTHING);
screen = SDL_SetVideoMode(640, 480, 32, SDL_SWSURFACE);
if (screen == NULL) {
printf("SDL_SetVideoMode failed: %s\n", SDL_GetError());
exit(1); /* Unrecoverable error */
}
hello = SDL_LoadBMP("hello.bmp");
SDL_BlitSurface(hello, NULL, screen, NULL);
SDL_Flip(screen);
SDL_Delay(2000);
SDL_FreeSurface(hello);
SDL_Quit();
return 0;
}
I've already made sure that hello.bmp is in the same dir of my teste.c file.
Here's a log using gdb to backtrace:
LOG
GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from teste...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/lazzo/Documentos/Treino/teste
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff707c700 (LWP 5605)]
Program received signal SIGSEGV, Segmentation fault.
SDL_Flip (screen=0x0) at ./src/video/SDL_video.c:1109
1109 ./src/video/SDL_video.c: No such file or directory.
(gdb) bt
#0 SDL_Flip (screen=0x0) at ./src/video/SDL_video.c:1109
#1 0x00000000004009a2 in main ()
(gdb) c
Continuing.
[Thread 0x7ffff7fd8740 (LWP 5601) exited]
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb) q
lazzo@J-Ubuntu:~/Documentos/Treino$ exit
exit
END OF LOG
Any help you could give me would be really appreciated. I apologize for my bad English; I am from Brazil and still learning.
UPDATE
After adding Klas's suggestion to my code, I got this from the terminal:
"SDL_SetVideoMode failed: No available video device"
How is that even possible? (My video card is a Radeon HD 4850, by the way.)
Problem round 1 (compilation):
The target filename must follow immediately after the -o option, so you should change the order of the arguments:
gcc -Wall -o teste teste.c -lSDL -lSDL_image
This may not solve all your build problems, but it is a good start.
Problem round 2 (adding error handling):
The call to SDL_SetVideoMode returned null. If you get a return value of null you should call SDL_GetError immediately after to check what the error is:
screen = SDL_SetVideoMode(640, 480, 32, SDL_SWSURFACE);
if (screen == NULL) {
printf("SDL_SetVideoMode failed: %s\n", SDL_GetError());
exit(1); /* Unrecoverable error */
}
You should add similar handling for the other SDL calls.
The only thing that worked in my case was to wipe Ubuntu and try another distro. I am now using Linux Mint and, despite the fact that it is based on Ubuntu, everything works as expected. I'm sharing my solution in case somebody else runs into the same problem someday.

Resources