I am trying to use newlib on a TI CC2538 ARM Cortex-M3 part. The objective is to use printf for debugging messages, and I've actually got that working. However, the system will segfault after a number of messages (ARM calls that a HardFault), and I have no idea why.
I used GDB to get the following stack trace:
(gdb) bt full
#0 FaultISR () at src/startup_gcc.c:307
fault_stat = 0x8200
hfault_stat = 0x40000000
mmfault_stat = 0xfffffff8
busfault_stat = 0xfffffff8
buf = "\000\000\000\000\371\377\377\377\026\000\000\000\000\000\000\000\n\000\000\000\000\000\000\000\b\f\000 \002\000\000\000\060\r\000 \320\f\000 \002\000\000\000\303w \000\000\000\000\000w?\032\000\360\v\000 \360\v\000 \002\000\000\000\027\000\000\000\003\000\000\000\277\022\035\000\b\f\000 \b\f\000 \r\000 \027\000\000\000\f\000\000\000\371\377\377\377T\025\000 a\217 \000\r\000\000\000\001\000\000\000\210\r\000 s\263\""
#1 <signal handler called>
No symbol table info available.
#2 0x002076dc in strlen ()
No symbol table info available.
#3 0x0020494a in _svfprintf_r ()
No symbol table info available.
#4 0x002045aa in _vsnprintf_r ()
No symbol table info available.
#5 0x002045fe in vsnprintf ()
No symbol table info available.
#6 0x00200fa8 in ws_debug_vfprintf (f=STDERR, fmt=0x208f24 "Received a packet of length %u\n", args=...) at src/os/ws_debug.c:220
len = 0x27
buf = 0x20001538 "DBG src/net/mac/packet_scheduler.c:123 "
#7 0x00200f4e in ws_debug_fprintf (f=STDERR, fmt=0x208f24 "Received a packet of length %u\n") at src/os/ws_debug.c:201
args = {__ap = 0x20000e50 <pui32Stack+440>}
#8 0x00201dea in packet_scheduler_timer () at src/net/mac/packet_scheduler.c:123
phy_len = 0xd
pkt = 0x8ef6
fcf = 0x0
buf = 0x20001738 "\200\220"
#9 0x00200c32 in ws_timer_update () at src/os/ws_timer.c:156
ptr = 0x20001728
new = 0x20001728
#10 0x002005c2 in main () at src/main.c:143
No locals.
As you can see, the newlib frames (#2 - #5) don't have any information with them, making it hard to debug. I assumed this was because newlib was stripped of debugging symbols, but I recompiled newlib and get the same result.
I am running Arch linux using the toolchain available in the official repos.
arm-none-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-none-eabi/5.1.0/lto-wrapper
Target: arm-none-eabi
Configured with: /build/arm-none-eabi-gcc/src/gcc-5-20150519/configure --target=arm-none-eabi --prefix=/usr --with-sysroot=/usr/arm-none-eabi --with-native-system-header-dir=/include --libexecdir=/usr/lib --enable-languages=c,c++ --enable-plugins --disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls --disable-shared --disable-threads --disable-tls --with-gnu-as --with-gnu-ld --with-system-zlib --with-newlib --with-headers=/usr/arm-none-eabi/include --with-python-dir=share/gcc-arm-none-eabi --with-gmp --with-mpfr --with-mpc --with-isl --with-libelf --enable-gnu-indirect-function --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-pkgversion='Arch Repository' --with-bugurl=https://bugs.archlinux.org/ --with-multilib-list=armv6-m,armv7-m,armv7e-m,armv7-r
Thread model: single
gcc version 5.1.0 (Arch Repository)
-
arm-none-eabi-gdb -v
GNU gdb (GDB) 7.9.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=arm-none-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
I manually built the newlib package using the Arch Build System (ABS) and I think that the debugging symbols should be included (the function names show up in GDB. Is that enough?)
Any ideas on what I can try next?
Edit
I have revised the output so the fault status registers are actually readable.
Here's the decoded output:
fault_stat: 0x8200 (BFARV, PRECISE)
hfault_stat: 0x40000000 (FORCED)
busfault_stat: 0xfffffff8 (valid FAULTADDR)
Interpreting that, we've had a precise bus fault, with the address that triggered it stored in FAULTADDR. I'm not sure what caused the CPU to try to access 0xfffffff8, but I'm pretty sure that is what caused the problem.
I came to know about the __libc_start_main function. I had been thinking that __libc_start_main calls the main function, but when I checked the return address of the main function of my own program, it is an address inside __libc_start_call_main. What's the difference between __libc_start_main and __libc_start_call_main?
source code of my program, test.c
#include <stdio.h>
int main(void)
{
puts("Sunghyeon Lee");
}
gdb output:
──(kali㉿kali)-[~]
└─$ gdb test
GNU gdb (Debian 12.1-3) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from test...
(No debugging symbols found in test)
(gdb) b *main
Breakpoint 1 at 0x1139
(gdb) r
Starting program: /home/kali/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000555555555139 in main ()
(gdb) x/a $rsp
0x7fffffffdec8: 0x7ffff7dd920a <__libc_start_call_main+122>
Thank you for your help!
I have searched for the difference between __libc_start_main and __libc_start_call_main, but I have never found an explanation of it.
I have never found the explanation about it.
Take a look at the commit which created __libc_start_call_main.
Effectively a chunk of __libc_start_main was split out into a separate routine.
I am new to GDB and need to examine the contents of a function with it. The program to be debugged is perf, the function is __fprintf_chk(), and I tried something like:
perf stat -I 1000 -e branch-misses
time counts unit events
1.000222090 1746 branch-misses
2.000486444 1986 branch-misses
3.000712797 1783 branch-misses
sudo ./gdb --pid $(pidof perf)
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-none-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 38977
Reading symbols from /usr/bin/perf...
warning: Unable to determine the number of hardware watchpoints available.
warning: Unable to determine the number of hardware breakpoints available.
Reading symbols from /lib/aarch64-linux-gnu/libpthread.so.0...
(No debugging symbols found in /lib/aarch64-linux-gnu/libpthread.so.0)
Error while reading shared library symbols for /lib/aarch64-linux-gnu/libpthread.so.0:
Cannot find user-level thread for LWP 38977: generic error
Reading symbols from /lib/aarch64-linux-gnu/librt.so.1...
(No debugging symbols found in /lib/aarch64-linux-gnu/librt.so.1)
Reading symbols from /lib/aarch64-linux-gnu/libm.so.6...
(No debugging symbols found in /lib/aarch64-linux-gnu/libm.so.6)
Reading symbols from /lib/aarch64-linux-gnu/libdl.so.2...
(No debugging symbols found in /lib/aarch64-linux-gnu/libdl.so.2)
Reading symbols from /lib/libopencsd_c_api.so.1...
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) b __fprintf_chk
Breakpoint 1 at 0xaaaae55b2614
(gdb) c
Continuing.
Breakpoint 1, 0x0000aaaae55b2614 in __fprintf_chk@plt ()
(gdb) info args
No symbol table info available.
Now I need to know: when the breakpoint is put at __fprintf_chk, is it on the __fprintf_chk that is part of the perf binary, or on one from another shared object (a .so file; I see lines such as "Reading symbols from /lib/aarch64-linux-gnu/libpthread.so.0...")?
Also, info args gives nothing. Does that mean perf needs to be compiled with debugging info (and if so, how)?
I have a problem where one or more threads lock each other, and I don't know what is going on there. The debugger cannot break (thread 1), breaks but cannot get a backtrace (threads 2 and 5), or shows the backtrace (thread 3).
Native gdb shows the same.
I learned that this is the case because libc implements this in assembler and gdb cannot walk the stack correctly. Sometimes (I don't know when) I can do a few steps in the assembly, and then I see the backtrace.
I just tried an x64 program and it works.
See my sample code:
#include <time.h>
int main()
{
while(1)
{
struct timespec ts;
ts.tv_sec = 1;
ts.tv_nsec = 0;
clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, 0);
}
return 1;
}
gdb output 32 bit:
vagrant@PC41388-spvm-4650:/tmp$ gdb main32
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main32...(no debugging symbols found)...done.
(gdb) r
Starting program: /tmp/main32
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x55579cd9 in ?? ()
(gdb) bt
#0 0x55579cd9 in ?? ()
#1 0x555b0af3 in __libc_start_main (main=0x80484dd , argc=1,
argv=0xffffcee4, init=0x8048520 <__libc_csu_init>,
fini=0x8048590 <__libc_csu_fini>, rtld_fini=0x55564160 <_dl_fini>,
stack_end=0xffffcedc) at libc-start.c:287
#2 0x08048401 in _start ()
(gdb)
gdb output 64 bit:
vagrant@PC41388-spvm-4650:/tmp$ gdb main64
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main64...(no debugging symbols found)...done.
(gdb) r
Starting program: /tmp/main64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
b ^C
Program received signal SIGINT, Interrupt.
0x00002aaaaafe092a in __clock_nanosleep (clock_id=1, flags=0,
req=0x7fffffffdc10, rem=0x2aaaaafe092a <__clock_nanosleep+58>)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:41
41 ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
(gdb) bt
#0 0x00002aaaaafe092a in __clock_nanosleep (clock_id=1, flags=0,
req=0x7fffffffdc10, rem=0x2aaaaafe092a <__clock_nanosleep+58>)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:41
#1 0x0000000000400630 in main ()
(gdb)
set architecture i386 does not help either.
More news: info proc mappings shows that the 32-bit app is stopped in [vvar], whereas the 64-bit app is in libc. This would explain why gdb can't produce the backtrace.
So my question is: is there a different version of libc where this works? I am using Ubuntu 14.04.
I updated to a newer gdb version (currently the latest, 7.12.1). This fixed the problem.
Note that gdb:i386 did not work either on 64-bit Lubuntu, whereas it worked fine under 32-bit Lubuntu. Also note that both main32 and libc are binary identical on the 64-bit and 32-bit systems.
Consider this gist. I have checked and double checked this piece of code for defects and can't find any apparent flaws in the code. It also compiles fine when I use g++ -g -std=c++11 -Wall dynlibtest.cc -ldl -lffi -lstdc++ -odynlibtest && ./dynlibtest (the -ldl and -lffi switches are for the dynamic loading and FFI libraries, respectively).
However, when the highlighted line (l.96) executes it segfaults.
I have also tried pulling it through gdb, and after installing the libc debugging symbols it spits this message out when the ./dynlibtest bin segfaults:
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
157 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
Who can help me understand why this segfaults? Is it a bug of some kind, or am I using one of the APIs wrong?
For reference: the first part of the code calls gettimeofday directly to show that the code can indeed find it, and that even the args are correct when it is called directly.
EDIT: I have added the gdb output when the code segfaults with the output of bt also attached:
$ gdb ./dynlibtest
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./dynlibtest...done.
(gdb) break 96
Breakpoint 1 at 0x401032: file dynlibtest.cc, line 96.
(gdb) run
Starting program: /home/j/dev/elisp-ffi/dynlibtest
Test started...
Got main program handle
pre-alloc: tv.tv_sec = 140737340592552
Sleeping for 1 second
post-alloc: tv.tv_sec = 1432058412
Sleeping for 1 second
Fn ptr call: tv.tv_sec = 1432058413
FFI CIF preparation is OK
Sleeping for 1 second
Breakpoint 1, main () at dynlibtest.cc:96
96 ffi_call(&cif, FFI_FN(gettimeofday), &result, args);
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
157 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
#1 0x00007ffff79d34c2 in memcpy (__len=8, __src=0x0, __dest=0x7fffffffda48) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#2 ffi_call (cif=0x7fffffffdca0, fn=0x400ab0 , rvalue=0x7fffffffdc40, avalue=0x7fffffffdc00) at ../src/x86/ffi64.c:504
#3 0x000000000040104e in main () at dynlibtest.cc:96
(gdb)
I have the following C application:
#include <stdio.h>
void smash()
{
int i;
char buffer[16];
for(i = 0; i < 17; i++) // <-- exceeds the limit of the buffer
{
buffer[i] = i;
}
}
int main()
{
printf("Starting\n");
smash();
return 0;
}
I cross-compiled using the following version of gcc:
armv5l-linux-gnueabi-gcc -v
Using built-in specs.
Target: armv5l-linux-gnueabi
Configured with: /home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/gcc-4.4.1/gcc-4.4.1/configure --target=armv5l-linux-gnueabi --host=i486-linux-gnu --build=i486-linux-gnu --prefix=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/toolchain --with-sysroot=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/toolchain --with-headers=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/toolchain/include --enable-languages=c,c++ --with-gmp=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/gmp-5.0.0/gmp-host-install --with-mpfr=/home/tarjeif/svn/builder/build_armv5l-linux-gnueabi/mpfr-2.4.2/mpfr-host-install --disable-nls --disable-libgcj --disable-libmudflap --disable-libssp --disable-libgomp --enable-checking=release --with-system-zlib --with-arch=armv5t --with-gnu-as --with-gnu-ld --enable-shared --enable-symvers=gnu --enable-__cxa_atexit --disable-nls --without-fp --enable-threads
Thread model: posix
gcc version 4.4.1 (GCC)
Invoked like this:
armv5l-linux-gnueabi-gcc -ggdb3 -fstack-protector-all -O0 test.c
When run on target, it outputs:
Starting
*** stack smashing detected ***: ./a.out terminated
Aborted (core dumped)
I load the resulting core dump in gdb yielding the following backtrace:
GNU gdb (GDB) 7.0.1
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i486-linux-gnu --target=armv5l-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/andersn/workspace/stacktest/a.out...done.
Reading symbols from /home/andersn/workspace/stacktest/linux/toolchain/lib/libc.so.6...done.
Loaded symbols for /home/andersn/workspace/stacktest/linux/toolchain/lib/libc.so.6
Reading symbols from /home/andersn/workspace/stacktest/linux/toolchain/lib/ld-linux.so.3...done.
Loaded symbols for /home/andersn/workspace/stacktest/linux/toolchain/lib/ld-linux.so.3
Reading symbols from /home/andersn/workspace/stacktest/linux/toolchain/lib/libgcc_s.so.1...done.
Loaded symbols for /home/andersn/workspace/stacktest/linux/toolchain/lib/libgcc_s.so.1
Core was generated by `./a.out'.
Program terminated with signal 6, Aborted.
#0 0x40052d4c in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
67 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0 0x40052d4c in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
#1 0x40054244 in *__GI_abort () at abort.c:92
#2 0x40054244 in *__GI_abort () at abort.c:92
#3 0x40054244 in *__GI_abort () at abort.c:92
#4 0x40054244 in *__GI_abort () at abort.c:92
#5 0x40054244 in *__GI_abort () at abort.c:92
#6 0x40054244 in *__GI_abort () at abort.c:92
... and so on ...
Now, the question:
I'm totally unable to find the function causing the stack smashing from GDB, even though the smash() function doesn't overwrite any structural data of the stack, only the stack protector itself. What should I do?
The problem is that the version of GCC which compiled your target libc.so.6 is buggy and did not emit correct unwind descriptors for __GI_raise. With incorrect unwind descriptors, GDB gets into a loop while unwinding the stack.
You can examine the unwind descriptors with
readelf -wf /home/andersn/workspace/stacktest/linux/toolchain/lib/libc.so.6
I expect you'll get the exact same result in GDB from any program calling abort, e.g.
#include <stdlib.h>
void foo() { abort(); }
int main() { foo(); return 0; }
Unfortunately, there isn't much you can do, other than trying to build a newer version of GCC and then rebuilding the whole "world" with it.
It's not the case that GDB can always work out what happened to a smashed stack even with -fstack-protector-all (and even with -Wstack-protector to warn about functions with frames that weren't protected). Example.
In these cases the stack protector has done its job (killed a misbehaving app) but hasn't done the debugger any favors. (The classic example is a stack smash where a write has occurred with a large enough stride that it jumps the canary.) In these cases it may become necessary to binary search through the code via breakpoints to narrow down which region of the code is causing the smash, then single step through the smash to see how it happened.
Have you tried resolving this complaint: "../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory." to see if actually being able to resolve symbols would help?