GDB and trouble with core dumps - c

Meet my
$ uname -a
Linux hostmachine 4.1.2-2-ARCH #1 SMP PREEMPT Wed Jul 15 08:30:32 UTC 2015 x86_64 GNU/Linux
I'm trying to learn how to use GDB for debugging C programs. I think it would be particularly excellent if I could use GDB to ferret out bugs that lead to segfaults. I have a small program that I've written as a solution to K&R's exercise 1-13, and given an input string of a certain size it will generate a segfault:
$ ~/learning_c/KR_exercises/chapter_1/1.13.x`
--I provide a string from stdin, and...--
Segmentation fault (core dumped)
According to the Arch wiki, "Systemd's default behavior is to generate core dumps for all processes in /var/lib/systemd/coredump/."
Okie doke:
$ls /var/lib/systemd/coredump/core.1\x2e13\x2ex.1000.0da6be3a2b4647c8befe14e0e73af848.1719.1438627150000000.lz4
But when I run:
$ gdb -q ~/learning_c/KR_exercises/chapter_1/1.13.x /var/lib/systemd/coredump/core.1\\x2e13\\x2ex.1000.0da6be3a2b4647c8befe14e0e73af848.1719.1438627150000000.lz4
I get:
Reading symbols from /home/dean/learning_c/KR_exercises/chapter_1/1.13.x...done.
"/var/lib/systemd/coredump/core.1\x2e13\x2ex.1000.0da6be3a2b4647c8befe14e0e73af848.1719.1438627150000000.lz4" is not a core dump: File format not recognized
Trying to generate a core dump by attaching GDB to the process as detailed here only makes my terminal emulator start capturing control characters (^D, ^C, and ^Z won't work in emulator after attaching GDB), and if a segfault is occuring after attaching GDB it isn't being reported in the shell.
Help me to understand, oh merciful and beneficent lords of Stack Overflow!
ADDENDUM:
I've solved this particular issue, thanks largely to WhozCraig, whom suggested that GDB was behaving as it should have when being force-fed an lz4 compressed corefile. If Craig would be so kind as to post a solution saying something similar, I'd be happy to give him that big 'ol check mark.
The easist solution is to start gdb via a subroutine named coredumpctl along with the crashed program's PID, a la
$coredumpctl gdb *PID HERE*
This vexes me, Arch, and I may migrate over to Gentoo because of it.

I've solved this particular issue, thanks largely to WhozCraig, whom suggested that GDB was behaving as it should have when being force-fed an LZ4 compressed corefile. If Craig would be so kind as to post a solution saying something similar, I'd be happy to give him that big 'ol check mark I'm taking all the credit, though. Bwahahaha!
The easiest solution is to start gdb via a subroutine named coredumpctl along with the crashed program's PID, a la
$coredumpctl gdb PID HERE
This vexes me, Arch, and I may migrate over to Gentoo because of it.

I have same purpose with you. Just uncompress lz4 file by lz4 command, then you can debug by gdb crashed_C_executable_file uncompressed_coredump_file

Related

GCC compiler on Jailbroken AppleTV compiling error

Ok so I think this is probably out of the ballpark for most people here (including me :P ) but here's my problem...
I am trying to get together a basic compiling toolchain for an AppleTV 3rd gen. After a very long time of digging through archives and source code I got a decent set of tools together. (csu, gdb, gcc, headfile, ldid, real-libgcc!, make, odcctools, uuid, file, rsync, autoconf, gawk, python, coreutils, inetutils, git, less, nano, gettext) and while most of them are outdated they still operate decently. All except for gcc. Great, eh? So my problem is whenever I compile ANY C code, even something as simple as
#include <stdio.h>
int main() {
printf("Hello World!\n");
return 0;
}
it will always return Killed: 9, I don't understand why it would do this but I do know that it means the program was killed with the signal 9 which terminates a process instantly no matter what. However I am very new to C in general and any help would be appreciated.
Thanks in advance,
A\\/.
P.S here is the output from uname -a
Darwin Apple-TV 14.0.0 Darwin Kernel Version 14.0.0: Fri Jan 29 18:51:13 PST 2021; root:xnu-2784.40.6~93/MarijuanARM_S5L8947X AppleTV3,2 arm J33iAP Darwin

How to read, understand, analyze, and debug a Linux kernel panic?

Consider the following Linux kernel dump stack trace; e.g., you can trigger a panic from the kernel source code by calling panic("debugging a Linux kernel panic");:
[<001360ac>] (unwind_backtrace+0x0/0xf8) from [<00147b7c>] (warn_slowpath_common+0x50/0x60)
[<00147b7c>] (warn_slowpath_common+0x50/0x60) from [<00147c40>] (warn_slowpath_null+0x1c/0x24)
[<00147c40>] (warn_slowpath_null+0x1c/0x24) from [<0014de44>] (local_bh_enable_ip+0xa0/0xac)
[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)
In unwind_backtrace+0x0/0xf8 what does +0x0/0xf8 stand for?
How can I see the C code of unwind_backtrace+0x0/0xf8?
How to interpret the panic's content?
It's just an ordinary backtrace, those functions are called in reverse order (first one called was called by the previous one and so on):
unwind_backtrace+0x0/0xf8
warn_slowpath_common+0x50/0x60
warn_slowpath_null+0x1c/0x24
ocal_bh_enable_ip+0xa0/0xac
bdi_register+0xec/0x150
The bdi_register+0xec/0x150 is the symbol + the offset/length there's more information about that in Understanding a Kernel Oops and how you can debug a kernel oops. Also there's this excellent tutorial on Debugging the Kernel
Note: as suggested below by Eugene, you may want to try addr2line first, it still needs an image with debugging symbols though, for example
addr2line -e vmlinux_with_debug_info 0019594c(+offset)
Here are two alternatives for addr2line. Assuming you have the proper target's toolchain, you can do one of the following:
Use objdump:
locate your vmlinux or the .ko file under the kernel root directory, then disassemble the object file :
objdump -dS vmlinux > /tmp/kernel.s
Open the generated assembly file, /tmp/kernel.s. with a text editor such as vim. Go to
unwind_backtrace+0x0/0xf8, i.e. search for the address of unwind_backtrace + the offset. Finally, you have located the problematic part in your source code.
Use gdb:
IMO, an even more elegant option is to use the one and only gdb. Assuming you have the suitable toolchain on your host machine:
Run gdb <path-to-vmlinux>.
Execute in gdb's prompt: list *(unwind_backtrace+0x10).
For additional information, you may checkout the following resources:
Kernel Debugging Tricks.
Debugging The Linux Kernel Using Gdb
In unwind_backtrace+0x0/0xf8 what the +0x0/0xf8 stands for?
The first number (+0x0) is the offset from the beginning of the function (unwind_backtrace in this case). The second number (0xf8) is the total length of the function. Given these two pieces of information, if you already have a hunch about where the fault occurred this might be enough to confirm your suspicion (you can tell (roughly) how far along in the function you were).
To get the exact source line of the corresponding instruction (generally better than hunches), use addr2line or the other methods in other answers.

How do I find segmentation fault from multiple files using GDB

I have been asked in an interview how can you debug segmentation fault in C program using GDB.
I told them we can compile our program with -g option so as it add debugging information to binary file and can read core dump file but then interviewer told me if he we have 3 to 4 files compiled together but one of them causing segmentation fault then how do we debug in GDB?
$ gcc -ggdb s1.c s2.c s3.c -o myprog
$ gdb myprog
(gdb) run --arg1 --arg2
GDB will run the program as normal, when the segmentation fault occurs GDB will drop back to its prompt and it will be almost the same as running GDB with a core file. The major difference is there are some things you cannot do/print with a core file that you can when the program has crashed inside of GDB. (You can use print to call some functions inside the program, for example.)
You can also attach to an already running program using gdb --pid <the programs pid>.
Either with a core file or with one of the methods above, when you have the GDB prompt after the crash, type backtrace (or bt for short) and GDB will show you the stack at the time of the crash, including the file names and line numbers of each call and the currently executing line.
If you are working under Linux the easier way to find segmentation fault is by using the tool named VALGRIND: http://valgrind.org/ .
You just need to compile your code with -g flag and then run ./valgrind .
Then you will know exactly in which function and in which line of code there is an error-uninitialized memory/memory read out of allocated space or sth.
You just run the program under gdb, and the debugger with catch the SIGSEGV and show you the line and instruction that faulted. Then you just examine the variable and/or register values to see what's wrong. Usually it's a rogue pointer value, and trying to access it with GDB will give and error, so it's easy.
And yes, recompiling everything with -g would be helpful. The interviewer probably wanted you to describe how you'd figure out which file had the fault (gdb just tells you when it catches the signal) and just recompile that one with debug info. If there's 20,000 source files that might be useful, but with 3 or 4 files, what's the point? Even with larger projects, you usually end up chasing the bad pointer through 10 functions and 5 files anyway, so again, what's the point? Debug info doesn't cost anything at run time, although it costs disk space in an installation.
compile the code in normal way by giving gcc filename
you will get a .out file, start running that and get the process id by giving ps -aef | grep filename.out
in a another window type gdb and enter,inside gdb prompt give attach processid (processid you will get from above command),give c to continue.once the execution finishes give "bt" inside gdb.you will get the place where the segmentation is occurring.
Sounds like they are looking to set it up so that you can step through the code as it is running, you can do this with the command line version or I think you can get a GUI for GDB.
one can use the following steps to debug segmentation fault using gdb
$ gdb <exec name >
$ r //run the pgm
$ where
$ f <1> <0> //to view the function n variables
$ list
$ p <variable>

Debugging Segmentation Faults on a Mac?

I'm having some problems with a program causing a segmentation fault when run on a Mac. I'm putting together an entry for the IOCCC, which means the following things are true about my program:
It's a very small C program in a single file called prog.c
I won't post it here, because it won't help (and would probably render the contest entry invalid)
It compiles cleanly under gcc using "cc -o prog prog.c -Wall"
Despite (or, more accurately, because of) the fact it contains a bunch of really bizarre uses of C, it has been constructed extremely carefully. I don't know of any part of it which is careless with memory (which is not to say that there can't possibly be bugs, just that if there are they're not likely to be obvious ones)
I'm primarily a Windows user, but several years ago I successfully compiled and ran it on several windows machines, a couple of Macs and a Linux box, with no problems. The code hasn't changed since then, but I no longer have access to those machines.
I don't have a Linux machine to re-test on, but as one final test, I tried compiling and running it on a MacBook Pro - Mac OSX 10.6.7, Xcode 4.2 (i.e. GCC 4.2.1). Again, it compiles cleanly from the command line. It seems that on a Mac typing "prog" won't make the compiled program run, but "open prog" seems to. Nothing happens for about 10 seconds (my program takes about a minute to run when it's successful), but then it just says "Segmentation fault", and ends.
Here is what I've tried, to track down the problem, using answers mostly gleaned from this useful StackOverflow thread:
On Windows, peppered the code with _ASSERTE(_CrtCheckMemory()); - The code ran dog-slow, but ran successfully. None of the asserts fired (they do when I deliberately add horrible code to ensure that _CrtCheckMemory and _ASSERTE are working as expected, but not otherwise)
On the Mac, I tried Mudflap. I tried to build the code using variations of "g++ -fmudflap -fstack-protector-all -lmudflap -Wall -o prog prog.c", which just produces the error "cc1plus: error: mf-runtime.h: No such file or directory". Googling the matter didn't bring up anything conclusive, but there does seem to be a feeling that Mudflap just doesn't work on Macs.
Also on the Mac, I tried Valgrind. I installed and built it, and built my code using "cc -o prog -g -O0 prog.c". Running Valgrind with the command "valgrind --leak-check=yes prog" produces the error "valgrind: prog: command not found". Remembering you have you "open" an exectable on a Mac I tried "valgrind --leak-check=yes open prog", which appears to run the program, and also runs Valgrind, which finds no problems. However, Valgrind is failing to find problems for me even when I run it with programs which are designed specifically to make it trigger error messages. I this also broken on Macs?
I tried running the program in Xcode, with all the Diagnostics checkboxes ticked in the Product->Edit Scheme... menu, and with a symbolic breakpoint set in malloc_error_break. The breakpoint doesn't get hit, the code stops with a callstack containing one thing ("dlopen"), and the only thing of note that shows up in the output window is the following:
Warning: Unable to restore previously selected frame.
No memory available to program now: unsafe to call malloc
I'm out of ideas. I'm trying to get Cygwin set up (it's taking hours though) to see if any of the tools will work that way, but if that fails then I'm at a loss. Surely there must be SOME tools which are capable of tracking down the causes of Segmentation faults on a Mac?
For the more modern lldb flavor
$ lldb --file /path/to/program
...
(lldb) r
Process 89510 launched
...
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x726f00)
* frame #0: 0x00007fff73856e52 libsystem_platform.dylib`_platform_strlen + 18
...
Have you compiled with -g and run it inside gdb? Once the app crashes, you can get a backtrace with bt that should show you where the crash occurs
In many cases, macOS stores the recent program crash logs under ~/Library/Logs/DiagnosticReports/ folder.
Usually I will try the following steps when doing troubleshooting on macOS:
Clean the existing crash logs under the ~/Library/Logs/DiagnosticReports/
Run the program again to reproduce the issue
Wait for a few seconds, the crash log will appear under the folder. The crash log is named like {your_program}_{crashing_date}_{id}_{your_host}.crash
Open the crash log with your text editor, search for the keyword Crashed to locate the thread causing the crash. It will show you the stack trace during crash, and in many cases, the exact line of source code causing the crash will be recorded as well.
Some links:
[1] https://mac-optimization.bestreviews.net/analyze-mac-crash-reports/

Remote Debugging multi-threaded C program with GDB

I am using Eclipse to develop and remotely debug some software for an ARM Processor. Unfortunately the software I am writing is multi-threaded and I am unable to debug it. If I place a break-point in the thread code, i get the following message:
Child terminated with signal = 5
Child terminated with signal = 0x5
GDBserver exiting
After doing quite a bit of Googling, I found a "solution" that proposed using this:
strip --strip-debug libpthread.so.0
Unfortunately, I still get the termination errors.
I would really appreciate your help in getting this figured out!
Thanks!
First, this (and subsequent) error(s):
cc1.exe: error: unrecognized command line option "-fstrip-debug"
is caused by adding strip --strip-debug etc. to the GCC command line. That is obviously bogus thing to do, and not at all what your googling suggested. (You might want to clean up your question to remove references to these errors; they have nothing to do with your problem.)
What it did (or should have) suggested, is using strip --strip-debug libpthread.so.0 instead of using strip libpthread.so.0.
This is because GDB can not work with threads if your libpthread.so.0 is fully stripped.
It can be stripped of debug symbols (which is what strip --strip-debug libpthread.so.0 does), but stripping it of all symbols (which is what strip libpthread.so.0 does) is a bad idea(TM).
Since you are (apparently) not yourself building libpthread.so.0, you shouldn't need to strip it either.
You should however verify that the provider of your toolchain did not screw it up. The following command should not report no symbols, and should in fact print a matching nptl_version (as a defined symbol):
nm /path/to/target/libpthread.so.0 | grep nptl_version
Assuming all is well so far, we can now diagnose your problem, except ... you didn't provide sufficient info ;-( In particular, when you run GDB, it should print something like using /path/to/libthread_db.so.0. You might have to hunt for GDB console in Eclipse, or you might want to run GDB from command line, so you see exactly what it prints.
It is crucial that the version of libthread_db.so.0 (for host) matches the version of lipthread.so.0 (for target). They should both be provided by your toolchain vendor.
Your problem is most likely that either GDB can't find libthread_db.so.0 at all, or that it finds the wrong one.

Resources