Is it possible to get a backtrace on application exit?

I have a multithreaded Linux (ARM) application. How can I get backtraces for all threads when the app exits (not when it crashes)?
Note: I can't use gdb's catch syscall because that feature is not supported on this architecture.
I tried setting gdb breakpoints on exit and _exit, but without success.

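Returning from main() behaves as if exit() had been called, but the runtime is not guaranteed to go through the exit() or _exit() symbols that gdb can break on. How the process actually terminates is an internal implementation detail that isn't covered by the C standard.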
You can add an exit handler function using atexit() and then set a gdb breakpoint in that function.
#include <stdlib.h>   /* atexit */
#include <string.h>   /* strlen */
#include <unistd.h>   /* write, STDERR_FILENO */
.
.
.
static void myExitHandler( void )
{
    write( STDERR_FILENO, "in exit handler\n",
           strlen( "in exit handler\n" ) );
}
And in your main():
atexit( myExitHandler );
Then you should be able to set a breakpoint in myExitHandler() and it should be triggered when your process exits.
If you need to get backtraces for all threads programmatically without using gdb, if your program keeps track of its threads (it sure should...) you can pass the thread ids to your exit handler and use something like libunwind to get the backtraces. See Getting a backtrace of other thread for some ways to do that.
But as noted in the comments, if your threads are actually working, simply exiting the process entirely can cause problems such as data corruption.
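If gdb is not available at all, a rough sketch of the programmatic route is to print a backtrace from the exit handler itself. This assumes glibc's backtrace() and backtrace_symbols_fd() from <execinfo.h>; the names dump_backtrace_on_exit and install_exit_backtrace are made up for this example. It only covers the thread that is running the exit handlers, not the other threads, and on ARM you may need to build with -funwind-tables (and link with -rdynamic for symbol names) to get useful frames.

```c
#include <execinfo.h>  /* backtrace, backtrace_symbols_fd (glibc extension) */
#include <stdlib.h>    /* atexit */
#include <unistd.h>    /* STDERR_FILENO */

#define MAX_FRAMES 64

/* Hypothetical handler: prints the call stack of the thread that is
   running the exit handlers. Other threads are NOT covered. */
static void dump_backtrace_on_exit(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* Writes one line per frame straight to the fd, without malloc. */
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
}

/* Hypothetical helper: returns 0 if the handler was registered. */
static int install_exit_backtrace(void)
{
    return atexit(dump_backtrace_on_exit);
}
```

Call install_exit_backtrace() early in main(); the frames are written to stderr when exit() runs or main() returns.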

Related

How does a thread in NPTL exit?

I'm curious how a single NPTL thread exits, from implementation perspective.
What I understand about glibc-2.30's implementation are:
NPTL thread is built on top of light weight process on Linux, with additional information stored in pthread object on user stack, to keep track of NPTL specific information such as join/detach status and returned object's pointer.
when an NPTL thread finishes, it is gone for good; only the user stack (and hence the pthread object) is left to be collected (joined by another thread), unless the thread is detached, in which case that space is freed directly.
_exit() syscall kills all threads in a thread group.
the user function that pthread_create() takes in is actually wrapped into another function start_thread(), which does some preparation before running the user function, and some cleaning up afterwards.
Questions are:
At the end of the wrapper function start_thread(), there are the following comment and code:
/* We cannot call '_exit' here. '_exit' will terminate the process.
The 'exit' implementation in the kernel will signal when the
process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
flag. The 'tid' field in the TCB will be set to zero.
The exit code is zero since in case all threads exit by calling
'pthread_exit' the exit status must be 0 (zero). */
__exit_thread ();
but __exit_thread() seems to do syscall _exit() anyways:
static inline void __attribute__ ((noreturn, always_inline, unused))
__exit_thread (void)
{
/* some comments here */
while (1)
{
INTERNAL_SYSCALL_DECL (err);
INTERNAL_SYSCALL (exit, err, 1, 0);
}
}
so I'm confused here, since it shouldn't really do syscall _exit() because it will terminate all threads.
pthread_exit() should terminate a single thread, so it should do something similar to what the wrapper start_thread() does in the end, however it calls __do_cancel(), and TBH I am lost in tracing down that function. It does not seem to be related to the above __exit_thread(), nor does it call _exit().
I'm confused here, since it shouldn't really do syscall _exit()
The confusion here stems from mixing up the exit system call with the _exit libc routine (there is no _exit system call on Linux).
The former terminates the current Linux thread (as intended).
The latter (confusingly) doesn't execute the exit system call. Rather, it executes the exit_group system call, which terminates all threads.
pthread_exit() should terminate a single thread
It does, indirectly. It unwinds current stack (similar to siglongjmp), performing control transfer to the point where cleanup_jmp_buf was set up. And that was in start_thread.
After the control transfer, start_thread cleans up resources, and calls __exit_thread to actually terminate the thread.
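To see the difference between the two system calls, here is a small sketch, assuming Linux and the syscall() wrapper. The function names are invented for the example. A forked child starts a thread that issues the raw system call with status 7, while the child's main thread sleeps and then calls _exit(3). With SYS_exit_group the whole child should die with status 7; with SYS_exit only the thread goes away and the child should finish with status 3. Calling the raw exit system call from a pthread skips the cleanup that start_thread normally does, so this is for demonstration only.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>  /* SYS_exit, SYS_exit_group */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body: issue the raw system call whose number is passed in. */
static void *call_raw_syscall(void *arg)
{
    long number = *(long *)arg;
    syscall(number, 7);
    return NULL;  /* not reached for SYS_exit or SYS_exit_group */
}

/* Hypothetical helper: returns the exit status of a child process in
   which a secondary thread issued the given system call, or -1 on error. */
static int status_after_thread_syscall(long number)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t t;
        if (pthread_create(&t, NULL, call_raw_syscall, &number) != 0)
            _exit(99);
        sleep(1);   /* give the thread time to make its system call */
        _exit(3);   /* only reached if the process is still alive */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Comparing status_after_thread_syscall(SYS_exit_group) with status_after_thread_syscall(SYS_exit) shows which call takes the whole process down.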

How to use exit() safely from any thread

According to the exit(3) man page, the exit function is not thread-safe (MT-Unsafe race:exit). This is because the function tries to clean up resources (flush data to disk, close file descriptors, etc.) by calling callbacks registered with on_exit and atexit. And I want my program to do that! (One of my threads keeps an fd open for the program's whole lifespan, so _exit is not an option for me: I want all the data to be written to the output file.)
My question is the following: if I'm careful and don't share any sensitive data (like an fd) between my threads, is it "acceptable" to call exit in a multi-threaded program? Note that I only call exit if an unrecoverable error occurs. Yet I can't afford a segfault while the program tries to exit. The thing is, an unrecoverable error can happen in any thread...
I was thinking about using setjmp/longjmp to kill my threads "nicely", but this would be quite complex to do and would require many changes everywhere in my code.
Any suggestions would be greatly appreciated. Thanks! :)
EDIT: Thanks to @Ctx's enlightenment, I came up with the following idea:
#define EXIT(status) do { pthread_mutex_lock(&exit_mutex); exit(status); } while(0)
Of course the exit_mutex must be global (extern).
The manpage states that
The exit() function uses a global variable that is not protected, so it is not thread-safe.
so being careful with your own data does not help by itself.
But the problem documented is a race condition: MT-Unsafe race:exit
So if you make sure that exit() can never be called concurrently from two threads, you should be on the safe side! You can ensure this with a mutex, for example.
A modern cross-platform C++ solution could be:
#include <cstdlib>
#include <mutex>
std::mutex exit_mutex;
[[noreturn]] void exit_thread_safe(const int status)
{
exit_mutex.lock();
exit(status);
}
The mutex ensures that exit is never called concurrently by two or more threads.
However, I still question the reason behind even caring about this. How likely is a multi-threaded call to exit() and which bad things can even realistically happen?
EDIT:
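Using std::quick_exit avoids the clang diagnostic warning. Note that std::quick_exit runs handlers registered with std::at_quick_exit, not those registered with std::atexit.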
It can't be done: even if no data is shared between threads at first, data must be shared between a thread and its cleanup function. The function should run only after the thread has stopped or reached a safe point.

Prevent glibc from showing extra abort information [duplicate]

Some C++ libraries call abort() function in the case of error (for example, SDL). No helpful debug information is provided in this case. It is not possible to catch abort call and to write some diagnostics log output. I would like to override this behaviour globally without rewriting/rebuilding these libraries. I would like to throw exception and handle it. Is it possible?
Note that abort raises the SIGABRT signal, as if it called raise(SIGABRT). You can install a signal handler that gets called in this situation, like so:
#include <signal.h>
extern "C" void my_function_to_handle_aborts(int signal_number)
{
/*Your code goes here. You can output debugging info.
If you return from this function, and it was called
because abort() was called, your program will exit or crash anyway
(with a dialog box on Windows).
*/
}
/*Do this early in your program's initialization */
signal(SIGABRT, &my_function_to_handle_aborts);
If you can't prevent the abort calls (say, they're due to bugs that creep in despite your best intentions), this might allow you to collect some more debugging information. This is portable ANSI C, so it works on Unix and Windows, and other platforms too, though what you do in the abort handler will often not be portable. Note that this handler is also called when an assert fails, or even by other runtime functions - say, if malloc detects heap corruption. So your program might be in a crazy state during that handler. You shouldn't allocate memory - use static buffers if possible. Just do the bare minimum to collect the information you need, get an error message to the user, and quit.
Certain platforms may allow their abort functions to be customized further. For example, on Windows, Visual C++ has a function _set_abort_behavior that lets you choose whether or not a message is displayed to the user, and whether crash dumps are collected.
According to the man page on Linux, abort() generates a SIGABRT to the process that can be caught by a signal handler. EDIT: Ben's confirmed this is possible on Windows too - see his comment below.
You could try writing your own and get the linker to call yours in place of std::abort. I'm not sure if it is possible however.

Why does pthread_exit(0) hang the program?

Running the following C code causes the program to hang, and does not respond to signals (including CTRL-C).
#include <pthread.h>
int main(void)
{
    pthread_exit(0);
    return 0;
}
Any idea why?
The behaviour is normal when other threads have been created and are running, but I would like to know if I always have to check that before using pthread_exit(0).
EDIT:
This is the complete code that hangs. However, I was building with glib (-lglib-2.0). Using simply cc -o foo foo.c works as expected.
Your entire use case is described in the notes of the pthread_exit man page.
In your case, as you correctly edited your OP, glib started another thread. You exited the main thread and the other thread kept running. You labeled this as a hang.
In general, if you want to exit the application in full, just use exit or return from main().
Only when you need additional magic (rarely) like detached threads, use pthread_exit() on the main thread.
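The behaviour described above can be reproduced with a small sketch (POSIX, names invented for the example). In a forked child, the main thread starts a worker and then calls pthread_exit(). The process does not end at that point; it ends only when the worker calls _exit(42) a second later, and that is the status the parent sees. If the worker never finished, the process would appear to "hang" in exactly the way the question describes.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Worker that outlives the main thread and decides the exit status. */
static void *worker(void *arg)
{
    (void)arg;
    sleep(1);
    _exit(42);    /* terminates the whole process */
    return NULL;  /* not reached */
}

/* Hypothetical helper: returns the exit status of a child whose main
   thread called pthread_exit() while a worker was still running. */
static int status_when_main_calls_pthread_exit(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0)
            _exit(99);
        pthread_exit(NULL);  /* ends only this thread; process lives on */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```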

C goto different function

I'm working with an embedded system where the exit() call doesn't seem to exist.
I have a function that calls malloc and rather than let the program crash when it fails I'd rather exit a bit more gracefully.
My initial idea was to use goto however the labels seem to have a very limited scope (I'm not sure, I've never used them before "NEVER USE GOTO!!1!!").
I was wondering if it is possible to goto a section of another function or if there are any other creative ways of exiting a C program from an arbitrary function.
void main() {
//stuff
a();
exit:
return;
}
void a() {
//stuff
//if malloc failed
goto exit;
}
Thanks for any help.
Options:
since your system is non-standard (or perhaps is standard but non-hosted), check its documentation for how to exit.
try abort() (warning: this will not call atexit handlers).
check whether your system allows you to send a signal to yourself that will kill yourself.
return a value from a() indicating error, and propagate that via error returns all the way back to main.
check whether your system has setjmp/longjmp. These are difficult to use correctly but they do provide what you asked for: the ability to transfer execution from anywhere in your program (not necessarily including a signal/interrupt handler, but then you probably wouldn't be calling malloc in either of those anyway) to a specific point in your main function.
if your embedded system is such that your program is the only code that runs on it, then instead of exiting you could call some code that goes into an error state: perhaps an infinite loop, that perhaps flashes an LED or otherwise indicates that badness has happened. Maybe you can provoke a reboot.
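Why don't you use return values?
/* in a(); p is the pointer returned by malloc */
if (p == NULL)   /* malloc failed */
    return 1;
else
    return 0;
...........
/* in main() */
if (a() != 0)    /* a() reported a failure */
    return;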
goto cannot possibly jump to another function.
Normally the advice is simply not to use goto. In this case, though, what you are asking for is not possible at all.
How to deal with this? There are a few solutions.
Check return code or value of problematic functions and act accordingly.
Use setjmp/longjmp. This advice should be considered even more evil than using goto itself, but it does support jumping from one function to another.
Embedded systems rarely have any variation of exit(), as that function doesn't necessarily make any sense in the given context. Where does the controller of an elevator or a toaster exit to?
In multitasking embedded systems there could be a system call to exit or terminate a process, leaving only an idle process alive that simply runs a busy loop: while (1); or in some cases executes a privileged instruction to enter power-saving mode: while (1) { asm("halt"); }
In embedded systems one possible method to "recover" from error is to asm("trap #0"); or any equivalent of calling an interrupt vector, that implements graceful system shutdown with dumping core to flash drive or outputting an error code to UART.
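For the setjmp/longjmp option mentioned in the answers, a minimal sketch could look like this, assuming your toolchain provides <setjmp.h>. The names fatal_error_jmp and run_with_recovery, and the simulate_failure parameter, are invented for the example; a() stands in for the function from the question. Keep in mind that longjmp skips any cleanup in the functions it jumps over, so anything allocated in between is leaked unless you track it yourself.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Jump target set up by the top-level function. */
static jmp_buf fatal_error_jmp;

/* Stand-in for a() from the question. simulate_failure forces the
   "malloc failed" path so the example can be exercised. */
static void a(int simulate_failure)
{
    void *p = simulate_failure ? NULL : malloc(16);

    if (p == NULL)
        longjmp(fatal_error_jmp, 1);  /* jump back to the setjmp() call */

    /* ... use p ... */
    free(p);
}

/* Plays the role of main(): returns 0 on normal completion and 1 if
   a() bailed out through longjmp. */
static int run_with_recovery(int simulate_failure)
{
    if (setjmp(fatal_error_jmp) != 0) {
        /* We arrive here from longjmp(): shut down gracefully. */
        return 1;
    }

    a(simulate_failure);
    return 0;
}
```

Propagating an error code through return values is usually easier to reason about; this is only worth it when the call chain is deep and you cannot change every function in it.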

Resources