Implementing traceback on i386 - c

I am currently porting our code from an alpha (Tru64) to an i386 processor (Linux) in C.
Everything has gone pretty smoothly up until I looked into porting our
exception handling routine. Currently we have a parent process which
spawns lots of sub processes, and when one of these sub-processes
fatals (with an unfielded error), I have routines to catch it.
I am currently struggling to find the best method of implementing a traceback routine which can list the function addresses in the error log. Currently my routine just prints the signal which caused the exception and the exception qualifier code.
Any help would be gratefully received. Ideally I would write error handling for all processors, but at this stage I only really care about i386 and x86_64.
Thanks
Mark

The glibc functions backtrace() and backtrace_symbols(), from execinfo.h, might be of use.

You might look at http://tlug.up.ac.za/wiki/index.php/Obtaining_a_stack_trace_in_C_upon_SIGSEGV. It covers the functionality you need. However, you must link against libgdb and libdl, compile with -rdynamic (which includes more symbols in the executable), and forgo some optimizations.

There are two GNU (non-POSIX) functions that can help you, backtrace() and backtrace_symbols(): the first returns an array of function addresses and the second resolves those addresses to names. Unfortunately, the names of static functions cannot be resolved.
To get it working you need to compile your binary with the -rdynamic flag.
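For what it's worth, here is a minimal sketch of how the two functions fit into a fatal-signal handler, assuming glibc and a build with -rdynamic as described; the deliberate null-pointer write at the end is only there to exercise the handler:

/* Build with:  gcc -g -rdynamic prog.c
 * Note: backtrace()/backtrace_symbols_fd() are not formally
 * async-signal-safe, but are widely used this way for crash logging.
 */
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void crash_handler(int sig)
{
    void *frames[64];
    int n = backtrace(frames, 64);

    /* backtrace_symbols_fd() writes straight to a file descriptor,
     * avoiding the malloc() that backtrace_symbols() needs. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(128 + sig);
}

int main(void)
{
    signal(SIGSEGV, crash_handler);
    signal(SIGABRT, crash_handler);

    volatile int *p = 0;
    *p = 42;                     /* deliberate crash to exercise the handler */
    return 0;
}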

Unfortunately, there isn't a "best" method since the layout of the stack can vary depending on the CPU, the OS and the compiler used to compile your code. But this article may help.
Note that you must implement this in the child process; the parent process just gets a signal that something is wrong; you don't get a copy of the child stack.

In a comment, you state you are using gcc. This could be useful: http://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Return-Address.html#Return-Address
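As a rough sketch of what that built-in gives you (its argument must be a compile-time constant, and levels above 0 are only reliable when frame pointers are kept, e.g. with -fno-omit-frame-pointer):

#include <stdio.h>

/* Prints the raw return addresses of the current call chain.
 * Resolve them to names offline, e.g. with addr2line. */
void where_am_i(void)
{
    printf("caller:          %p\n", __builtin_return_address(0));
    printf("caller's caller: %p\n", __builtin_return_address(1));
}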

If you're fine with only getting proper backtraces when running through valgrind, then this might be an option for you:
VALGRIND_PRINTF_BACKTRACE(format, ...):
It will give you the backtrace for all functions, including static ones.
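A small sketch of how that client-request macro is typically used; it only produces output when the program actually runs under valgrind and is a no-op otherwise:

#include <valgrind/valgrind.h>

/* Emits the message plus a full backtrace into valgrind's log. */
void report_checkpoint(const char *what)
{
    VALGRIND_PRINTF_BACKTRACE("reached checkpoint: %s\n", what);
}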

Related

how to catch calls with LD_PRELOAD when unknown programs may be calling execve without passing environment

I know how to intercept, with LD_PRELOAD, calls that occur in compiled programs I may not have source for. For example, if I want to know about the calls to int fsync(int) made by some unknown program foobar, I compile a wrapper int fsync(int) that forwards to (int (*)(int))dlsym(RTLD_NEXT, "fsync") into a shared library, and then I can set the environment variable LD_PRELOAD to that library and run foobar. Assuming that foobar is dynamically linked, which most programs are, I will know about the calls to fsync.
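For readers who want to see the shape of such a wrapper, here is a minimal sketch (the file and library names are illustrative):

/* wrap.c -- build with:  gcc -shared -fPIC -o libwrap.so wrap.c -ldl
 * then run:              LD_PRELOAD=./libwrap.so ./foobar
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

int fsync(int fd)
{
    /* Look up the "real" fsync the first time we are called. */
    static int (*real_fsync)(int);
    if (!real_fsync)
        real_fsync = (int (*)(int))dlsym(RTLD_NEXT, "fsync");

    fprintf(stderr, "fsync(%d) intercepted\n", fd);
    return real_fsync(fd);
}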
But now suppose there is another unknown program foobar1 and in the source of that program was a statement like this:
execve("foobar", NULL, NULL)
that is, the environment was not passed. Now the whole LD_PRELOAD scheme breaks down?
I checked by compiling the statement above into foobar1; when that is run, the calls from foobar are not reported.
While one can safely assume most modern programs are dynamically linked, one cannot at all assume how they may or may not be using execve?
So then, the whole LD_PRELOAD scheme, which everybody says is such a great thing, does not really work unless you have the source to the programs concerned, in which case you can check the calls to execve and edit them if necessary. But if you have sources to everything, there is no need for LD_PRELOAD in the first place. LD_PRELOAD is specifically supposed to be useful when you don't have sources to the programs you are inspecting.
Where am I wrong here - how can people say, that LD_PRELOAD is useful for inspecting what unknown programs are doing??
I guess I could also write a wrapper for execve. In the wrapper, I add one more string to the original envp argument: "LD_PRELOAD=my library". This "seems" to work; I checked on simple examples.
I am not sure if I should be posting an "answer" which may very easily exceed my level of C experience.
Can somebody more experienced than me comment if this is really going to work in the long run?
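Here is a sketch of the execve() interposer idea described in the question, with several simplifications: no check for an already-present LD_PRELOAD entry, no malloc() failure handling, and a made-up, hard-coded library path.

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <unistd.h>

static char preload_entry[] = "LD_PRELOAD=/usr/local/lib/libwrap.so";  /* illustrative path */

int execve(const char *path, char *const argv[], char *const envp[])
{
    static int (*real_execve)(const char *, char *const[], char *const[]);
    if (!real_execve)
        real_execve = (int (*)(const char *, char *const[], char *const[]))
                          dlsym(RTLD_NEXT, "execve");

    /* Count the incoming environment; the caller may have passed NULL. */
    size_t n = 0;
    if (envp)
        while (envp[n])
            n++;

    /* Copy it and append our LD_PRELOAD entry. */
    char **newenv = malloc((n + 2) * sizeof *newenv);
    for (size_t i = 0; i < n; i++)
        newenv[i] = envp[i];
    newenv[n] = preload_entry;
    newenv[n + 1] = NULL;

    return real_execve(path, argv, newenv);
}

Note that this only catches calls that go through the libc execve() wrapper; a program that issues the raw system call directly would still bypass it.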

Can I run GCC as a daemon (or use it as a library)?

I would like to use GCC kind of as a JIT compiler, where I just compile short snippets of code every now and then. While I could of course fork a GCC process for each function I want to compile, I find that GCC's startup overhead is too large for that (it seems to be about 50 ms on my computer, which would make it take 50 seconds to compile 1000 functions). Therefore, I'm wondering if it's possible to run GCC as a daemon or use it as a library or something similar, so that I can just submit a function for compilation without the startup overhead.
In case you're wondering, the reason I'm not considering using an actual JIT library is because I haven't found one that supports all the features I want, which include at least good knowledge of the ABI so that it can handle struct arguments (lacking in GNU Lightning), nested functions with closure (lacking in libjit) and having a C-only interface (lacking in LLVM; I also think LLVM lacks nested functions).
And no, I don't think I can batch functions together for compilation; half the point is that I'd like to compile them only once they're actually called for the first time.
I've noticed libgccjit, but from what I can tell, it seems very experimental.
My answer is "No (you can't run GCC as a daemon process, or use it as a library)", assuming you are trying to use the standard GCC compiler code. I see at least two problems:
The C compiler deals in complete translation units, and once it has finished reading the source, compiles it and exits. You'd have to rejig the code (the compiler driver program) to stick around after reading each file. Since it runs multiple sub-processes, I'm not sure that you'll save all that much time with it, anyway.
You won't be able to call the functions you create as if they were normal statically compiled and linked functions. At the least you will have to load them (using dlopen() and its kin, or writing code to do the mapping yourself) and then call them via the function pointer.
The first objection deals with the direct question; the second addresses a question raised in the comments.
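To illustrate the second point, here is a minimal sketch of loading a freshly built snippet and calling it through a function pointer (the file name and symbol name are made up):

#include <dlfcn.h>
#include <stdio.h>

/* Assumes the snippet was compiled with something like:
 *   gcc -shared -fPIC -o /tmp/snippet.so snippet.c
 * and exports:  int snippet(int);
 * Link the host program with -ldl on older glibc.
 */
int call_snippet(int arg)
{
    void *handle = dlopen("/tmp/snippet.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    int (*fn)(int) = (int (*)(int))dlsym(handle, "snippet");
    if (!fn) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    int result = fn(arg);
    dlclose(handle);
    return result;
}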
I'm late to the party, but others may find this useful.
There exists a REPL (read-eval-print loop) for C++ called Cling, which is based on the Clang compiler. A big part of what it does is JIT compilation for C and C++. As such, you may be able to use Cling to get what you want done.
The even better news is that Cling is undergoing an attempt to upstream a lot of the Cling infrastructure into Clang and LLVM.
@acorn pointed out that you'd ruled out LLVM and co. for lack of a C API, but Clang itself does have one, and it is the only interface they guarantee stability for: https://clang.llvm.org/doxygen/group__CINDEX.html

Removing unneeded code from gcc and mingw

I noticed that MinGW adds a lot of code before calling main(). I assume it's for parsing command-line parameters, since one of those functions is called __getmainargs(). Lots of strings are also added to the final executable, such as mingwm.dll and some error strings (in case the app crashes) saying "mingw runtime error" or something like that.
My question is: is there a way to remove all this stuff? I don't need all these things. I tried tcc (Tiny C Compiler) and it did the job, but it is not cross-platform like gcc (Solaris/Mac).
Any ideas?
Thanks.
Yes, you really do need all those things. They're the startup and teardown code for the C environment that your code runs in.
Other than non-hosted environments such as low-level embedded solutions, you'll find pretty much all C environments have something like that. Things like /lib/crt0.o under some UNIX-like operating systems or crt0.obj under Windows.
They are vital to successful running of your code. You can freely omit library functions that you don't use (printf, abs and so on) but the startup code is needed.
Some of the things that it may perform are initialisation of atexit structures, argument parsing, initialisation of structures for the C runtime library, initialisation of C/C++ pre-main values and so forth.
It's highly OS-specific and, if there are things you don't want to do, you'll probably have to get the source code for it and take them out, in essence providing your own cut-down replacement for the object file.
You can safely assume that your toolchain does not include code that is not needed and could safely be left out.
Make sure you compiled without debug information, and run strip on the resulting executable. Anything more intrusive than that requires intimate knowledge of your toolchain, and can result in rather strange behaviour that will be hard to debug - i.e., if you have to ask how it could be done, you shouldn't try to do it.

Using ptrace to generate a stack dump

I am compiling C++ on *nix and I would like to generate a stack dump
a) at an arbitrary point in the program,
b) during any signal, particularly during SIGSEGV.
Google tells me that ptrace is probably the tool for the job, but I can't find any comprehensible examples of walking the stack. Getting the return address, yeah, but what about the NEXT return address? And what about extracting the symbolic name of the function at that point? Something to do with DWARF?
Many thanks if you can tell me where to go from here.
If you are using glibc, then the GNU functions backtrace() and backtrace_symbols() are the best way to do this. Walking the stack is going to be environment-specific anyway, so there's no downside to using the non-portable native functions on each platform to do it.
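For case (a), an arbitrary point in the program, a minimal sketch (link with -rdynamic so backtrace_symbols() can resolve non-static names):

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

void dump_stack(void)
{
    void *frames[64];
    int n = backtrace(frames, 64);

    /* backtrace_symbols() returns one malloc()'d block; free it once. */
    char **names = backtrace_symbols(frames, n);
    for (int i = 0; i < n; i++)
        fprintf(stderr, "  %s\n", names ? names[i] : "???");
    free(names);
}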

Optimized code on Unix?

What is the best and easiest method to debug optimized code on Unix which is written in C?
Sometimes we also don't have the code for building an unoptimized library.
This is a very good question. I had similar difficulties in the past when I had to integrate 3rd-party tools into my application. From my experience, you need at least meaningful call stacks, which requires the associated symbol files. These are essentially a list of addresses and associated function names. They are usually stripped away, and from the binary alone you won't get them... If you have these symbol files you can load them when starting gdb, or afterwards by adding them. If not, you are stuck at the assembly level...
One weird behavior: even if you have the source code, execution will jump back and forth to places you would not expect (statements may be reordered for better performance), variables may not exist anymore (optimized away!), and setting breakpoints in inlined functions is pointless (they are not there, but are part of the place where they are inlined). So even with source code, watch out for these pitfalls.
I forgot to mention, the symbol files usually have the extension .gdb, but it can be different...
This question is not unlike "what is the best way to fix a passenger car?"
The best way to debug optimized code on UNIX depends on exactly which UNIX you have, what tools you have available, and what kind of problem you are trying to debug.
Debugging a crash in malloc is very different from debugging an unresolved symbol at runtime.
For general debugging techniques, I recommend this book.
Several things will make it easier to debug at the "assembly level":
You should know the calling convention for your platform, so you can tell what values are being passed in and returned, where to find the this pointer, which registers are "caller saved" and which are "callee saved", etc.
You should know your OS "calling convention" -- what a system call looks like, which register the syscall number goes into, where the first parameter goes, etc.
You should "master" the debugger: know how to find threads, how to stop individual threads, how to set a conditional breakpoint on an individual instruction, how to single-step, and how to step into or skip over function calls, etc.
It often helps to debug a working program and a broken program "in parallel". If version 1.1 works and version 1.2 doesn't, where do they diverge with respect to a particular API? Start both programs under debugger, set breakpoints on the same set of functions, run both programs and observe differences in which breakpoints are hit, and what parameters are passed.
Write small code samples against the same interfaces (whatever is in the library's header), and call your samples instead of the optimized code, as a kind of simulation, to narrow down the scope of the code you are debugging. Furthermore, you can do error injection in your samples.
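A tiny sketch of that idea, replacing an optimized library routine with a predictable stand-in behind the same interface (the header name and function name are invented for illustration):

#include <errno.h>
#include <stdio.h>
#include "thirdparty.h"   /* hypothetical header declaring:
                             int tp_read_record(int id, char *buf, size_t len); */

/* Simulation of the optimized routine, with one injected failure. */
int tp_read_record(int id, char *buf, size_t len)
{
    if (id == 42) {          /* error injection for a specific test case */
        errno = EIO;
        return -1;
    }
    snprintf(buf, len, "record-%d", id);
    return 0;
}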
