Manipulating x64 Unwind Info To Match Assembly Hook


Edit: I appear to have been mistaken, the backtrace works wonderfully from anywhere on Linux -- it is only when remote debugging from gdb on ubuntu to remote windows that the stacktrace gets absolutely destroyed after entering one of the memory allocation functions in msvcrt... dammit microsoft.
And this happens for both 64bit and 32bit windows, so I'm not sure this is related to the unwind information...
Edit: It appears adding -g3 and -Og has helped with part of the issue in some programs but the problem still persists in other programs, cannot post their source here as it is IP of my company -- sorry!
Background
I am using gcc to compile ubuntu->ubuntu and mingw to compile ubuntu->windows.
I have created a cross platform (linux + windows) memory tracking & leak detection library which hooks malloc/calloc/realloc/free with an assembly bytepatch on the first instructions (not IAT/PLT hooking).
The hook redirects to a gate which checks if the hooks are enabled in the current thread and redirects to the memory tracking hook function if they are, otherwise it just redirects to the trampoline of the real function if they are disabled for that thread.
The library works great and detects leaks on linux/windows (probably would work on mac but I don't have one).
I use the library to programmatically detect leaks from within my code, I can install callbacks on the memory allocation routines and programmatically raise breakpoints (by looping and waiting for debugger to attach then executing asm("int3")) inside the callbacks so that I can attach to my program while it's inside of a call that leaks memory.
Everything works great up until I try to view a backtrace from within my callback. I understand this is probably because the unwind information no longer matches my stack, since the hook routines I have inserted add new frames and data.
Edit: If I am mistaken about the unwind info mismatching the stack being the cause of the incorrect backtrace then please correct me!
The Question
Are there any small hacks I can use to trick GDB into correctly rebuilding the backtrace from within my hook callbacks?
I understand that I can manually walk and edit the unwind info with libdwarf or something but I imagine that would be incredibly cumbersome and large.
So I am wondering if perhaps there is a hack or a cheat I can do which would trick GDB into properly rebuilding the backtrace?
If there are no easy hacks or tricks then what are all of my options for fixing this issue?
Edit: Just to clear up the exact call order of everything:
program
V
malloc
V
hook_malloc -> hooks are disabled -> return malloc trampoline -> real malloc -> program
V
hooks are enabled
V
Call original malloc -> malloc trampoline -> real malloc -> returns to hook
V
Record memory size/info etc from malloc
V
Call user defined callback -> User defined callback -> returns to hook
V
return to program
It is the "User Defined Callback" where I want to capture a backtrace
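The call order above can be sketched in plain C. This is only an illustration of the gate logic, not the library's actual code: the names gate_malloc, set_hooks_enabled, set_alloc_callback, counting_callback and hook_demo are hypothetical, and a plain pointer to malloc stands in for the trampoline that the byte patch would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback type: receives the requested size and the result. */
typedef void (*alloc_callback_fn)(size_t size, void *result);

/* Per-thread switch, so each thread can turn tracking on or off by itself. */
static _Thread_local int hooks_enabled = 0;
static alloc_callback_fn user_callback = NULL;

/* Stand-in for the trampoline: here it is simply the real malloc. */
static void *(*real_malloc)(size_t) = malloc;

void set_hooks_enabled(int enabled) { hooks_enabled = enabled; }
void set_alloc_callback(alloc_callback_fn cb) { user_callback = cb; }

/* The gate: what a patched malloc would jump to. */
void *gate_malloc(size_t size)
{
    if (!hooks_enabled)
        return real_malloc(size);      /* hooks disabled: straight through */

    hooks_enabled = 0;                 /* avoid re-entering while tracking */
    void *result = real_malloc(size);  /* call the original via trampoline */
    /* ... record size/info for leak tracking here ... */
    if (user_callback)
        user_callback(size, result);   /* the frame where a backtrace is wanted */
    hooks_enabled = 1;
    return result;
}

/* Example callback that only counts how often it was invoked. */
static size_t callback_count = 0;
static void counting_callback(size_t size, void *result)
{
    (void)size;
    (void)result;
    callback_count++;
}

/* Small demonstration: returns how many times the callback ran. */
size_t hook_demo(void)
{
    callback_count = 0;
    set_alloc_callback(counting_callback);

    set_hooks_enabled(0);
    void *untracked = gate_malloc(16); /* not recorded */
    assert(callback_count == 0);

    set_hooks_enabled(1);
    void *tracked = gate_malloc(38);   /* recorded, callback fires once */
    set_hooks_enabled(0);

    free(untracked);
    free(tracked);
    return callback_count;
}
```

Disabling the flag while the callback runs is one way to keep allocations made by the callback itself from recursing into the hook. Whether the real library does this is not stated in the question.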

Apparently this is the same problem as the one described in "GDB Windows ?? in Backtraces".
The solution there was simply to add -g3 to the mingw compile flags, and voilà, I have non-broken backtraces!
Edit: Nevermind, this isn't the whole answer. It appears like this fix worked for some test programs, but other programs still appear to show incorrect backtraces like:
(gdb) bt
#0 malloc_callback (s=38, rv=0x2c5058) at test_dll.c:729
#1 0x000000000040731d in hook_malloc_raw (file=0x410ea1 <__FUNCTION__.63079+55> "", function=0x410ea1 <__FUNCTION__.63079+55> "", line=0, s=38, rv=8791758343065)
#2 0x0000000000407367 in hook_malloc (s=38)
#3 0x000007fefda20b9e in ?? ()
#4 0x0000000000000026 in ?? ()
#5 0x0000000000410ea1 in __FUNCTION__.63079 ()
#6 0x0000000000000000 in ?? ()
Obviously Frame #4 isn't actually a stack frame, and I'm not sure why frame #5 is labeled "__FUNCTION__.63079".
Edit2: If people are going to downvote this at least leave a comment saying why

Related

How can I figure out the full call stack with the current IP and BP registers?

I am doing a simple experiment on Ubuntu LTS 16.04.1 X86_64 with GCC 5.4.
The experiment is to get full call stack of a running C programme.
What I have done is:
Using ptrace's PTRACE_ATTACH & PTRACE_GETREGS to suspend a running C programme and get its current IP and BP.
Using PTRACE_PEEKDATA to get data at [BP] and [BP+4] (or +8 for 64 bits target), so that I can have the calling function's BP and the return address.
Because the BPs are a chain, I should be able to get a sequence of return addresses. After that, by analyzing the address sequence with listing file or dwarf data, I should finally be able to figure the full call stack. Something like 'main --> funcA --> funcB --> funcC ...'.
My problem is, this works fine if the call stack is totally inside my test programme's code. I mean the case when every function is written by me. However, if the test programme is stopped in a CRT or system API, such as 'scanf' or 'sleep', the BP chain no longer works.
I checked the disassembly and noticed that CRT or system API functions do not establish a stack frame with 'push ebp' and 'mov ebp,esp' the way my functions do. No wonder the above approach does not work. But I cannot explain why GDB still works properly in such a case. So there must be many things I do not know about a Linux C programme's call stack.
Could you figure my mistake/misunderstanding? Or could you simply suggest some articles/links for me to read? Thank you very much.

Because the BPs are a chain
They are not. It used to be that a frame pointer chain was used on i386, but for a few years now GCC defaults to -fomit-frame-pointer in optimized compiles even on i386. On x86_64 the -fno-omit-frame-pointer was never the default in optimized code.
this works fine if the call stack is totally inside my test programme's code.
This will only work if you compile without optimization (or with optimization if you also use -fno-omit-frame-pointer).
I cannot explain why GDB can still work properly in such case
GDB (and libunwind) uses DWARF unwind info, which you can examine with readelf -wf a.out.

Why is _dl_fixup called before dynamic linker start?

I'm trying to understand how glibc dynamic linker works. I know that _dl_fixup is called in _dl_runtime_resolve, and solves the relocation problems. So I thought it's called only after linker starts and has loaded some libraries. But when I do some print work in it, I find the function is called even before _dl_start. It's confusing: why it was called? What work it has done?
I did some print work, the function is working on symbols like strncpy, fopen, fread64 and so on, but the object name(l->l_name) seems to be null.
I use gdb to debug the linker, and I think gdb itself used _dl_fixup to complete some tasks. If I didn't use gdb, the _dl_fixup will be called only after _dl_start.

So I thought it's called only after linker starts and has loaded some libraries
That is correct.
I find the function is called even before _dl_start
This is not correct: _dl_fixup is called only after _dl_start.
Unfortunately you didn't provide any details on how you came to the incorrect conclusion, so it's impossible to tell you where you made a mistake, but you did make (at least one) mistake.

How to debug a crash before main?

My program links statically to many libraries and crashes before getting to main in GDB. How do I diagnose what the problem is?

It's a good bet that LD_DEBUG can help you here. Try this: LD_DEBUG=all ./a.out. This will allow you to easily identify the library which is being loaded when your program crashes.
(Edit: if it wasn't clear, a.out is meant to refer to a generic binary file -- in this case, replace it with the name of your executable).
Edit 2:
To clarify, LD_DEBUG is an environment variable which is examined by the dynamic linker when a program begins execution. If LD_DEBUG is set to some value, the dynamic linker will output a lot of information about the dynamic libraries being loaded during program execution, symbol binding, and so on.
For starters, execute the following on your machine:
LD_DEBUG=help ls
You will see the valid options for LD_DEBUG on your system listed. The most verbose setting is all, which will display all available information.
Now, to use this is as simple as the ls example, only replace ls with the name of your program. There is no need for gdb in order to use LD_DEBUG, as it is functionality provided solely by the dynamic linker, and not by gdb.

It may crash because some component throws an exception and nobody catches it since main() hasn't been entered yet. Set a breakpoint on throwing an exception:
catch throw
run
(If catch throw doesn't work the first time you start it, run the program once to let it load the dynamic libraries, then do catch throw and run again.)

This post has the answer, you have to set a breakpoint before main in the crt0 startup code:
Using GDB without debugging symbols on x86?

starti
starti breaks at the very first instruction executed, see also: Stopping at the first machine code instruction in GDB
An alternative if your GDB is not new enough:
break _start
if you know that the name of the entry point function is _start, or:
info files
search for Entry point:
Entry point: 0x400440
and run:
break *0x400440
TODO: find out how to compile crt* objects with debug symbols and step into them: How to compile my own glibc C standard library from source and use it?

Start taking the libraries out one by one until it stops crashing.
Then examine the culprit.

I haven't run into this in C but if you link to a c++ library static initialization can crash. You can create it easily by having an assert in a constructor of a static scope variable.

If you can, link your program dynamically instead of statically and follow @denniston.t's answer. The debug trace from the dynamic linker may help to fix this problem.

Implementing traceback on i386

I am currently porting our code from an alpha (Tru64) to an i386 processor (Linux) in C.
Everything has gone pretty smoothly up until I looked into porting our
exception handling routine. Currently we have a parent process which
spawns lots of sub processes, and when one of these sub-processes
fatal's (unfielded) I have routines to catch the process.
I am currently struggling to find the best method of implementing a traceback routine which can list the function addresses in the error log. Currently my routine just prints the signal which caused the exception and the exception qualifier code.
Any help would be greatly received, ideally I would write error handling for all processors, however at this stage I only really care about i386, and x86_64.
Thanks
Mark

The glibc functions backtrace() and backtrace_symbols(), from execinfo.h, might be of use.

You might look at http://tlug.up.ac.za/wiki/index.php/Obtaining_a_stack_trace_in_C_upon_SIGSEGV. It covers the functionality you need. However you must link against libgdb and libdl, compile with -rdynamic (includes more symbols in the executable), and forgo the use of some optimizations.

There are two GNU (non-POSIX) functions that can help you - backtrace() and backtrace_symbols() - first returns array of function addresses and second resolves addresses to names. Unfortunately names of static functions cannot be resolved.
To get it working you need to compile your binary with -rdynamic flag.

Unfortunately, there isn't a "best" method since the layout of the stack can vary depending on the CPU, the OS and the compiler used to compile your code. But this article may help.
Note that you must implement this in the child process; the parent process just gets a signal that something is wrong; you don't get a copy of the child stack.

In a comment, you state you are using gcc. This http://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Return-Address.html#Return-Address could be useful.
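As a small illustration of the builtin that page documents, the sketch below records where a function was called from. It assumes GCC or Clang. The function names return_address_of_caller and log_call_site are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the address this function will return to, i.e. a location
 * inside its caller. noinline keeps the function in its own frame. */
__attribute__((noinline))
void *return_address_of_caller(void)
{
    return __builtin_return_address(0);
}

/* Logs the call site of the current function to stderr. */
__attribute__((noinline))
void log_call_site(const char *what)
{
    void *call_site = __builtin_return_address(0);

    assert(call_site != NULL);
    fprintf(stderr, "%s reached from code near %p\n", what, call_site);
}
```

Only level 0 is used here. The GCC documentation warns that nonzero levels can give unpredictable results or crash, so this builtin does not replace a real unwinder for full tracebacks.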

If you're fine with only getting proper backtraces when running through valgrind, then this might be an option for you:
VALGRIND_PRINTF_BACKTRACE(format, ...):
It will give you the backtrace for all functions, including static ones.

__libc_lock_lock is segfaulting

I am working on a piece of code which uses regular expressions in c.
All of the regex stuff is using the standard regex c library.
On line 246 of regexec.c, the line is
__libc_lock_lock(dfa->lock);
My program is segfaulting here and I cannot figure out why. I was trying to find where __libc_lock_lock was defined and it turns out it is a macro in bits/libc-lock.h. However, the macro isn't actually defined to be anything, just defined.
Two questions:
1) Where is the code that is run when __libc_lock_lock is called? (I know it must be replaced with something, but I don't know where that would be.)
2) If dfa is a re_dfa_t object which is cast from a C string, which is the buffer member of the regex_t object type, it will not have any member lock. Is this what is supposed to happen?
It really seems like there is some kind of magic going on here with this __libc_lock_lock.

If the segfault is in libc then you can be 99.9% sure of one of the following:
You are doing something wrong with the API
You have at some previous point clobbered or corrupted memory used by libc, and this is a delayed effect. (Thanks Tyler!)
You are doing something that is pushing the API's capability
You are a developer testing the current trunk with new changes in the API implementation
I suspect that the first is the cause. Posting your API usage and your library version might help. The Regexp API in libc is pretty stable.
Look up debugging with gdb to find a stack trace of the execution path leading to the segfault, and install the glibc-devel packages for the symbols. If the segfault is in (or out) of libc ... then you have done something bad (not initialized an opaque pointer for example)
[aiden@devbox ~]$ gdb ./myProgram
(gdb) r
... Loads of stuff, segfault info ..
(gdb) bt
Will print the stack and function names that led to the segfault. Compile your source with the '-g' debug flag to keep important debugging information.
Get an authoritative source for API usage/examples!
Good Luck

In answer to your first question:
The macro is defined in libc-lock.h; its relative path is sysdeps/mach/bits on the glibc release I use (2.2.5). Lines 67/68 from that file are
/* Lock the named lock variable. */
#define __libc_lock_lock(NAME) __mutex_lock (&(NAME))

Run your code in gdb until you get to the segfault. Then do a backtrace to find out where it was.
Here is the set of commands you will type to do this:
gdb myprogram
run
***Make it crash***
backtrace
Typing backtrace will print the call stack and will show you what path the code has taken to get to the point where it is segfaulting.
You can go up and down in the stack to your code by typing 'up' or 'down' respectively. Then you can examine variables in that scope.
So for instance, if your backtrace command prints this:
linux_black_magic
more_linux
libc
libc
yourcode.c
Type 'up' a few times so that the stack frame is in your code instead of linux's. You can then examine variables and memory that your program is operating on. Do this:
print VariableName
x/10x &VariableName
That will print the value of the variable and then will print a hex dump of memory starting at the variable.
Those are some general techniques to use with gdb and debugging, post more details for more detailed answers.
