Execution-independent backtrace addresses

In a tracing tool I'm developing, I use the glibc backtrace function to get the list of return addresses on the stack (an array of void *) at certain points in the execution. These address lists are then stored in a dictionary and associated with a unique integer.
Problem: from one execution of the program to another, the code of the program may be located somewhere else in memory, so the array of pointers returned by a particular call to backtrace will change between executions, even though the program has executed the same things.
Partial solution: get the address of a reference function (e.g., main) and store the difference between that address and each address in the backtrace, instead of the raw addresses.
New problem: if the program uses dynamically loaded libraries, the location in memory of the code of those libraries may differ from one execution to another, so the address differences relative to a reference function will still change.
Any suggestion to solve this? (I thought of using the backtrace_symbols function to get function names instead of return addresses, but it returns names only if they are available, for instance if the executable exports its symbols via -rdynamic.)

On Linux, I would also suggest looking at /proc/self/maps (or /proc/<pid>/maps if you're inspecting another process) to get the load map of all the dynamic libraries as well as the main executable. You can then map each void * in the backtrace to the object it belongs to, plus its offset from the start of that object.
Something similar may be available on other OSes (many UNIX variants have a /proc filesystem that may contain similar information).

Related

GCC symbol table for local variables on stack

Of course, symbol and type information for each variable defined in a C/C++ program is available, otherwise debuggers could not show them. But how can this information be accessed?
A lot of info about the ELF format is available, but it is about linking, and it seems to cover only global variables, not local ones on the stack.
In a remote real-time system (not running Unix), I'd like to be able to peek now and then by copying some memory into a list together with the associated variable name, and later take a look at it while the RT system goes on.
Ideally, the dump could be triggered at any time for any variable, without having to add statements to the code up front.
But how to access this information?
TL;DR: it's complicated.
You would need to build almost a complete debugger. You can watch this space. When the author gets around to step 9, you'll have an example to follow.
I'd like to be able to peek now and then by copying some memory in a list together with the associated variable name, and later on take a look at them while the RT system goes on.
RT systems do not usually lend themselves to easy debugging. The best you could probably do is take a snapshot of the entire (used portion of) the stack, and "fish out" variable values later.
To do that, you'll need to know the current values of the stack pointer and instruction pointer, the contents of the stack, and the load addresses of all ELF objects. And you'll need to re-implement a large part of a debugger (or modify an existing one).
The easiest approach might be to convert (post-process) the above info into an ELF core file, and then use an existing debugger of your choice to analyse the values. You can look at the Google user-space coredumper to see what's involved. See also this answer.

How are function calls resolved?

When a function is called, execution is shifted to a point indicated by the function pointer. At the start of execution, the executable code has to be loaded from disk.
How is the correct function pointer called? The executable code is not mapped into virtual memory at the same location every time, right? So how does the runtime make sure that a call to a function always calls the correct function even if the location of the executable code is different for each execution?
Consider the following code:
void func(void); // func defined in another dynamic library

int main(void)
{
    func();
    // How is the pointer to func known if the file
    // containing func is loaded from disk at run time?
}
The way that function pointers are resolved is really quite simple. When the compiler chain spits out an executable binary, all internal addresses are relative to a "base address." In some executable formats, this base address is specified, in others it is implied.
Basically, the compiler says that it assumes execution will start at address A. The runtime decides that it should actually start at B. The runtime then subtracts A and adds B to all non-relative addresses in the binary before executing it.
This process also applies to things like DLLs. Dynamic libraries store a list of addresses relative to the base pointer that point to each exported function. Names are often also associated with the list, so that you can reference a function by name. When the library is loaded, the address translation is applied to everything, including the address table. At that point, a caller just has to look up the address in the table that was translated, and then they'll have the absolute address of a given function.
In older operating systems, long ago (and, in some cases, even today), well before things like address space layout randomization, memory paging, and multitasking, a program would simply be copied to the specified base address in memory, where it would then be executed.
In modern operating systems, one of a few things can happen, depending on the capabilities or requirements of the platform and application. Most operating systems handle native binaries as I described in the second paragraph, however some applications (such as running 16-bit x86 on later architectures) can involve more complex strategies. One such strategy involves giving the code a static virtual address space. This has various limitations, such as the need for an emulation/compatibility layer if you want it to interact with external code (like a windowed console or the network stack).
As the need for 16-bit support declines though, that sort of scheme is used less and less. Giving all programs their own unique address space (rather than letting it overlap) promotes the use of shared libraries, services, and other shared goodies.
In general, function calls are resolved statically. When you compile a file, a .o (or .obj) object file is created first. The addresses of functions defined in that file are known; references to "extern" functions are left unresolved.
Then, linking is performed. Linking completes the address mapping for every "extern" function; if any names are missing, a linking error occurs.
How is the correct function pointer called?
A function pointer is a function address, and in an expression a function name decays to that same address. Both are values, not l-values: &func and func yield exactly the same pointer.
Loading of PE (or ELF) files is the process of bringing the executable into memory. That is too much to explain here; just for clarification, consider that every function has its own address in the process address space.
You can print func and see whether it has the same address on every execution, like this:
printf("%p", (void *)func);
For me it's the same address every time (in terms of virtual memory).

In C is a function loaded into memory when it is first called or when the program starts? And can it be unloaded from memory?

For example:
If I have a function named void Does_Stuff(int arg) and call it in the main function, is void Does_Stuff loaded into memory ONLY when it is first called? Or is it loaded into memory during program initialization?
And after calling Does_Stuff in main, can I manually unload it from memory?
For reference the operating system I am running is Windows 7 and I am compiling with MinGW.
In simple terms (with the usual depends-on-various-platform-things caveat), the code for your normal, global C function is "loaded into memory" at the time the program is loaded. You cannot request that it be "unloaded".
That said, as Hans mentions in a comment, the OS at a lower level is in charge of what bits of stuff are important enough to be present in physical RAM, and may choose to "page out" memory that isn't being used frequently. This isn't per-function, and has no knowledge of the structure of your code. So in that sense the function's code may happen at various times exist in actual RAM or not. But this is a level below the application's execution, where a C function is always "present and available".
DLLs called by your code could conceivably come and go as you call them. But your main program's *.exe is loaded in full at start time.
Though the exact details depend on the compiler, linker, platform and implementation, typically all the functions in your program are loaded into memory by the executable loader of the OS and reside there until the program terminates. This memory is also typically static (though certain programs can and do rewrite parts of themselves) and read-only.
Now, every time you call a function and pass it an argument, that argument is added to memory (in principle a different region of memory than where the functions are), and removed again when the function call returns (this is a simplified picture).
On some platforms (for instance, DOS) your whole program resides in memory while it runs. On other platforms, it might be swapped out of memory while not running (for instance, on ancient UNIX versions). On most platforms your program is split into pages, usually of 4 kilobytes. When you access a page that is not yet loaded, the operating system brings in the required page transparently (i.e. you don't notice it at all). If the operating system runs out of memory, it may swap out single pages. You cannot control this at all from inside your program.
If you want to control what is in memory and what is not, you might want to read about memory mapping and the mmap system call.

What's inside the stack?

If I run a program, just like
#include <stdio.h>

int main(int argc, char *argv[], char *env[]) {
    printf("My references are at %p, %p, %p\n",
           (void *)&argc, (void *)&argv, (void *)&env);
}
We can see that those regions are actually in the stack.
But what else is there? If we run a loop over the values above these addresses on Linux 3.5.3 (for example, until a segfault), we see some odd numbers, and roughly two regions separated by a run of zeros, perhaps there to prevent accidentally overwriting the environment variables.
Anyway, in the first region there must be a lot of numbers, such as all the frames for each function call.
How could we distinguish the end of each frame, where the parameters are, where the canary if the compiler added one, return address, CPU status and such?
Without some knowledge of the layout, you only see bits, or numbers. While some of the regions are machine-specific, a large number of the details are fairly standard.
If you didn't move too far outside of a nested routine, you are probably looking at the call stack portion of memory. With some generally considered "unsafe" C, you can write up fun functions that access function variables a few "calls" above, even if those variables were not "passed" to the function as written in the source code.
The call stack is a good place to start, as 3rd party libraries must be callable by programs that aren't even written yet. As such, it is fairly standardized.
Stepping outside your process's memory boundaries gives you the dreaded segmentation violation, as the memory-protection hardware detects an attempt by the process to access unauthorized memory. malloc does a little more than "just" return a pointer: on systems with memory-protection features, the kernel also marks the allocated pages as accessible to the process, and every memory access is checked against those permissions.
If you keep following this path, sooner or later you'll get interested in either the kernel or the object format. It's much easier to investigate how things are done on Linux, where the source code is available: having the source lets you avoid reverse-engineering the data structures from their binaries. When starting out, the hard part will be learning how to find the right headers. Later, it will be learning how to poke around and possibly change things that, under non-tinkering conditions, you probably shouldn't be changing.
PS. You might consider this memory "the stack" but after a while, you'll see that really it's just a large slab of accessible memory, with one portion of it being considered the stack...
The contents of the stack are basically:
Whatever the OS passes to the program.
Call frames (also called stack frames, activation areas, ...)
What does the OS pass to the program? A typical *nix will pass the environment, arguments to the program, possibly some auxiliary information, and pointers to them to be passed to main().
In Linux, you'll see:
a NULL
the filename for the program.
environment strings
argument strings (including argv[0])
padding full of zeros
the auxv array, used to pass information from the kernel to the program
pointers to environment strings, ended by a NULL pointer
pointers to argument strings, ended by a NULL pointer
argc
Then, below that are stack frames, which contain:
arguments
the return address
possibly the old value of the frame pointer
possibly a canary
local variables
some padding, for alignment purposes
How do you know which is which in each stack frame? The compiler knows, so it simply treats each location in the stack frame appropriately. Debuggers can use per-function annotations in the form of debug info, if available. Otherwise, if there is a frame pointer, you can identify things relative to it: local variables sit below the frame pointer, arguments above it. Otherwise you must use heuristics: things that look like code addresses are probably code addresses, but sometimes this results in incorrect and annoying stack traces.
The content of the stack will vary depending on the architecture ABI, the compiler, and probably various compiler settings and options.
A good place to start is the published ABI for your target architecture, then check that your particular compiler conforms to that standard. Ultimately you could analyse the assembler output of the compiler or observe the instruction level operation in your debugger.
Remember also that a compiler need not initialise the stack, and will certainly not "clear it down" when it has finished with it, so when stack memory is allocated to a process or thread it might contain any value. Even at power-on, SDRAM, for example, will not contain any specific or predictable value; and if the physical RAM address has previously been used by another process since power-on, or by an earlier function call in the same process, it will contain whatever was left there. So just looking at the raw stack does not tell you much.
Commonly, a generic stack frame may contain the address that control will jump to when the function returns, the values of the parameters passed, and the values of the auto (local) variables in the function. However, the ARM ABI, for example, passes the first four arguments to a function in registers R0 to R3 and holds the return address in the LR register for leaf functions, so it is not as simple in all cases as the "typical" implementation I have suggested.
The details are very dependent on your environment. The operating system generally defines an ABI, but that's in fact only enforced for syscalls.
Each language (and each compiler even if they compile the same language) in fact may do some things differently.
However there is some sort of system-wide convention, at least in the sense of interfacing with dynamically loaded libraries.
Yet, details vary a lot.
A very simple "primer" could be http://kernelnewbies.org/ABI
A very detailed and complete specification you could look at to get an idea of the level of complexity and details that are involved in defining an ABI is "System V Application Binary Interface AMD64 Architecture Processor Supplement" http://www.x86-64.org/documentation/abi.pdf

What is the need of randomizing memory addresses for loading libraries?

ldd displays the memory addresses where the shared libraries are linked at runtime
$ cat one.c
#include<stdio.h>
int main() {
printf ("%d", 45);
}
$ gcc one.c -o one -O3
$ ldd one
linux-gate.so.1 => (0x00331000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x00bc2000)
/lib/ld-linux.so.2 (0x006dc000)
$
From this answer to another question,
... The addresses are basically random numbers. Before secure implementations were devised, ldd would consistently indicate the memory addresses where the program sections were loaded. Since about five years ago, many flavors of Linux now intentionally randomize load addresses to frustrate would-be virus writers, etc.
I do not fully understand how these memory addresses can be used for exploitations.
Is the problem something like "If the addresses are fixed, one can put some undesirable code at that address which would be linked as if it was a library" or is it something more than this?
"If the addresses are fixed, one can put some undesirable code at that address which would be linked as if it was a library"
Yes.
Also. Buffer overflow exploits require a consistent memory model so that the bytes that overflow the buffer do known things to known parts of the code.
http://www.corewars.org/ A great illustration of the principle.
Some vulnerabilities allow overwriting some address (stack overflows allow overwriting return addresses; exploits for heap overflows typically overwrite SEH pointers on Win32 and addresses (GOT entries) of dynamically called functions on Linux, ...). So the attacker needs to make the overwritten address point to something interesting. To make this more difficult, several counter-measures have been adopted:
Non-executable stacks prevents exploits from just jumping to some code the attacker has put on the stack.
W^X segments (segments which can never be writable and executable at the same time) prevents the same for other memory areas.
Randomized load addresses for libraries and position-independent executables decrease the probability of successful exploitation via return-into-libc and return-oriented-programming techniques, ...
Randomized load addresses also prevent attackers from knowing in advance where to find some interesting function (e.g: imagine an attacker that can overwrite the GOT entry and part of the message for the next logging call, knowing the address of system would be "interesting").
So, you have to view load address randomization as another counter-measure among many (several layers of defense and all that).
Also note that exploits aren't restricted to arbitrary code execution. Getting a program to print some sensitive information instead of (or in addition to, think of string truncation bugs) some non-sensitive information also counts as an exploit; it would not be difficult to write some proof-of-concept program with this kind of vulnerability where knowing absolute addresses would make reliable exploits possible.
You should definitely take a look at return-into-libc and return-oriented-programming. These techniques make heavy use of knowledge of addresses in the executable and libraries.
And finally, I'll note there are two ways to randomize library load addresses:
Do it on every load: this makes (some) exploits less reliable even if an attacker can obtain info about addresses on one run and try to use that info on another run.
Do it once per system: this is what prelink -R does. It prevents attackers from using generic information for, e.g., all Red Hat 7.2 boxes. Obviously, its advantage is that it doesn't interfere with prelink :).
A simple example:
If, on a popular operating system, the standard C library was always loaded at address 0x00100000, and a recent version of that library had the system function at offset 0x00000100, then anyone able to exploit a flaw in a program on that OS (such as a web server) and make it write data to the stack (via a buffer overrun) would know that writing 0x00100100 to the place on the stack where the current function expects its return address would very likely cause system to be called when the current function returns. While they still haven't done everything needed to make system execute what they want, they are close: there are tricks, writing more data to the stack above the address mentioned, that have a high likelihood of producing a valid string pointer and a command (or series of commands) to be run by this forced call to system.
By randomizing the addresses at which libraries are loaded the attacker is more likely to just crash the web server than gain control of the system.
The typical method is by a buffer overrun, where you put a particular address on the stack, and then return to it. You typically pick an address in the kernel where it assumes the parameters you've passed it on the stack have already been checked, so it just uses them without any further checking, allowing you to do things that normally wouldn't be allowed.
