Modify Relocations Conditionally at Runtime - linker

TL;DR
I want to overwrite .got, .got.plt,... to point to the correct addresses, because the linker makes wrong decisions.
I need to use two different dynamic allocation functions (i.e., malloc(),...) in the code. The appropriate one will be selected based on some condition during program execution. Therefore, I provided two glibc instances and used the LD_PRELOAD trick. The LD_PRELOAD value is something like the following:
LD_PRELOAD=multiplexer_library.so:glibc1.so:glibc2.so
where multiplexer_library.so chooses the correct library. glibc1.so is accessed using dlsym(RTLD_NEXT, "malloc"), and glibc2.so is accessed using dlopen() followed by dlsym(). The same goes for calloc(), etc.
The problem is that the second malloc will interfere with the first one. This happens because the linker maps all dynamic relocations of the latter to the former. For example, when glibc2.so calls through the global function pointer __morecore, the reference is bound to the __morecore in glibc1.so and therefore to its target, the __default_morecore() function in glibc1.so. The relocation entry for this global variable in glibc2.so is as follows:
0000003addc0 085600000006 R_X86_64_GLOB_DAT 00000000003af4d8 __morecore@@GLIBC_2.2.5 + 0
I traced the execution in Pin. Of more than 1370 relocation entries, 125 were accessed by my code during dynamic allocations. For example, an important entry is the global variable __curbrk, which determines the brk boundary for dynamic allocations (note that I provided an isolated brk region for each library at the system call level). Sharing it will obviously corrupt the allocations, because both allocators use the same __curbrk. The relocation entry for __curbrk in glibc1.so is shown below:
0000003adeb8 044400000006 R_X86_64_GLOB_DAT 00000000003b10b8 __curbrk@@GLIBC_2.2.5 + 0
I tried to rename these conflicting names, but 125 is a huge number and the code is hard to grasp because it is full of nested macros, which makes the manual rename solution practically infeasible.
IIUC, for each relocation entry, there exists a memory address (e.g., somewhere in .got,...) where the linker will put the target relocated address and this address is exclusive to each shared library. I will call that address the TARGET HOLDER. For example, in the __curbrk case, the linker placed the runtime address for the __curbrk variable of glibc1.so in the target holder of __curbrk in both glibc1.so and glibc2.so. If this is right, at runtime, I will have to update the value in the target holder of __curbrk in glibc2.so to hold the runtime address of __curbrk variable of glibc2.so. And to completely resolve the problem, this should be done for all of the 125 relocation entries accessed by malloc(),.... Is it possible?
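To make the "target holder" idea concrete, here is a toy model in plain C. It is only a sketch of the question's mental model, not of glibc or the dynamic linker: the struct toy_lib, its fields, and all toy_* functions are hypothetical names invented for illustration. Each "library" has its own variable and its own pointer slot; binding mimics interposition (every slot points at the first definition), and rebinding mimics the proposed runtime fix.

```c
#include <stddef.h>

/* Toy model: each "library" owns a variable and a pointer slot
   (the "target holder") through which its code reaches that variable. */
struct toy_lib {
    long curbrk;        /* this library's own definition */
    long *curbrk_slot;  /* GOT-like slot used by this library's code */
};

/* Mimics symbol interposition: every slot is bound to the first
   definition found in search order (here: libs[0]). */
void toy_bind_interposed(struct toy_lib *libs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        libs[i].curbrk_slot = &libs[0].curbrk;
}

/* Mimics the proposed fix: point one library's slot at its own variable. */
void toy_rebind_local(struct toy_lib *lib)
{
    lib->curbrk_slot = &lib->curbrk;
}

/* "Library code" always goes through the slot, never the variable directly. */
void toy_advance_brk(struct toy_lib *lib, long delta)
{
    *lib->curbrk_slot += delta;
}
```

With two toy libraries bound by toy_bind_interposed(), a call to toy_advance_brk() on the second one changes the first library's curbrk; after toy_rebind_local() on the second library, the same call changes only its own. Whether doing this to a real glibc is sufficient is a separate matter that this sketch does not address.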
Any help is appreciated!

Therefore, I provided two glibc instances and used the LD_PRELOAD trick.
These answers explain why using LD_PRELOAD=glibc.so cannot work (at least not reliably).
I decided to use two separate glibc libraries and use each of them for a dedicated region. The region is isolated in each library.
This can't possibly work, because designers of GLIBC don't support this approach. You'll need to do something else.

Related

How are shared libraries referenced by various programs?

I understand that shared libraries are loaded into memory and used by various programs.
How can a program know where in memory the library is?
When a shared library is used, there are two parts to the linkage process. At compile time, the linker program, ld in Linux, links against the shared library in order to learn which symbols are defined by it. However, none of the code or data initializers from the shared library are actually included in the ultimate a.out file. Instead, ld just records which dynamic libraries were linked against and the information is placed into an auxiliary section of the a.out file.
The second phase takes place at execution time, before main gets invoked. The kernel loads a small helper program, ld.so, into the address space and this gets executed. Therefore, the start address of the program is not main or even _start (if you have heard of it). Rather, it is actually the start address of the dynamic library loader.
In Linux, the kernel maps the ld.so loader code into a convenient place in the process address space and sets up the stack so that the list of required shared libraries (and other necessary info) is present. The dynamic loader finds each of the required libraries by looking at a sequence of directories, which are often specified in the LD_LIBRARY_PATH environment variable. There is also a pre-defined list which is hard-coded into ld.so (and additional search places can be hard-coded into the a.out during link time). For each of the libraries, the dynamic loader reads its header and then uses mmap to create memory regions for the library.
Now for the fun part.
Since the actual libraries used at run-time to satisfy the requirements are not known at link-time, we need to figure out a way to access functions defined in the shared library and global variables that are exported by the shared library (this practice is deprecated since exporting global variables is not thread-safe, but it is still something we try to handle).
Global variables are assigned a static address at link time and are then accessed by absolute memory address.
For functions exported by the library, the user of the library is going to emit a series of call assembly instructions, which reference an absolute memory address. But, the exact absolute memory address of the referenced function is not known at link time. How do we deal with this?
Well, the linker creates what is known as a Procedure Linkage Table, which is a series of jmp (assembly jump) instructions. The target of the jump is filled in at run time.
Now, when dealing with the dynamic portions of the code (i.e. the .o files that have been compiled with -fpic), there are no absolute memory references whatsoever. In order to access global variables which are also visible to the static portion of the code, another table called the Global Offset Table is used. This table is an array of pointers. At link time, since the absolute memory addresses of the global variables are known, the linker populates this table. Then, at run time, dynamic code is able to access the global variables by first finding the Global Offset Table, then loading the address of the correct variable from the appropriate slot in the table, and finally dereferencing the pointer.
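The "filled in at run time" step can be imitated in ordinary C with a function-pointer slot. The sketch below assumes lazy binding (the slot first points at a resolver stub that patches the slot on the first call), which is one common way to fill such a table and is not spelled out in the answer above. All names (square_slot, square_stub, real_square, call_square) are made up for illustration; a real PLT is generated by the linker and resolved by ld.so.

```c
typedef int (*func_t)(int);

static int resolver_calls = 0;          /* how often the slow path ran */

/* Stands in for the function that lives in the shared library. */
static int real_square(int x) { return x * x; }

static int square_stub(int x);          /* forward declaration */

/* Plays the role of the table entry: starts out pointing at the stub. */
static func_t square_slot = square_stub;

/* First call lands here: "resolve" the symbol, patch the slot,
   then forward the call to the real target. */
static int square_stub(int x)
{
    resolver_calls++;
    square_slot = real_square;
    return square_slot(x);
}

/* Callers never name real_square; they always jump through the slot. */
int call_square(int x) { return square_slot(x); }

int get_resolver_calls(void) { return resolver_calls; }
```

After the first call_square(), later calls go straight to real_square() through the patched slot, so the resolver runs only once.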

GCC on ARM Cortex M3: Calling functions from specific addresses

I need to call functions at specific addresses (similar to Double function indirection in C, but not exactly the same). I could pull the pointers from the mapping table and manipulate dynamically generated function pointers, which I prefer to avoid. E.g., I want to avoid this type of call:
int (*compute_volume)(void) = (int (*)(void)) 0x20001000; /* function pointer set to a fixed address */
int vol = (*compute_volume)();
Instead, I would prefer to use some sort of linker provided symbols or other methods to achieve the following, except that the compute_volume() function is provided by a different image, perhaps something like this:
extern int compute_volume(void);
vol = compute_volume();
In other words, I intend to split my code into multiple images, thus reducing the need for modifying or overwriting the flash every time a symbol or computation changes.
Any suggestions/ideas?
You can define a jump table which always resides in the same flash region (I think you can define that region in the linker script and with pragmas in the code); when called, it jumps to the desired function.
In firmware part I you only define symbols which refer to "passing" functions addresses (if you will always keep it in the same region it will make future updates MUCH easier). In firmware part II you create jump table which resides in the address space you were referring to in firmware part I and calls the actual functions.
I am not 100% sure I have described it correctly but this should give you some notion how to solve your problem. The link Ring Ø provided should help you with placing jump table code in one place.
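A minimal host-side sketch of such a jump table follows. The struct layout, the version field, and the names api_table, api_table_v1 and compute_volume_impl are assumptions for illustration only. On a real target the table would be placed at a fixed flash address through the linker script, and image I would reach it through a constant pointer to that agreed address; here an ordinary static object stands in so the code runs anywhere.

```c
#include <stdint.h>

/* Table layout shared by both firmware images (hypothetical). */
struct api_table {
    uint32_t version;             /* lets image I check compatibility */
    int (*compute_volume)(void);  /* one entry per exported function */
};

/* ---- image II: provides the implementation and the table ---- */
static int compute_volume_impl(void)
{
    return 42;                    /* placeholder computation */
}

/* On target this object would live in a dedicated linker section at a
   fixed address; on the host it is just a static constant. */
static const struct api_table api_table_v1 = { 1u, compute_volume_impl };

/* ---- image I: knows only where the table is ---- */
/* On target this would be a cast of the agreed fixed address. */
static const struct api_table *const API_TABLE = &api_table_v1;

/* Thin wrapper so callers can write compute_volume() as in the question. */
int compute_volume(void)
{
    return API_TABLE->compute_volume();
}
```

Because callers depend only on the table's location and layout, image II can be rebuilt and its functions can move without relinking image I, as long as the table stays where it is.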

Using pointer functions - 2 separate applications on 1 device

I asked some time ago this question How can I use one function from main application and bootloader? (embedded) and started to implement proposed solution but ran into a few problems.
On my Cortex-M4, I have 2 separate applications: a bootloader and a user application. I had some (many) functions which were the same for both apps, so I compiled them only for the bootloader and then created an array of function pointers at a specified address, which is known to the user application. In the application, I didn't compile the files with those functions again; I use those pointers whenever needed.
This is example of code I tried to make common for both applications:
static uint8_t m_var_1;
// Sends events to the application.
static void send_event(fs_op_t const * const p_op, fs_ret_t result)
{
uint8_t var_2;
[...]
}
My application ends in a HardFault, which happens e.g. when dividing by zero or calling through a function pointer with a NULL value. I am not sure why yet, but I started wondering what happens with those variables. var_2 will most surely be located on the stack, so this is no problem. But what about m_var_1? In the map file, it has a specified place in RAM. But I don't have separate RAM sections for app and bootloader. I am not sure, but I have a feeling that this variable may use the same RAM location as when created for the bootloader. Is this possible? Maybe some other issues?
Yes you are right, the code will attempt to access the global variable at the same location as it is linked for loader. This is because linking involves replacing all occurrences of identifiers (including function names and variable names) by the addresses determined after compiling.
In your application, the variable, even if it does exist there too, is likely to be at a different address.
The calling of the functions happens to work, because they are located in ROM and cannot be different for application and loader. Calling them via const pointers, which are also stored in ROM, bypasses the problem.
The solution is using a file system simulator, if you can find one for your hardware.
Otherwise you will hate having to do the following.
Part 1, setup:
introduce a special linker section with all the variables accessed by both system parts (application and loader)
let one linker fill it
set it up for the other linker as don't-touch
be careful with the initialisation
preferably do not assume any initialisation value
if you need initialisation, e.g. "bss" (init to 0) or "data" (init to specified value),
do so explicitly at the start of the system part which is not associated to the linker you let setup the variables
for safety, it is recommended to do the init the same way in both system parts
"data" init uses a special non-volatile linker section with a copy of the to-be-initialised variables, accessing that is possible
Part 2, access:
option 1)
store const pointers to those variables, like you did for the functions
option 2)
get the second linker (the other one, which did not do the actual setup of the common variable section) to create an identically structured and identically located section as the one from first linker; more studying of your linker needed here
Part 3, reusing values stored by other system part
(e.g. you want to leave some kind of message from loader, to be read by application)
design which system part initialises which variable, the other one only reads them
separate the common variables in four sections,
written and read by both system parts, initialised by both
written and read by x, only read by y, initialised by x
written and read by y, only read by x, initialised by y
written by both system parts, not initialised, uses checksums and plausibility checks,
if a variable has not been initialised, init to default
init each section only in the corresponding writer system part
setup as "no init" in the other linker
setup as "no init" in both linkers for the fourth case
use getters and setters with checksum update and plausibility for the fourth case
To do all that, intense study of your linker features and syntax is needed.
So I recommend not to try, if you can get around it. Consider using an existing file system simulator; because that is basically what above means.
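For the fourth case in Part 3 (variables written by both parts and never initialised by a linker), the getter/setter idea might look roughly like the sketch below. Everything here is an assumption for illustration: the struct shared_u32, the inverted-value checksum, the SHARED_MAX plausibility limit and the default value. On a real target the struct would sit in the shared "no init" linker section.

```c
#include <stdbool.h>
#include <stdint.h>

/* A shared variable together with its checksum. */
struct shared_u32 {
    uint32_t value;
    uint32_t check;
};

#define SHARED_DEFAULT 0u     /* value used when the stored data is invalid */
#define SHARED_MAX     1000u  /* plausibility limit (hypothetical) */

/* Deliberately simple checksum: bitwise inverse of the value. */
static uint32_t shared_checksum(uint32_t v)
{
    return ~v;
}

/* Setter: store the value and update the checksum together. */
void shared_set(struct shared_u32 *s, uint32_t v)
{
    s->value = v;
    s->check = shared_checksum(v);
}

/* Getter: returns true if the stored value passed both checks.
   Otherwise the variable is re-initialised to the default first. */
bool shared_get(struct shared_u32 *s, uint32_t *out)
{
    bool ok = (s->check == shared_checksum(s->value)) &&
              (s->value <= SHARED_MAX);
    if (!ok)
        shared_set(s, SHARED_DEFAULT);
    *out = s->value;
    return ok;
}
```

Random RAM content after power-up will almost never carry a matching checksum, so the first shared_get() in either system part falls back to the default instead of trusting garbage.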

Hook and Replace Export Function in the Loaded ELF ( .so shared library )

I'm writing some C code to hook some function of .so ELF (shared-library) loaded into memory.
My C code should be able to re-direct an export function of another .so library that was loaded into the app/program's memory.
Here's a bit of elaboration:
Android app will have multiple .so files loaded. My C code has to look through export function that belongs to another shared .so library (called target.so in this case)
This is not a regular dlsym approach because I don't just want the address of a function; I want to replace it with my own function, so that when another library makes the call to its own function, my hook_func gets called instead, and then from my hook_func I call the original_func.
For import functions this can work. But for export functions I'm not sure how to do it.
Import functions have the entries in the symbol table that have corresponding entry in relocation table that eventually gives the address of entry in global offset table (GOT).
But for the export functions, the symbol's st_value element itself has address of the procedure and not GOT address (correct me if I'm wrong).
How do I perform the hooking for the export function?
Theoretically speaking, I should get the memory location of the st_value element of dynamic symbol table entry ( Elf32_Sym ) of export function. If I get that location then I should be able to replace the value in that location with my hook_func's address. However, I'm not able to write into this location so far. I have to assume the dynamic symbol table's memory is read-only. If that is true then what is the workaround in that case?
Thanks a lot for reading and helping me out.
Update: LD_PRELOAD can only replace the original functions with my own, but then I'm not sure if there any way to call the originals.
In my case for example:
App initializes the audio engine by calling Audio_System_Create and passes a reference of AUDIO_SYSTEM object to Audio_System_Create(AUDIO_SYSTEM **);
AUDIO API allocates this struct/object and function returns.
Now if only I could access that AUDIO_SYSTEM object, I would easily attach a callback to this object and start receiving audio data.
Hence, my ultimate goal is to get the reference to the AUDIO_SYSTEM object; and in my understanding, I can only get that if I intercept the call where that object is first allocated through Audio_System_Create(AUDIO_SYSTEM **).
Currently there is no straight way to grab the output audio at android. (all examples talk about recording audio that comes from microphone only)
Update2:
As advised by Basile in his answer, I made use of dladdr() but strangely enough it gives me the same address as I pass to it.
Dl_info DlInfo; // filled in by dladdr(); declared in <dlfcn.h>
void *pFunc=procedure_addr; //procedure address calculated from the st_value of symbol from symbol table in ELF file (not from loaded file)
int nRet;
// Lookup the name of the function given the function pointer
if ((nRet = dladdr(pFunc, &DlInfo)) != 0)
{
LOGE("Symbol Name is: %s", DlInfo.dli_sname);
if(DlInfo.dli_saddr==NULL)
LOGE("Symbol Address is: NULL");
else
LOGE("Symbol Address is: %p", DlInfo.dli_saddr); // %p is the correct format for a pointer
}
else
LOGE("dladdr failed");
Here's the result I get:
entry_addr =0x75a28cfc
entry_addr_through_dlysm =0x75a28cfc
Symbol Name is: AUDIO_System_Create
Symbol Address is: 0x75a28cfc
Here the address obtained through dlsym or calculated from the ELF file is the address of the procedure, while I need the location where this address itself is stored, so that I can replace it with my hook_func address. dladdr() didn't do what I thought it would do.
You should read in detail Drepper's paper: how to write shared libraries - notably to understand why using LD_PRELOAD is not enough. You may want to study the source code of the dynamic linker (ld-linux.so) inside your libc. You might try to change with mprotect(2) and/or mmap(2) and/or mremap(2) the relevant pages. You can query the memory mapping thru proc(5) using /proc/self/maps & /proc/self/smaps. Then you could, in an architecture-specific way, replace the starting bytes (perhaps using asmjit or GNU lightning) of the code of original_func by a jump to your hook_func function (which you might need to change its epilogue, to put the overwritten instructions -originally at original_func- there...)
Things might be slightly easier if original_func is well known and always the same. You could then study its source and assembler code, and write the patching function and your hook_func only for it.
Perhaps using dladdr(3) might be helpful too (but probably not).
Alternatively, hack your dynamic linker to change it for your needs. You might study the source code of musl-libc
Notice that you probably need to overwrite the machine code at the address of original_func (as given by dlsym on "original_func"). Alternatively, you'll need to relocate every occurrence of calls to that function in all the already loaded shared objects (I believe it is harder; if you insist see dl_iterate_phdr(3)).
If you want a generic solution (for an arbitrary original_func) you'll need to implement some binary code analyzer (or disassembler) to patch that function. If you just want to hack a particular original_func you should disassemble it, and patch its machine code, and have your hook_func do the part of original_func that you have overwritten.
Such horrible and time consuming hacks (you'll need weeks to make it work) make me prefer using free software (since then, it is much simpler to patch the source of the shared library and recompile it).
Of course, all this isn't easy. You need to understand in detail what ELF shared objects are; see also elf(5) and read Levine's book: Linkers and Loaders
NB: Beware, if you are hacking against a proprietary library (e.g. unity3d), what you are trying to achieve might be illegal. Ask a lawyer. Technically, you are violating most abstractions provided by shared libraries. If possible, ask the author of the shared library to give help and perhaps implement some plugin machinery in it.
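As a small illustration of the mprotect(2) step mentioned above, the sketch below overwrites one pointer-sized slot that sits in a read-only page. It is a sketch under stated assumptions, not a hooking solution: patch_pointer_slot and make_readonly_slot are hypothetical helper names, the page is assumed to have been plain read-only beforehand (real code would look up the original protection in /proc/self/maps), and the demo page is created with mmap rather than taken from a loaded library. It does not show which slot needs patching for a given call to be redirected.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Temporarily make the page(s) holding *slot writable, store new_value,
   then restore read-only protection. Returns 0 on success, -1 on error.
   Assumption: the page was PROT_READ before the call. */
int patch_pointer_slot(void **slot, void *new_value)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    uintptr_t mask  = ~((uintptr_t)page - 1);
    uintptr_t start = (uintptr_t)slot & mask;
    uintptr_t end   = ((uintptr_t)slot + sizeof *slot + (uintptr_t)page - 1) & mask;
    size_t    len   = (size_t)(end - start);

    if (mprotect((void *)start, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *slot = new_value;
    return mprotect((void *)start, len, PROT_READ);
}

/* Demo helper: returns a slot located in a fresh read-only page and
   holding `initial`, or NULL on failure. */
void **make_readonly_slot(void *initial)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    void **slot = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return NULL;

    *slot = initial;
    if (mprotect(slot, (size_t)page, PROT_READ) != 0) {
        munmap(slot, (size_t)page);
        return NULL;
    }
    return slot;
}
```

A direct write to the slot returned by make_readonly_slot() would fault; going through patch_pointer_slot() succeeds because the protection is lifted only around the store.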

Dynamic relocation of code section

Just out of curiosity, I wonder if it is possible to relocate a piece of code during the execution of a program. For instance, I have a function, and this function should be replaced in memory each time after it has been executed. One idea that came to mind is to use self-modifying code to do that. According to some online resources, self-modifying code can be executed on Linux, but I am still not sure if such a dynamic relocation is possible. Does anyone have experience with that?
Yes dynamic relocation is definitely possible. However, you have to make sure that the code is completely self-contained, or that it accesses globals/external functions by absolute references. If your code can be completely position independent, meaning the only references it makes are relative to itself, you're set. Otherwise you will need to do the fixups yourself at loading time.
With GCC, you can use -fpic to generate position independent code. Passing -q or --emit-relocs to the linker will make it emit relocation information. The ELF specification (PDF link) has information about how to use that relocation information; if you're not using ELF, you'll have to find the appropriate documentation for your format.
As Carl says, it can be done, but you're opening a can of worms. In practice, the only people who take the trouble to do this are academics or malware authors (now donning my flame proof cloak).
You can copy some code into a malloc'd heap region, then call it via function pointers, but depending on the OS you may have to enable execution in the segment. You can try to copy some code into the code segment (taking care not to overwrite the following function), but the OS likely has made this segment read-only. You might want to look at the Linux kernel and see how it loads its modules.
If all these different functions exist at compile time then you could simply use a function pointer to keep track of the next one that is to be called. If you absolutely have to modify the function at runtime and that modification can't be done in place then you could also use a function pointer that is updated with address of the new function when it is created/loaded. The rest of your system would then call the self-modifying function through the function pointer and therefore doesn't have to know or care about the self-modifying code and you only have to do the fixup in one place.
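The function-pointer approach from the last paragraph can be sketched as follows. The names step_a, step_b, next_step and run_step are hypothetical; the two variants stand in for "the function before and after it has been replaced", and both are assumed to exist at compile time.

```c
typedef int (*step_fn)(int);

static int step_a(int x);
static int step_b(int x);

/* Single place that records which variant runs next. */
static step_fn next_step = step_a;

/* Each variant does its work and then selects its successor. */
static int step_a(int x)
{
    next_step = step_b;
    return x + 1;
}

static int step_b(int x)
{
    next_step = step_a;
    return x * 2;
}

/* The rest of the program calls only this wrapper and never needs to
   know which variant is currently installed. */
int run_step(int x)
{
    return next_step(x);
}
```

If a variant were generated or loaded at run time instead, only the assignment to next_step would change; callers of run_step() would stay the same.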

Resources