I'm trying to figure out how relocation works, but I can't seem to get my head around it.
This document describes the different types one can encounter when relocating an ELF file.
Let's take R_ARM_ALU_SB_G0_NC (#70) for example.
Type: static
Class: ARM, describes the type of place being relocated (which I do not understand)
Operation: ((S + A) | T) – B(S)
I'm guessing that the mathematical expression is the operation I'm looking for. However, I do not completely understand how this fits in my function.
The method where the relocation takes place looks as follows:
int elfloader_arch_relocate(int input_fd, struct elfloader_output *output,
unsigned int sectionoffset, char *sectionaddr, struct elf32_rela *rela, char *addr)
input_fd is a file descriptor for the ELF file, *output is used when writing the output segment, sectionoffset is the file offset at which the relocation can be found, *sectionaddr is the section start address (absolute runtime) and *addr is the relocated address.
The 32-bit relocation structure looks like this
struct elf32_rela {
elf32_addr r_offset;
elf32_word r_info;
elf32_sword r_addend;
};
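As a small aid for reading this structure: r_info packs both the symbol table index and the relocation type. The sketch below shows the standard ELF32 split (upper 24 bits are the symbol index, lower 8 bits are the type). The helper names rela_sym and rela_type are made up for illustration; <elf.h> on Linux offers the equivalent macros ELF32_R_SYM and ELF32_R_TYPE.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring ELF32_R_SYM / ELF32_R_TYPE. */

/* Symbol table index: upper 24 bits of r_info. */
static inline uint32_t rela_sym(uint32_t r_info)
{
    return r_info >> 8;
}

/* Relocation type: lower 8 bits of r_info (70 for R_ARM_ALU_SB_G0_NC). */
static inline uint32_t rela_type(uint32_t r_info)
{
    return r_info & 0xffu;
}
```

With these, a relocation function could switch on rela_type(rela->r_info) and look up the symbol at index rela_sym(rela->r_info) to decide whether T applies.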
On page 26 of the above mentioned document the nomenclature is explained:
S (when used on its own) is the address of the symbol.
A is the addend for the relocation.
T is 1 if the target symbol S has type STT_FUNC and the symbol addresses a Thumb instruction; it is 0 otherwise.
B(S) is the addressing origin of the output segment defining the symbol
So my question is, which of the parameters in the relocate function correspond to the ones used in the formula?
If I'm reading this right, and I'm not sure I am, it goes like this:
S is addr.
B(S) is sectionaddr.
A is rela->r_addend.
T may be derivable from information in rela->r_info; if not, I don't know where you need to look.
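To make that mapping concrete, here is a minimal sketch that only evaluates the expression ((S + A) | T) - B(S) from plain integers. The function name and parameter names are invented for illustration. It deliberately stops at the arithmetic: as far as I understand the spec, a real R_ARM_ALU_SB_G0_NC handler would still have to encode the first group (G0) of this value into the immediate field of the ADD/SUB instruction at the relocated place, which is not shown here.

```c
#include <stdint.h>

/*
 * Evaluate ((S + A) | T) - B(S) with 32-bit wraparound.
 *   s   : address of the symbol            (addr in the question)
 *   a   : addend                           (rela->r_addend)
 *   t   : 1 for a Thumb STT_FUNC symbol, else 0
 *   b_s : addressing origin of the segment (sectionaddr, by the guess above)
 */
uint32_t alu_sb_value(uint32_t s, int32_t a, uint32_t t, uint32_t b_s)
{
    return ((s + (uint32_t)a) | t) - b_s;
}
```

For example, a symbol at 0x20001000 with addend 8 in a segment whose origin is 0x20000000 gives an offset of 0x1008 from that origin.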
This is a really complicated-looking relocation. Consider starting with the simple ones (like R_ARM_ABS16). In a dynamic loader, you should not have to implement all of the relocation types in the spec you linked to, only a small subset. If it seems like you need a lot of relocation types, this is probably because you are trying to feed unlinked object files to the dynamic loader; you should turn them into shared objects, using your existing ARM linker. (With the GNU toolchain, a first approximation to how you do that is gcc -shared foo.o -o foo.so.)
Cribbing off an existing dynamic loader for the architecture is often a good plan; there tends to be a lot of undocumented wisdom buried in the code for such things. For instance, here is GNU libc's ld.so's ARM relocator. (LGPL)
I'm still struggling to understand relocation entries in relocatable object files. Let's say I have this simple C program:
//main1.c
void functionTest();
void functionTest(){
...
}
int main()
{
functionTest();
return 0;
}
my questions are:
Q1. Since everything is known to main1, there are no relocation entries in the .rel.text or .rel.data sections of main1.o. Is my understanding correct?
Q2. Below is a picture that illustrates how dynamic linking works.
For libc.so, everything is known (it has all the definitions, just like main1), so why are there still relocation entries in libc.so? I can understand that the symbol table information needs to be copied because it exists, but how can you copy something that doesn't exist?
Q3. Let's say the relocation entry structure is as follows:
typedef struct {
int offset; /* Offset of the reference to relocate */
int symbol:24, /* Symbol the reference should point to */
type:8; /* Relocation type */
} Elf32_Rel;
So my understanding is that there is already a relocation entry in main2.o for printf(): the offset will be something like 8 or 9 bytes from the caller function, the symbol will be 'printf', and the type is R_386_PC32. If another entry needs to be copied from libc.so to main2.o, what is the structure of that relocation entry?
Q1: Yes, if you compile the main1.c in your question, it will build without the need to link in anything, because it's not using functions that are defined elsewhere.
Q2: That diagram won't apply to building main1.c, because main1.c does not use external functions. But, in a program that does have a call to, say, printf(), here's what's going on: the diagram shows that relocation entries about libc.so are being placed into main2.o. You ask "why there is still relocation entries in libc.so?" but the relocation entries are not being put into libc.so; they are being put into main2.o, and they refer to things in libc.so.
Q2 follow-up #1: When you say "for libc.so, everything is known", that is true only within libc.so. Anything that uses a function defined in libc.so will not know how that function is defined until linking takes place. That's the function of ld: to copy reference info from a library like libc.so into a program being built, like main2 in the diagram. The reference info allows the loader that starts main2 to also load libc.so into memory in such a way that execution can flow from the main2 code over to the libc.so code and back to main2 wherever main2 calls a function whose definition / code resides in libc.so.
Q3: I think the best way to put it is this: The information that is used to populate the relocation structure within main2.o comes from libc.so. Where I say that relocation entries are copied from libc.so, that's what I mean: information about the target (e.g. printf()) is taken from libc.so and used to provide values for the relocation entry in main2.o whose purpose is to tell the loader where to load the code for printf() from.
Q3 follow-up #1: There is another sense in which libc.so has relocation entries: the thing that built libc.so added relocation entries to libc.so, so that anything that wants to use its (exportable) functions and variables can do so. These don't need to be copied anywhere. Part of building an object file is to create information for internal things that other programs might use; and, part of building a program is to populate the information about external things that it makes use of. But the diagram looks to me like it's only meant to show that information about libc.so and libvector.so are added to main2.o so that the loader can load all the needed code into memory when main2 is executed.
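Since Q3 mentions R_386_PC32, it may help to see what resolving such an entry amounts to. In the i386 ABI this type computes S + A - P, where S is the symbol's address, P is the address of the place being patched, and A is the addend; because Elf32_Rel has no addend field, A is read from the bytes already stored at the place. The sketch below is only an illustration: the function name and parameters are made up, and it assumes a little-endian host so that memcpy reads and writes the 32-bit field in target byte order.

```c
#include <stdint.h>
#include <string.h>

/*
 * Apply one R_386_PC32 relocation to a section held in memory.
 *   section      : bytes of the section being patched
 *   section_addr : address the section will have at run time
 *   offset       : the relocation entry's offset field
 *   sym_addr     : resolved address of the target symbol (S)
 */
void apply_r386_pc32(uint8_t *section, uint32_t section_addr,
                     uint32_t offset, uint32_t sym_addr)
{
    int32_t addend;                                /* A, stored in place */
    memcpy(&addend, section + offset, sizeof addend);

    uint32_t place = section_addr + offset;        /* P */
    uint32_t value = sym_addr + (uint32_t)addend - place;

    memcpy(section + offset, &value, sizeof value);
}
```

For a call instruction (opcode 0xE8 followed by a 4-byte displacement), compilers typically store an addend of -4 so that the result is relative to the end of the instruction, which is what the CPU expects.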
I'm trying to analyse a dynamically linked 64-bit ELF file using IDA Pro, and I found a segment with an extern type, located right after .bss, as follows:
extern:00000000006021C0 ; Segment type: Externs
extern:00000000006021C0 ; extern
extern:00000000006021C0 ; void free(void *ptr)
extern:00000000006021C0 extrn free:near ; DATA XREF: .got.plt:off_602018o
However, when I debug it at runtime using gdb, I find that this 'extern' segment contains ONLY ZEROES! There isn't any valid data other than zero in this segment. Also, there is no description of the permissions of this segment; it looks as if the segment doesn't even exist.
Since there is a DATA XREF in the GOT, maybe it has something to do with imported functions? But I couldn't find any relevant documentation. I wonder how IDA recognizes it, and what exactly it is.
Thanks!
extern is not a real segment. It is a pseudo segment created by IDA to represent symbols with unknown addresses in other modules; the GOT usually contains pointers to those. During debugging it probably gets covered by .bss or stack area cleared by the OS loader, that's why you see zeroes there.
extern in the context of IDA is a bit different than in the context of C/C++.
In C/C++, the extern keyword is used to declare a variable/function/object that is not actually defined in the current object but will be available by the time the binary is linked. This is for when you define an array in one .c file and access it in multiple files, for example.
In the context of IDA, the externs section is used to describe a memory area defining APIs from .so/.dll files. This is usually the IAT in a PE and the GOT in an ELF file. When an object in an externs section has a name of a known API, IDA will automatically color it pink and add the prototype if available.
I'm writing a little program that traces all the syscalls and calls of a binary file (ELF) using ptrace (singlestep, getregs, peektext, opcode comparison, etc.).
So far I've succeeded in tracing syscalls and simple calls such as user-defined functions.
But I failed to get the name of the printf symbol from the address I obtain via ptrace.
My question is: for dynamically linked functions such as printf, strlen, etc., how can I retrieve the name of the symbol in the ELF file from its address?
With simple calls it's fairly easy: I run through the .strtab section and, when an address matches, I return the corresponding string.
But for printf, the symbol is known in the .strtab but has the address "0".
objdump -d somehow succeeds in linking a call to printf with its address.
Do you have any idea?
I think you may need to read up a little more about dynamic linking. Let's take strlen as an example symbol as printf is a bit special (fortification stuff).
Your problem is (I think) that you want to take the address of a symbol and translate it back into a name. You're trying to do this by parsing the ELF file of the program you are debugging. This works for symbols that are in your program, but not for dynamically linked symbols such as strlen, and you want to know how to resolve those.
The reason is that the addresses of symbols such as strlen are not held within your ELF program. They are instead unresolved references that are resolved dynamically when the program loads. Indeed, modern Linux will (I believe) load dynamic libraries (which contain relocatable, aka position-independent, code) in a randomised order and at randomised addresses, so the location of those symbols won't be known until the program loads.
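To illustrate why a plain table scan finds user-defined functions but not strlen or printf, here is a rough sketch of an address-to-name lookup over an already-loaded symbol table. It assumes a 64-bit ELF and the Elf64_Sym layout from <elf.h>; the function name and the idea that the caller has already read the symbol and string tables into memory are assumptions for the example. Imported symbols are marked undefined (st_shndx == SHN_UNDEF) and typically carry a value of 0 in the file, so they are skipped.

```c
#include <elf.h>
#include <stddef.h>

/*
 * Return the name of the defined symbol whose value equals addr,
 * or NULL if none matches.
 *   syms   : symbol table entries read from the ELF file
 *   nsyms  : number of entries in syms
 *   strtab : the string table associated with that symbol table
 */
const char *name_for_address(const Elf64_Sym *syms, size_t nsyms,
                             const char *strtab, Elf64_Addr addr)
{
    for (size_t i = 0; i < nsyms; i++) {
        /* Undefined symbols are resolved at load time; the file holds
           no usable address for them. */
        if (syms[i].st_shndx == SHN_UNDEF)
            continue;
        if (syms[i].st_value == addr)
            return strtab + syms[i].st_name;
    }
    return NULL;
}
```

A tracer would therefore need some other source for the run-time addresses of library functions, for example the symbol tables of the loaded shared objects combined with their load addresses.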
For libraries that you have opened with dlopen() (i.e. where you are doing the loading yourself in the program), you can retrieve the address of such symbols using dlsym(); that's not much good if they are linked into the program at compile/link time.
With glibc, to map an address back to a symbol in general, use the GNU extension dladdr(). From the man page:
The function dladdr() takes a function pointer and tries to
resolve name and file where it is located. Information is
stored in the Dl_info structure:
typedef struct {
const char *dli_fname; /* Pathname of shared object that
contains address */
void *dli_fbase; /* Address at which shared object
is loaded */
const char *dli_sname; /* Name of nearest symbol with address
lower than addr */
void *dli_saddr; /* Exact address of symbol named
in dli_sname */
} Dl_info;
If no symbol matching addr could be found, then dli_sname and
dli_saddr are set to NULL.
dladdr() returns 0 on error, and nonzero on success.
I believe that will work for you.
For further information, I suggest you look at the source of ltrace, which traces library calls, and at how backtrace_symbols (and here) works; note that, particularly for non-global symbols, this is going to be unreliable, and note the comment about adding -rdynamic to the link line.
You might also want to look at addr2line and its source.
I'm using custom ELF sections in an autotools C project, similar to this thread: How do you get the start and end addresses of a custom ELF section in C (gcc)?. The problem is that the C files that declare the custom sections are linked into a static library, which is then linked into the final application.
In this configuration the symbols __start_custom_section and __stop_custom_section do not get generated. I define the elf section like this:
struct mystruct __attribute((__section__("custom_section"))) __attribute((__used__)) = {
...
};
If I link to the object file instead of the library the symbols get created and everything works as expected. This isn't a scalable solution though because I'd like new modules to just work by compiling them into the modules library. Any idea why the linker doesn't create these special symbols when the section exists in a library vs a single object file?
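For readers unfamiliar with the mechanism, the working single-object case can be sketched roughly as follows. This assumes GCC or Clang with an ELF linker such as GNU ld, which synthesises __start_<name> and __stop_<name> for sections whose names are valid C identifiers. The struct contents, the entry names, and the counting function are invented for the example.

```c
#include <stddef.h>

struct mystruct {
    int id;
};

/* Two hypothetical entries placed in the custom section. */
static struct mystruct entry_a
    __attribute__((used, section("custom_section"))) = { 1 };
static struct mystruct entry_b
    __attribute__((used, section("custom_section"))) = { 2 };

/* Provided by the linker when the section is present in the link. */
extern struct mystruct __start_custom_section[];
extern struct mystruct __stop_custom_section[];

/* Number of struct mystruct entries the linker collected. */
size_t custom_section_count(void)
{
    return (size_t)(__stop_custom_section - __start_custom_section);
}
```

Note that the bounds only exist if some object that contributes to custom_section actually ends up in the final link.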
I have done something similar to this recently, and my solution does not rely on any compiler specific implementations, internal undocumented symbols, etc. However, it does require a bit more work :)
Background
The ELF binary on disk can be loaded and parsed quite easily by knowing its format and using a couple of structures provided to us: http://linux.die.net/man/5/elf. You can iterate through each of its segments and sections (segments are containers for sections). If you do this, you can calculate the relative start/end virtual addresses of your section. By this logic, you would think that you can do the same thing at runtime by iterating through the segments and sections of the loaded, in-memory version of the ELF binary. But alas, you can only iterate through the segments themselves (via http://linux.die.net/man/3/dl_iterate_phdr), and all section metadata has been lost.
So, how can we retain the section metadata? Store it ourselves.
Solution
If you have a custom section named '.mycustom', then define a metadata struct that should at minimum store two numbers that will indicate the relative start address and the size of your '.mycustom' section. Create a global instance of this metadata struct that will live by itself in another custom section named '.mycustom_meta'.
Example:
typedef struct
{
unsigned long ulStart;
unsigned long ulSize;
} CustomSectionMeta;
__attribute((__section__(".mycustom_meta"))) CustomSectionMeta g_customSectionMeta = { 0, 0 };
You can see that our struct instance is initialized with zero for both start and size values. When you compile this code, your object file will contain a section named '.mycustom_meta' which will be 8 bytes in size for a 32-bit compilation (or 16 bytes for 64-bit), and the values will be all zeroes. Run objdump on it and you will see as much. Go ahead and put that into a static lib (.a) if you want, run readelf on it, and you will see exactly the same thing. Build it into a shared object (.so) if you want, run readelf on it, and again you will see the same thing. Build it into an executable program, run readelf on it, and voila, it's still there.
Now the trick comes in. You need to write a little executable (let's call it MetaWriter) that will update your ELF file on disk to fill in the start and size values. Here are the basic steps:
1. Open your ELF file (.o, .so, or executable) in binary mode and read it into a contiguous array. Or, you can mmap it into memory to achieve the same.
2. Read through the binary using header structures and instructions found in the ELF link I listed above.
3. Find your '.mycustom' section and read section.sh_addr and section.sh_size.
4. Find your '.mycustom_meta' section. Create an instance of CustomSectionMeta using the start and size values from step 3. memcpy() your struct over the top of the existing '.mycustom_meta' section data, which up to now was all zeroes.
5. Save your ELF data back to the original file. It should now be completely unmodified except for the few bytes you wrote into your '.mycustom_meta' section.
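The core of those steps, locating the two sections and overwriting the meta bytes, might look roughly like the sketch below. It is not the code I actually used; it assumes a 64-bit ELF already read into a writable buffer, a host with the same byte order as the target, and a hypothetical fixed-width struct Meta64 standing in for CustomSectionMeta. Reading the file in and writing it back out are left out.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-width stand-in for CustomSectionMeta (64-bit). */
typedef struct {
    uint64_t start;
    uint64_t size;
} Meta64;

/* Copy the header of the section called `name` into *out.
   Returns 0 on success, -1 if the image is not usable or has no such section. */
static int find_section(const unsigned char *image, size_t len,
                        const char *name, Elf64_Shdr *out)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (eh.e_shoff == 0 || eh.e_shoff > len ||
        eh.e_shstrndx >= eh.e_shnum ||
        (len - eh.e_shoff) / sizeof(Elf64_Shdr) < eh.e_shnum)
        return -1;

    /* Section header of the section-name string table. */
    Elf64_Shdr str;
    memcpy(&str, image + eh.e_shoff + eh.e_shstrndx * sizeof str, sizeof str);
    if (str.sh_offset > len || str.sh_size > len - str.sh_offset)
        return -1;
    const char *names = (const char *)image + str.sh_offset;

    for (unsigned i = 0; i < eh.e_shnum; i++) {
        Elf64_Shdr sh;
        memcpy(&sh, image + eh.e_shoff + i * sizeof sh, sizeof sh);
        if (sh.sh_name < str.sh_size &&
            strncmp(names + sh.sh_name, name, str.sh_size - sh.sh_name) == 0) {
            *out = sh;
            return 0;
        }
    }
    return -1;
}

/* Steps 3 and 4: record .mycustom's address and size in .mycustom_meta. */
int patch_meta(unsigned char *image, size_t len)
{
    Elf64_Shdr custom, meta;
    if (find_section(image, len, ".mycustom", &custom) != 0 ||
        find_section(image, len, ".mycustom_meta", &meta) != 0)
        return -1;

    Meta64 m = { custom.sh_addr, custom.sh_size };
    if (meta.sh_size < sizeof m || meta.sh_offset > len ||
        len - meta.sh_offset < sizeof m)
        return -1;

    memcpy(image + meta.sh_offset, &m, sizeof m);
    return 0;
}
```

One thing to keep in mind with this sketch: in an unlinked .o file sh_addr is normally 0, so the recorded start value only becomes meaningful once the tool is run on the linked .so or executable.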
What I did was execute this MetaWriter program as part of the build process in my Makefile. So, you would build your .so or executable, then run MetaWriter on it to fill in the meta section. After that, it's ready to go.
Now, when the code in your .so or executable runs, it can just read from g_customSectionMeta, which will be populated with the starting address offset of your '.mycustom' section, as well as its size, which can be used to easily calculate the end, of course. This start offset must be added to the base address of your loaded ELF binary. There are a couple of ways to get this, but the easiest way I found was to run dladdr on a symbol that I know to exist in the binary (such as g_customSectionMeta!) and use the resulting value of dli_fbase to know the base address of the module.
Example:
#include <dlfcn.h>
Dl_info dlInfo;
if (dladdr(&g_customSectionMeta, &dlInfo) != 0)
{
char * vpBase = (char *)dlInfo.dli_fbase; /* char *: arithmetic on void * is a GNU extension */
void * vpMyCustomStart = vpBase + g_customSectionMeta.ulStart;
void * vpMyCustomEnd = (char *)vpMyCustomStart + g_customSectionMeta.ulSize;
}
It would be a bit overboard to post the full amount of code required to do all this work, especially the parsing of the ELF binary in MetaWriter. However, if you need some help, feel free to reach out to me.
In my case, the variable was not referenced in the code and the section was optimised out in release mode (-O2). Adding the used attribute solved the issue. Example:
static const unsigned char unused_var[] __attribute__((used, section("foo"))) = {
0xCA, 0xFE, 0xBA, 0xBE
};
I've got a small static library (.a). In the static library is a pointer that points to a large, statically allocated, 1D array.
When I link my code to this library, the pointer's address is hardcoded in various locations, easily found through the disassembly. The issue is, I'd like my code to be able to have access to this array (the library is faulting, and I want to know why).
Naturally, it would be trivial to get that pointer by disassembling, hardcoding that address into my code, and then recompiling. That wouldn't be a problem except the library can be configured in different ways with other modules, and the array's pointer changes depending on what modules are linked in.
What are my options for getting that pointer? Because the starting state of the array is predictable, I could walk through memory, catching segfaults with a signal handler, until I found something that looks reasonable. Is there a better way?
Since your library is a .a archive, I'll assume you are on some kind of UNIX.
The global array should have a symbolic name associated with it. Your job would be easier or harder depending on what kind of symbol describes it.
If there is a global symbol describing this array, then you can just reference it directly, e.g.
extern char some_array[];
for (int i = 0; i < 100; i++) printf("%2d: 0x%02x\n", i, (unsigned char)some_array[i]);
If the symbol is local, then you can first globalize it with objcopy --globalize-symbol=some_array, then proceed as above.
So how can you determine what is the symbol describing that array? Run objdump -dr foo.o, where foo.o contains instructions which you know reference that array. The relocation that will appear next to the referring instruction will tell you the name.
Finally, run nm foo.o | grep some_array. If you see 00000XX D some_array, you are done -- the array is globally visible (same for B). If you see 000XX d some_array, you need to globalize it first (likewise for b).
Update:
The -dr to objdump didn't work
Right, because the symbol turned out to be local, the relocation probably referred to .bss + 0xNNN.
00000000006b5ec0 b grid
00000000006c8620 b grid
00000000006da4a0 b grid
00000000006ec320 b grid
00000000006fe1a0 b grid
You must have run nm on the final linked executable, not on individual foo.o objects inside your archive. There are five separate static arrays called grid in your binary; apparently only the first one is the one you care about.
declaring "extern int grid[];" and using it gives an undefined reference
That's expected for local symbols: the code in the library was something like:
// foo.c
static char grid[1000];
and you can't reference this grid from outside foo.o without globalizing the symbol first.
I'm not allowed to run a changed binary of the library on our server for security reasons
I hope you understand that that argument is total BS: if you can link your own code into that binary, then you can do anything on the server (subject to user-id restrictions); you are already trusted. Modifying a third-party library should be the least of the server admin's worries if he doesn't trust you.