I'm trying to implement some parts of what dyld does and I'm a little bit stuck at stub trampolines.
Consider the following ARM instruction:
BL 0x2fec
It branches with link (a subroutine call) to 0x2fec. I know that there is a section __symbolstub1 in the __TEXT segment starting at 0x2fd8, so this is a jump to offset 20 within __symbolstub1.
Now, there is a symbol
(undefined) external _objc_autoreleasePoolPush (from libobjc)
that I've resolved through the LC_SYMTAB load command. No address is provided for it. I know for a fact that address 0x2fec is a trampoline to _objc_autoreleasePoolPush, but I cannot prove it by any means.
I've checked the LC_DYLD_INFO_ONLY command and found a slight hint there, among the lazy_bind symbols:
{:offset=>20, :segment=>2, :library=>6, :flags=>[], :name=>"_objc_autoreleasePoolPush"}
where the name and offset match exactly what I have, and library #6 is "/usr/lib/libobjc.A.dylib", which is also perfect. The issue is that segment #2 is __TEXT, but __TEXT starts at 0x1000, while __symbolstub1 is way down at 0x2fd8. So I'm missing some reference that leads down to the section.
Any ideas on how I am supposed to map the virtual address 0x2fec to _objc_autoreleasePoolPush?
Heh, just a little more digging and I've found it at LC_DYSYMTAB's indirect symbols.
Now the long answer.
Find a section for given address;
The section should be of type S_NON_LAZY_SYMBOL_POINTERS, S_LAZY_SYMBOL_POINTERS, S_LAZY_DYLIB_SYMBOL_POINTERS, S_THREAD_LOCAL_VARIABLE_POINTERS or S_SYMBOL_STUBS;
If the section type is S_SYMBOL_STUBS, the byte size of each entry (bytesize) is stored in reserved2; otherwise it is taken to be 4 (the pointer size on a 32-bit target);
The offset into indirect symbols table is stored in reserved1;
The index into indirect symbols table is calculated as
index = sect.reserved1 + (vmaddr - sect.addr) / bytesize;
The symbol in the symbols table is found at symbols[indirect_symbols[index]].
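The steps above can be sketched in Python as follows. This is only an illustration: the Section record, the stub size of 4, the reserved1 value, and the contents of both tables are hypothetical. Only the section address 0x2fd8, the target 0x2fec, and the symbol name come from the question.

```python
from dataclasses import dataclass

# Section types from <mach-o/loader.h> (low byte of the section flags).
S_NON_LAZY_SYMBOL_POINTERS = 0x6
S_LAZY_SYMBOL_POINTERS = 0x7
S_SYMBOL_STUBS = 0x8
S_LAZY_DYLIB_SYMBOL_POINTERS = 0x10
S_THREAD_LOCAL_VARIABLE_POINTERS = 0x14
POINTER_TYPES = {
    S_NON_LAZY_SYMBOL_POINTERS, S_LAZY_SYMBOL_POINTERS,
    S_LAZY_DYLIB_SYMBOL_POINTERS, S_THREAD_LOCAL_VARIABLE_POINTERS,
}

@dataclass
class Section:          # hypothetical record, not a real parser type
    addr: int           # virtual address of the section
    size: int           # size of the section in bytes
    type: int           # section type
    reserved1: int      # first index into the indirect symbol table
    reserved2: int      # stub size in bytes (S_SYMBOL_STUBS only)

def symbol_for_address(vmaddr, sections, indirect_symbols, symbols, pointer_size=4):
    """Map a stub or pointer address to its symbol, or return None."""
    for sect in sections:
        if not (sect.addr <= vmaddr < sect.addr + sect.size):
            continue
        if sect.type == S_SYMBOL_STUBS:
            bytesize = sect.reserved2
        elif sect.type in POINTER_TYPES:
            bytesize = pointer_size
        else:
            return None  # section has no indirect symbols
        index = sect.reserved1 + (vmaddr - sect.addr) // bytesize
        return symbols[indirect_symbols[index]]
    return None

# Made-up tables: six 4-byte stubs starting at 0x2fd8.
stubs = Section(addr=0x2FD8, size=24, type=S_SYMBOL_STUBS, reserved1=0, reserved2=4)
symbols = ["_sym0", "_sym1", "_objc_autoreleasePoolPush", "_sym3", "_sym4", "_sym5"]
indirect_symbols = [0, 1, 3, 4, 5, 2]
print(symbol_for_address(0x2FEC, [stubs], indirect_symbols, symbols))
```

A real implementation would also have to skip the special indirect table values INDIRECT_SYMBOL_LOCAL and INDIRECT_SYMBOL_ABS, which this sketch ignores.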
I'm currently reading Computer Systems: A Programmer's Perspective, chapter 7 (Linking).
It covers references, symbols, and entries. The book says that an entry holds the definition of a symbol, and my understanding is: "Every symbol has an entry, and the entry has a symbol reference, like a pointer, which actually holds some address".
Therefore, whenever I read code that involves a global variable or a function/procedure, I can regard each of them as a corresponding entry that holds a symbol reference and other information.
Is my understanding right? Can I keep going with it? I really want to understand everything about computer systems and the techniques related to programming.
One final question: is the symbol table in the .symtab section the same as the relocation entry table?
Please avoid associating symbol with entry. Most linkers reserve the term entry for the entry point of the whole linked program, i.e. the address of the first instruction executed when the program starts. I prefer the term records for items arranged in a table, for instance the symbol table.
When you create a procedure or function in your program, it will be loaded at a certain address in memory when the program runs. You don't know exactly where (at which numeric address) the procedure will be located at run time, which is why you give that address a symbolic name (label). That is what a symbol is: a human-readable name for a certain position in the program (address symbol) or for a constant value (scalar symbol).
You can call the procedure or refer to it at write time using its symbolic name: CALL MyProcedure or MOV register,MyProcedure. Again, the final value of the MyProcedure address is not known yet, so the compiler temporarily puts 0 into the instruction body in place of this address and creates a relocation record in the relocation table. Each such record specifies 1) the position of the temporary 0 inside the instruction body, and 2) the target symbol, in the form of an index into the symbol table.
Global symbols such as MyProcedure should be unique in the program, but they may be referred to many times, and each reference creates a record in the relocation table.
The relation between symbol and relocation is not 1:1.
When the linker has enough information to decide the final address of each symbol, it goes through the relocation table and replaces each temporary 0 in the code with the symbol's final address.
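As a rough illustration of that last step, here is a toy relocation pass in Python. Everything in it is assumed for the example: the opcode bytes 0xAA and 0xBB are made up, addresses are 32-bit little-endian, and the final address 0x00401000 is arbitrary. It shows one symbol referenced twice, hence two relocation records.

```python
import struct

def apply_relocations(code, relocations, symbol_addresses):
    """Return a copy of code with every placeholder replaced.

    relocations is a list of (offset, symbol_index) records;
    symbol_addresses[i] is the final address of symbol i.
    """
    patched = bytearray(code)
    for offset, symbol_index in relocations:
        # Overwrite the temporary 0 with the symbol's final address.
        struct.pack_into("<I", patched, offset, symbol_addresses[symbol_index])
    return bytes(patched)

# Two made-up instructions, each one opcode byte plus a 4-byte zero placeholder.
code = bytes([0xAA, 0, 0, 0, 0, 0xBB, 0, 0, 0, 0])
symbol_names = ["MyProcedure"]        # symbol table with one global symbol
symbol_addresses = [0x00401000]       # address chosen by the toy linker
relocations = [(1, 0), (6, 0)]        # both placeholders refer to symbol 0

patched = apply_relocations(code, relocations, symbol_addresses)
print(patched.hex())
```

The symbol table has one record while the relocation table has two, which is the 1:N relation described above.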
Here is my question. Suppose you want to compile the c code:
void some_function() {
    write_string("Hello, World!\n");
}
For this example, I want to focus specifically on the string: "Hello, World!\n". My understanding is that the compiler will put the string into the .rodata section of an ELF file. A symbol referring to its location in the .rodata section is added to the symbol table, and that symbol is kept in the .text section as a placeholder for the location of the string.
Here is the problem. How can you leave a value like that unresolved in machine code? On x86, it should be easy enough for the linker to do a find and replace on the symbol once the location is known. However, there are many CPU architectures where an address cannot be encoded in its entirety in a single machine instruction. The value would therefore have to be loaded in two stages, using separate machine instructions, and the linker would have to figure that out. It would have to be smart enough to manipulate the machine code with half the address in one place and the other half in another. Furthermore, the ELF file somehow has to represent this complex encoding scheme for the linker to use later on. How does this all work?
In most programs, this will be in a user-space application, so the kernel may load the .rodata section wherever it wants in memory. It would seem that when the program is loaded, the kernel loader somehow has to resolve all these symbols at run time, before execution begins. It would have to inject into the machine code where it put each section so that they can be referenced appropriately. How does this work?
I have a feeling that my understanding and the descriptions above are wrong, or that I am missing something very important, because this does not seem right to me. Either that, or modern kernels and linkers really do contain the logic to perform these complex functions. I am looking for some further explanation and understanding.
Compilation takes place, emitting something like this:
lea rdi, [rip+some_function.hello_world]
mov rax, [rip+some_function.write_string]
call rax
After the asm pass, we end up with something that disassembles to
lea rdi, [rip+00000000]
mov rax, [rip+00000000]
call rax
where the two 00000000 slots are filled as load-time fixups. The loader performs symbol resolution and fills in the 00000000 values with the correct values.
This is a simplification. In reality there's an extra layer of indirection called the global offset table, which is used (among other things) to put all the fixups right next to each other.
The innards of how this works are CPU and OS specific, but in general you don't really have to care exactly how it works, and it could change in the next release of the compiler (and has changed at least twice already). The loader understands fixups at a very generic level using a fixup table, and can deal with new ideas so long as they boil down to putting the (absolute or relative) address of a symbol into a slot given by offset + size.
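To make the "offset + size" idea concrete, here is a small Python sketch of a generic fixup routine. It is not how any particular loader is written. The function name, the little-endian encoding, and the rule that a relative fixup is measured from the end of the slot (which matches a RIP-relative displacement placed last in an instruction) are assumptions made for the example, as are all addresses.

```python
def apply_fixup(image, image_base, offset, size, symbol_address, relative=False):
    """Write a symbol's address into image[offset:offset + size].

    relative=False stores the absolute address.
    relative=True stores the signed distance from the end of the slot.
    """
    if relative:
        value = symbol_address - (image_base + offset + size)
    else:
        value = symbol_address
    # to_bytes raises OverflowError if the value does not fit in the slot.
    image[offset:offset + size] = value.to_bytes(size, "little", signed=relative)

image = bytearray(16)   # stand-in for a loaded section, all zero placeholders
# 4-byte relative slot at offset 3, image loaded at 0x1000, symbol at 0x1100.
apply_fixup(image, 0x1000, 3, 4, 0x1100, relative=True)
# 8-byte absolute slot at offset 8 (a pointer-sized table entry).
apply_fixup(image, 0x1000, 8, 8, 0x7FFF12345678)
print(image.hex())
```

The loader needs no knowledge of the surrounding instruction, only where the slot is, how wide it is, and which kind of value goes in it.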
The Alpha processor had it kind of bad back in the day. Fixups had to be in between functions, and relative addressing could be only done in signed 16 bit sizes, so the fixups for functions were located immediately before or after each function, and presumably you got an error in the ASM pass if the pointer didn't fit because the function was too big. I did come up with a clever sequence that would have fixed the problem on Alpha, but that was long after the platform was retired, and nobody cares anymore so it never got implemented.
I remember the bad old days from before the loader could do good patchups. There once was a global (and I really do mean global) table of shared library load addresses; the compiler emitted absolute addresses, and you had to rebuild your application if you changed a library, even though you used shared libraries. That just wasn't the brightest of ideas, and no wonder people kept statically linked emergency binaries lying around. Breaking libc wasn't fun.
Let's assume the following:
There's a jump, or a reference to data, whose target address is encoded in 2 bytes. Now, when linking statically, relocation produces a new address that does not fit in 2 bytes; maybe it needs 4 bytes.
I assume the linker will rewrite the code, possibly using a different instruction, and use 4 bytes for the new address.
Does the linker then need to update the size of the current segment/section and shift all subsequent addresses by the same offset (+2 bytes in this example)?
Machine instructions that refer to external symbols cannot use the abbreviated form, in which the displacement or immediate operand is encoded in one byte (and extended at run time) instead of a full word.
Linkers are not smart enough to reassemble segments that have already been assembled (at least the one that I wrote isn't :-)
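A linker that does not rewrite code can only check whether the resolved value fits the field the assembler reserved, and give up if it does not (GNU ld, for instance, reports "relocation truncated to fit" in that situation). The Python sketch below illustrates such a check. The function name and the error text are made up for the example.

```python
def encode_displacement(value, size):
    """Encode a signed displacement in exactly size bytes (little-endian).

    The field width was fixed when the code was assembled, so a value
    that does not fit is an error rather than a reason to grow the code.
    """
    try:
        return value.to_bytes(size, "little", signed=True)
    except OverflowError:
        raise ValueError(
            f"relocation does not fit in {size} bytes: {value:#x}"
        ) from None

print(encode_displacement(0x1234, 2).hex())    # fits in the 2-byte field
print(encode_displacement(0x12345, 4).hex())   # fits only in a 4-byte field
```

Calling encode_displacement(0x12345, 2) raises ValueError, which corresponds to the link failing instead of the section being resized.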
I'm trying to understand the linking stage of the C toolchain. I wrote a sample program and dissected the resulting object file. While this helped me get a better understanding of the processes involved, some things remain unclear to me.
Here are:
My (blazingly simple) sample program
Relevant parts of the object disassembly
The object's symbol table
The object's relocation table
Part 1: Handling of initialized variables.
Is it correct that these relocation table entries...
RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE   VALUE
0000002b dir32  .data
00000035 dir32  .data
0000003f dir32  .data
... are basically telling the linker that the addresses stored at offsets 2b, 35 and 3f from .text are not absolute addresses, but relative addresses (= offsets) in relation to .data? It is my understanding that this enables the linker to
either convert these relative addresses to absolute addresses when creating a non-relocatable object file,
or just adjust them accordingly in case the object file gets linked with some other object file.
Part 2: Handling of uninitialized variables.
I don't understand why uninitialized variables are handled so differently from initialized variables. Why are the register addresses stored in the opcode
equal for all the uninitialized variables (0x0, 0x0 and 0x0), while being
different for all the initialized variables (0x0, 0x4 and 0x8)?
Also, the value field of their relocation table entries is entirely unclear to me. I would have expected the .bss section to be referenced there.
RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE   VALUE
0000000d dir32  _var1_zeroed-0x00000004
00000017 dir32  _var2_zeroed-0x00000004
00000021 dir32  _var3_zeroed-0x00000004
... are basically telling the linker, that the addresses stored at offset ...
No, the linker is no longer involved with this. The relocation tables tell the loader, the part of the operating system that's responsible for loading the executable image into memory, about the addresses.
The linker builds the executable image on the assumption that everything is ideal and the image can be loaded at the intended address. If that's the case then everything is hunky-dory and nothing needs to be done. If there's a conflict, however, because the virtual address space is already in use by something else, then the image needs to be relocated to a different address.
That requires addresses to be patched: the offset between the ideal and the actual load address needs to be added. So if the .data section ends up at another address, then the addresses at .text+0x2b, .text+0x35, etcetera, must be changed. It is no different for the uninitialized variables: the linker already picked an address for them, but when _var1_zeroed-0x00000004 ends up at another address, then .text+0x0d, .text+0x17, etcetera, need to be changed.
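The patching described here amounts to adding one delta to every address slot listed in the relocation table. The Python sketch below shows that under stated assumptions: 32-bit little-endian addresses, a made-up preferred base of 0x00400000, a made-up actual base of 0x00500000, and .data placed 0x3000 bytes into the image. Only the slot offsets 0x2b, 0x35 and 0x3f come from the question.

```python
import struct

def rebase(text, reloc_offsets, preferred_base, actual_base):
    """Add (actual_base - preferred_base) to each 32-bit address slot."""
    delta = actual_base - preferred_base
    for offset in reloc_offsets:
        (address,) = struct.unpack_from("<I", text, offset)
        struct.pack_into("<I", text, offset, (address + delta) & 0xFFFFFFFF)

PREFERRED_BASE = 0x00400000   # hypothetical address the linker assumed
ACTUAL_BASE = 0x00500000      # hypothetical address the loader actually got
DATA_RVA = 0x3000             # hypothetical position of .data in the image

text = bytearray(0x48)        # stand-in for the .text section bytes
slots = [0x2B, 0x35, 0x3F]    # offsets listed in the relocation records
for slot, var_offset in zip(slots, (0x0, 0x4, 0x8)):
    # What the linker wrote: the address of each variable at the preferred base.
    struct.pack_into("<I", text, slot, PREFERRED_BASE + DATA_RVA + var_offset)

rebase(text, slots, PREFERRED_BASE, ACTUAL_BASE)
print([hex(struct.unpack_from("<I", text, s)[0]) for s in slots])
```

If the image lands at its preferred base the delta is zero and every slot is left as the linker wrote it.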
In U-Boot for the S3C24X0 (ARM920T), we use the following instructions to jump to the C part:
ldr pc, _start_armboot
_start_armboot: .word start_armboot
But how can I know the value of start_armboot? I couldn't find when or where the address value of start_armboot is defined. It doesn't exist in the .lds file, either. Or is it that, because of
_start_armboot: .word start_armboot
we put start_armboot in memory directly after the current position? Then how is this instruction/address associated with the C function "void start_armboot(void)"?
_start_armboot: .word start_armboot just means to put the address of the symbol start_armboot at that location.
The linker is responsible for filling it with the correct address at link time.
Internally, start_armboot is just a stub filled with some dummy value (usually zero) when it is compiled into an object file. Later, when all the object files have been gathered together, the linker starts putting pieces together. Once all the pieces are laid out, it goes back through the object files and fills in the stubs since the symbol locations are known to the linker now.