A shared object, such as glibc, when compiled appropriately, defines many symbols, such as main_arena, that are not normally used by other programs (although they can be seen in objdump and gdb), but are defined, with their addresses, as local symbols:
$ objdump -t ../.glibc/glibc_2.30_no-tcache/libc.so.6 | grep main_arena
00000000003b4b60 l O .data 0000000000000898 main_arena
Yet, when I reference one of these in C (via extern), and attempt to link, the linker can't find it:
$ gcc -g -Og -no-pie -Wl,-rpath ../.glibc/glibc_2.30_no-tcache/ -Wl,--dynamic-linker=../.glibc/glibc_2.30_no-tcache/ld.so.2 s1.c -o s1
/usr/bin/ld: /tmp/ccjKyCNh.o: in function `printf':
/usr/include/x86_64-linux-gnu/bits/stdio2.h:112: undefined reference to `main_arena'
/usr/bin/ld: /usr/include/x86_64-linux-gnu/bits/stdio2.h:112: undefined reference to `main_arena'
collect2: error: ld returned 1 exit status
Note: I've updated this question with extensive research:
This is by design:
c language, global symbol, local symbol clarification "local (static): local symbols that are defined and referenced exclusively by module m.... These symbols are visible anywhere within module m, but cannot be referenced by other modules."
See also "Symbol Visibility
Symbols can be categorized as local or global. Local symbols can not be referenced from an object other than the object that contains the symbol definition." https://docs.oracle.com/cd/E26505_01/html/E26506/chapter2-90421.html
and https://reverseengineering.stackexchange.com/questions/14895/why-are-symbols-with-local-binding-present-in-the-symbol-table-of-my-elf-files and http://web.cse.ohio-state.edu/~reeves.92/CSE2421au12/SlidesDay52.pdf
Nonetheless, for debugging, exploration, and reverse engineering, it's sometimes desirable to reference an external local symbol defined in a shared object. All the information is there, as evidenced by gdb's ability to display it; it's simply a flag that tells ld to not resolve symbols to it.
Given such, is it possible to tell ld to ignore the local flag, and resolve to the symbol anyway?
For example:
$ objdump -t ../.glibc/glibc_2.30_no-tcache/libc.so.6 | grep -E ' malloc$| main_arena$'
00000000003b4b60 l O .data 0000000000000898 main_arena
0000000000083500 g F .text 0000000000000213 malloc
$ man objdump 2>/dev/null | grep -A10 'flag characters'
The flag characters are divided into 7 groups as follows:
"l"
"g"
"u"
"!" The symbol is a local (l), global (g), unique global (u), neither global nor local (a space) or both global and
local (!). ...
I'd like to be able to write code that, for debugging and reverse engineering, references the symbol main_arena regardless. How can I do this?
Update
I've read Employed Russian's excellent posts on related topics, and seen his reference to the XY Problem. With that in mind, let me ask my question X:
For exploratory purposes, I'd like to be able to look at the behavior of things like main_arena, and other malloc internals, as I use malloc and free. I can do this with gdb. But I'd like to do this programmatically, in C. One way to do this might have been to actually link to these symbols (question Y), but there's no reason to assume that's the best way, the only way, or even a viable way. Given that:
How can I inspect the value of local symbols in a shared library from within a different program, without having to drop to gdb?
Given such, is it possible to tell ld to ignore the local flag, and resolve to the symbol anyway?
No.
All the information is there, as evidenced by gdb's ability to display it; it's simply a flag that tells ld to not resolve symbols to it.
You are mistaken. While the symbol is present in the static symbol table (in the .symtab section), it is not present in the dynamic symbol table (in the .dynsym section). It is not just a matter of a flag: fundamental parts needed to perform dynamic linking at runtime are missing.
You can confirm this by looking in readelf --dyn-syms .../libc.so.6 | grep main_arena -- the symbol will not be there.
You could binary patch the "flag", changing STB_LOCAL binding of the symbol in .symtab to STB_GLOBAL. After you do that, the symbol will show as g in the objdump output, but the linker will still not be able to use it.
P.S. You should never use objdump to examine ELF binaries -- it's highly deficient for that purpose. Use readelf instead.
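The .symtab versus .dynsym distinction described above can also be checked from C. The following is only a minimal sketch, assuming a 64-bit ELF file in the host's byte order on Linux with <elf.h> available; elf64_has_symbol is a hypothetical helper name, not part of any library.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `name` appears in a symbol table of the
 * given section type (SHT_SYMTAB or SHT_DYNSYM) of a 64-bit ELF file,
 * 0 if it does not, and -1 if the file cannot be read or is not 64-bit ELF.
 * Assumption: the file uses the same byte order as the host. */
int elf64_has_symbol(const char *path, uint32_t table_type, const char *name)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size < (off_t)sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }

    size_t size = (size_t)st.st_size;
    unsigned char *base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    int result = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;

    /* Accept only 64-bit ELF whose section header table lies inside the file. */
    int ok = memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
             eh->e_ident[EI_CLASS] == ELFCLASS64 &&
             eh->e_shentsize == sizeof(Elf64_Shdr) &&
             eh->e_shoff <= size &&
             (size - eh->e_shoff) / sizeof(Elf64_Shdr) >= eh->e_shnum;

    if (ok) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);
        size_t len = strlen(name);
        result = 0;

        for (unsigned i = 0; i < eh->e_shnum && result == 0; i++) {
            if (sh[i].sh_type != table_type || sh[i].sh_link >= eh->e_shnum)
                continue;

            /* sh_link of a symbol table names its string table section. */
            const Elf64_Shdr *str = &sh[sh[i].sh_link];
            if (sh[i].sh_offset > size || sh[i].sh_size > size - sh[i].sh_offset ||
                str->sh_offset > size || str->sh_size > size - str->sh_offset)
                continue;

            const Elf64_Sym *sym = (const Elf64_Sym *)(base + sh[i].sh_offset);
            const char *strtab = (const char *)(base + str->sh_offset);
            size_t count = sh[i].sh_size / sizeof(Elf64_Sym);

            for (size_t j = 0; j < count; j++) {
                /* Compare including the terminating NUL, staying in bounds. */
                if (sym[j].st_name < str->sh_size &&
                    str->sh_size - sym[j].st_name > len &&
                    memcmp(strtab + sym[j].st_name, name, len + 1) == 0) {
                    result = 1;
                    break;
                }
            }
        }
    }

    munmap(base, size);
    return result;
}
```

If the answer above holds for the library in the question, calling this on ../.glibc/glibc_2.30_no-tcache/libc.so.6 with "main_arena" should find the name under SHT_SYMTAB but not under SHT_DYNSYM. This only reports presence; it does nothing to make the symbol linkable.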
Update:
How does GDB find ...
By reading .symtab section.
Is there a way I can tell ld to do something similar?
No. The linker could easily read the .symtab section as well, and can link the binary that imports the main_arena symbol in the same way it imports e.g. stdout.
But such a binary will not run.
At runtime, as soon as the binary is loaded, the loader (ld.so) will need to resolve the reference to main_arena. And since the symbol is not present in the dynamic symbol table (which is the only symbol table ld.so can use), the symbol resolution will fail and ld.so will exit with a fatal error.
This is precisely the same thing as linking a.out against foo.so with int foo defined, and then running that a.out against a different version of foo.so, one without foo in it.
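The runtime half of this can be observed with dlsym, which asks the loader to resolve a name using the same dynamic symbol tables. This is a minimal sketch, assuming a dynamically linked program on a system that provides <dlfcn.h> (older glibc versions may need -ldl at link time); loader_can_resolve is a hypothetical helper name.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the dynamic loader can resolve `name`
 * in the main program or its loaded dependencies, 0 if it cannot,
 * and -1 if the handle for the running program could not be obtained.
 * Simplification: a symbol whose value is NULL is reported as not found. */
int loader_can_resolve(const char *name)
{
    /* A NULL filename yields a handle for the main program itself. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return -1;

    void *addr = dlsym(self, name);
    int found = (addr != NULL);

    dlclose(self);
    return found;
}
```

With the glibc from the question, one would expect loader_can_resolve("malloc") to succeed and loader_can_resolve("main_arena") to fail, since only the former is in .dynsym.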
Update 2:
Is that simply a feature that ld lacks (because it's not needed outside of reverse engineering and other nonstandard use cases), or is it inherently not possible?
It's a feature that both ld (the static linker) and ld.so (the dynamic loader) lack.
It's possible to do (GDB can resolve these symbols after all), but a lot of work, for very little gain.
Could one possibly augment ld to use the regular .symtab (I understand it would be slower due to lack of hashes)?
Like I said, you would need to modify both ld and ld.so. The latter is part of GLIBC, and modifying GLIBC has complications. Making any mistakes in the process can easily render your system un-bootable.
And if you are going to modify GLIBC anyway, it would likely be much simpler to expose all the symbols you want (make them non-local). That way you only need to change GLIBC, and can use standard ld and the rest of standard symbol resolution mechanisms.
Recently, I have been studying the relocation types in the program linking process for arm32 target.
I have tested some little programs to produce different relocation types to analyze.
And I found some of the relocation types are difficult to produce, such as R_ARM_ABS16, R_ARM_ABS12, R_ARM_THM_ABS5 and R_ARM_ABS8.
I have tried many times and none of them can be produced. I also tried to analyze the source code of binutils (version 2.26), but I could find no clues as to how these types are generated in the relocation method elf32_arm_final_link_relocate() in file elf32-arm.c. Or maybe I am just not familiar with the source code and missed some points.
Does someone know what are these four relocation types for? How can I produce them? Any suggestion is welcomed.
By the way, in the document link below, there are relocation type descriptions for arm32 target.
All my relocation knowledge comes from this document.
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf
I have an issue with an ELF file generated by the GNU linker ld.
The result is that the data section (.data) gets corrupted when the executable is loaded into memory. The corruption to the .data section occurs when the loader performs the relocation on the .eh_frame section using the relocation data (.rela.eh_frame).
What happens is that this relocation causes seven writes that are beyond the .eh_frame section and over-write the correct contents of the .data section which is adjacent to the top of the .eh_frame section.
After some investigation, I believe the loader is behaving correctly, but the ELF file it has been given contains an error.
But I could be wrong and wanted to check what I've found so far.
Using readelf on the ELF file, it can be seen that seven of the entries in the .rela.eh_frame section contain offsets that are outside (above) the range given by readelf for the .eh_frame section, i.e. the seven offsets in .rela.eh_frame are greater than the length given for .eh_frame. When these seven offsets are applied in the relocation, they corrupt the .data section.
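The check described above can be written down as a small C function. This is only a sketch under stated assumptions: the offsets are section-relative (as r_offset is in a relocatable object; in an executable or shared object r_offset is a virtual address and would have to be compared against the section's address range instead), and field_size, the number of bytes the relocation writes, is supplied by the caller. count_out_of_range is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts relocation offsets whose patched field would
 * extend past the end of the section they apply to.
 * Assumptions: offsets are relative to the start of the section, and
 * field_size is the number of bytes each relocation writes (e.g. 4). */
size_t count_out_of_range(const uint64_t *offsets, size_t n,
                          uint64_t section_size, uint64_t field_size)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        /* Written this way to avoid overflow in offsets[i] + field_size. */
        if (field_size > section_size ||
            offsets[i] > section_size - field_size)
            bad++;
    }
    return bad;
}
```

Feeding it the offsets printed by readelf -r for .rela.eh_frame and the size readelf -S reports for .eh_frame would give the count of suspect entries, seven in the case described here.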
So my questions are:
(1) Is my deduction correct that relocation offsets should not be greater than the length of the section to which they apply, and that the generated ELF file is therefore in error?
(2) What are people's opinions on the best way of proceeding to diagnose the cause of the incorrect ELF file? Are there any options to ld that will help, or any options that will remove/fix the .eh_frame and its relocation counterpart .rela.eh_frame?
(3) How would I discover what linker script is being used when the ELF file is generated?
(4) Is there a specific forum where I might find a whole pile of linker experts who would be able to help? I appreciate this is a highly technical question and that many people may not have a clue what I'm talking about!
Thanks for any help!
The .eh_frame section is not supposed to have any run-time relocations. All offsets are fixed when the link editor is run (because the object layout is completely known at this point) and the ET_EXEC or ET_DYN object is created. Only ET_REL objects have relocations in that section, and those are never seen by the dynamic linker. So something odd must be going on.
You can ask such questions on the binutils list or the libc-help list (if you use the GNU toolchain).
EDIT It seems that you are using a toolchain configured for ZCX exceptions with a target which expects SJLJ exceptions. AdaCore has some documentation about this:
GNAT User's Guide Supplement for Cross Platforms 19.0w documentation » VxWorks Topics
Zero Cost Exceptions on PowerPC Targets
It doesn't quite say how to switch to the SJLJ-based VxWorks 5 toolchain. It is definitely not a matter of using the correct linker script. The choice of exception handling style affects code generation, too.
I am getting a "relocation truncated to fit" error for my embedded ARM application compiled and linked with GCC 4.9.3. I am using code relocation for this function from external flash (0x70000000) to internal RAM (0x08000000) to improve performance of my application, and this is one of the causes of the problem.
I have a small inline-assembly naked function to perform a short loop:
void ThreeCycleDelay(uint32_t count) __attribute__((naked))
{
__asm(" subs r0, #1\n"
" bne ThreeCycleDelay\n"
" bx lr");
}
But when linking, I receive the following error from ld:
D:/app/app.a(app_utils.obj):(.ARM.exidx.text.ThreeCycleDelay+0x0):
relocation truncated to fit: R_ARM_PREL31 against
`.text.ThreeCycleDelay'
I have seen suggestions on the internet to solve this issue, but none of them were helpful. Trying to "remove" .ARM.exidx section by -funwind-tables -fno-exceptions made no difference.
The error disappears when I perform no code relocation, and it does not show for any other function. Removing the __attribute__((naked)) does not solve the issue either, so I suspected it is linked with the inline assembly jump, but the real question is: how can I solve this issue?
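One way to see why the two addresses matter: R_ARM_PREL31 stores a 31-bit signed place-relative offset, so it can only reach about 1 GiB in either direction. The sketch below is an illustration, not a diagnosis; it assumes the .ARM.exidx entry stays in flash near 0x70000000 while the function is placed in RAM near 0x08000000, and fits_signed and pc_relative are hypothetical helper names.

```c
#include <stdint.h>

/* Returns 1 if `value` is representable as a `bits`-bit two's complement
 * number (valid for 1 <= bits <= 63), 0 otherwise. */
int fits_signed(int64_t value, unsigned bits)
{
    int64_t limit = (int64_t)1 << (bits - 1);
    return value >= -limit && value < limit;
}

/* Place-relative displacement: target address minus the address of the
 * field being relocated. */
int64_t pc_relative(uint64_t target, uint64_t place)
{
    return (int64_t)(target - place);
}
```

Under those assumptions, fits_signed(pc_relative(0x08000000, 0x70000000), 31) is 0, because the displacement of -0x68000000 lies outside the range from -0x40000000 to 0x3FFFFFFF, whereas a target within the same flash region passes the check.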
I was compiling/linking my program
i386-gcc -o output.lnx func.opc mainc.opc
and I kept getting that error. I honestly have no idea what this means.
Any clue?
thanks,
This is usually a symptom of having too much code or data in the program. The relocation at offset 7 in .text segment (code) has been compiled with a fixed size (2 or 4), but the data/instruction it is referring to is more than 64k or 2G away.
Other than that, I can't tell you how to fix it without actually seeing the object files. Useful tools for pinpointing the problem are objdump (with flags -dr) and readelf programs.