GNU linker fails to make a relocatable module with calls to absolute addresses

I work on an MC68360 platform using GNU development tools.
What I need is a relocatable executable module that can make calls to absolute addresses,
i.e. to functions that are already in memory (ROM).
I can't get the GNU linker to do this.
The call site in the application is at a relocatable address,
while the address of the called function is absolute.
The end result is a relocatable address.
What I have done so far:
I extract the global functions from the ROM image and put them in a file, say rom_functions.S. This file looks like this:
.text
.globl sqrt
.equ sqrt, 0x<abs addr>
A check with readelf on rom_functions.o confirms that all symbols are absolute and that there is no relocation table.
rom_functions.o is then linked with the application into a relocatable module using the following command line:
ld -d -r -Rrom_functions.o -uappl_start -Tmyscript #$objs -o appl.rel appl.o
I use -R to include and preserve the absolute addresses, which I assume is the purpose of this option, though I may have misinterpreted it. I have also tried -R<rom.img>, but the result is similar: the called function's address is made relocatable in the output, so at load time it is adjusted by the load address and the call does not reach the desired function.
Is there a way to achieve what I want: a relocatable module with calls to absolute addresses?

Related

addr2line with archive files

I am trying to use addr2line with an archive file, libdpdk.a.
I have a backtrace:
backtrace returned: 7
0: 0x46fd05 ./build/ip_pipeline(bt+0x25) [0x46fd05]
1: 0x42a163 ./build/ip_pipeline() [0x42a163]
2: 0x46ff21 ./build/ip_pipeline(rte_eal_init+0x171) [0x46ff21]
3: 0x439629 ./build/ip_pipeline(app_init+0x709) [0x439629]
4: 0x42b3ff ./build/ip_pipeline(main+0x5f) [0x42b3ff]
5: 0x7f101166b830 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f101166b830]
6: 0x42d009 ./build/ip_pipeline(_start+0x29) [0x42d009]
I tried the following command:
addr2line 0x46fd05 -f -e ../../build/lib/librte_eal.a
addr2line: ../../build/lib/librte_eal.a: cannot get addresses from archive
The expected output is the name of the function in the backtrace at the address I pass, e.g. 0x46fd05. Currently there is no symbol name associated with this address.
Any suggestions?
I have compiled the code using -rdynamic.
Putting aside the reason for choosing .a/.so, 'addr2line' should be used with the binary that was executed, because the backtrace addresses are specific to that binary.
Code from the same static (.a) library will usually end up at different addresses in different binaries. The same is true of '.so' files (especially position-independent code), although in many cases Linux will attempt to reuse already mapped '.so' files, so that the actual addresses are the same.
Bottom line, from the man page: use the executable name.
--exe=filename
Specify the name of the executable for which addresses should be translated.
The default file is a.out.
A practical note: when using '.so' files, run addr2line on a system that has the same executable, shared objects, and LD_LIBRARY_PATH. If the '.so' files differ between your development and production systems, the addresses may not match.
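As a rough sketch of the advice above, the following Python helper picks the frames that belong to the executable out of a backtrace in the format shown in the question and builds the matching addr2line command line. The function names frames_for and addr2line_command are made up for this illustration, and the regular expression assumes exactly the "N: 0xADDR module(symbol+off) [0xADDR]" layout from the question.

```python
import re

# Matches lines such as:
#   "2: 0x46ff21 ./build/ip_pipeline(rte_eal_init+0x171) [0x46ff21]"
# group(1) is the address, group(2) is the module the frame belongs to.
FRAME_RE = re.compile(r"^\s*\d+:\s+(0x[0-9a-fA-F]+)\s+(\S+?)\(")


def frames_for(backtrace_text, executable):
    """Return the addresses of all frames that belong to `executable`.

    Frames from other modules (e.g. libc.so.6) are skipped, because their
    addresses are meaningless when resolved against the executable.
    """
    addrs = []
    for line in backtrace_text.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(2) == executable:
            addrs.append(m.group(1))
    return addrs


def addr2line_command(executable, addrs):
    """Build an addr2line invocation against the executable, not the .a file."""
    return ["addr2line", "-f", "-e", executable] + list(addrs)
```

The resulting list can be passed to subprocess.run() on a machine that has the same executable that produced the backtrace.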

How to print all the undefined function calls, along with the file name, from a shared object?

I am trying to print all the undefined function calls from a shared object file, along with the file name.
I tried the "nm" command. It prints all the undefined function calls, but I could not get the file name.
Example:
bash$ nm -u my_test.so
:
U _ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
:
Environment: Ubuntu 18.04, x86 architecture (Intel processor)
Study in detail the specification of the DWARF format (the format used for debugging information on Linux). You could then extract the information by parsing the DWARF inside your ELF binary, though that is not exactly simple.
Consider looking at the source code of Ian Taylor's libbacktrace, which does this extraction of file names from DWARF inside ELF.
Perhaps your real problem is getting precise backtrace information, in which case libbacktrace is exactly what you need!
You might also use gdb: it is extensible and scriptable in Python (or Guile), so you could write your own specialized script.
Perhaps you would solve your real problem better with a GCC plugin that runs when you compile your code.
Read Drepper's "How To Write Shared Libraries" and read more about ELF.
You could, for example, collect all the undefined symbols in your shared library using nm (or readelf). A second script would then find the occurrences of these in your source code. It could be a simple awk script (or a shell for loop using grep), or something as sophisticated as a GCC plugin.
Your example shows (probably) a mangled C++ name. You could use nm -C to get it demangled, and later write a GCC plugin to find all the GIMPLE CALL instructions using it.
Writing a GCC plugin may take some time, in particular if you are not familiar with GCC internals.
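The simpler "nm plus a second script" route could look roughly like the Python sketch below. It is only an illustration: the function names undefined_symbols and find_references are invented here, the parser assumes plain (non-demangled) nm -u output with one "U name" entry per line, and the textual search only works for names that appear literally in the sources (C functions, not mangled C++ names).

```python
import re


def undefined_symbols(nm_output):
    """Extract symbol names from the text printed by `nm -u`.

    Only two-column lines of the form "U name" are considered; a version
    suffix such as "@@GLIBC_2.2.5" is stripped from the name.
    """
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "U":
            symbols.append(parts[1].split("@", 1)[0])
    return symbols


def find_references(symbols, source_files):
    """Map each symbol to the (file, line number) pairs that mention it.

    `source_files` is a dict of {path: file contents}, so the caller decides
    how the sources are read.
    """
    hits = {}
    for path, text in source_files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for sym in symbols:
                # Whole-word match so that "read" does not match "pthread".
                if re.search(r"\b" + re.escape(sym) + r"\b", line):
                    hits.setdefault(sym, []).append((path, lineno))
    return hits
```

In practice the nm text would come from running nm -u on the shared object, and source_files would be filled by reading the project's source files from disk.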

How to check the values of a struct from an image/binary file?

Is there any way I can look at the values of a structure after compilation? objdump -td gives the function definitions and only the address where the structure is stored. The problem is that I am getting a wrong address for one of the threads/functions in a structure when I run the program. The target MCU is an LPC1347 (ARM Cortex-M3).
objdump parses object files (products of the compiler), which are relocatable (not executable) ELF files. At this stage, there is no such notion as the memory address these compiled pieces will run at.
You have the following possibilities:
Link your *.obj files into the final non-stripped (-g passed to compiler) executable ELF image and parse it using readelf.
Generate the linker map file by adding -Wl,-Map,file.map to your LDFLAGS and see the output sections and addresses your data is located at in the map file.
Use a debugger/gdb.

binutils - kernel - "_binary" meaning?

I am reading xv6 lectures.
I have a file named initcode.S that is to be linked in the kernel.
Two symbols are then declared this way:
extern char _binary_initcode_start[], _binary_initcode_size[];
inside a function.
The lecture says :
As part of the kernel build process, the linker embeds that binary in the kernel and defines two special symbols, _binary_initcode_start and _binary_initcode_size, indicating the location and size of the binary.
I understand that binutils provides the address and the size of this assembled code.
I wonder about the notation: is it a default? My searches didn't show that clearly.
_binary -> it was originally assembly code
_initcode -> the name of my file
_start -> the parameter I am interested in.
That would imply that any assembled code would have those symbols too.
I have no proof of that, though.
The question is:
Is _binary_myAsmFileHere_myParameterhere the default naming scheme binutils gives an assembly file to export its address, size and so on?
Could someone tell me whether my assumption is right and, better still, what the rule is?
Thanks
Strangely enough, it doesn't seem to be documented in the ld manual. However, man objcopy does say this:
You can access this binary data inside a program by referencing the
special symbols that are created by the conversion process. These
symbols are called _binary_objfile_start, _binary_objfile_end and
_binary_objfile_size. e.g. you can transform a picture file into an object file and then access it in your code using these symbols.
Apparently the same logic is used by ld when embedding binary files.
Notice that the Makefile for xv6 contains this line for linking the kernel:
$(LD) $(LDFLAGS) -T kernel.ld -o kernel entry.o $(OBJS) -b binary initcode entryother
As you can see, it uses -b binary to embed the files initcode and entryother, so the above symbols will be defined during this process.
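To make the naming rule concrete, here is a small Python sketch that predicts the three symbol names for a given input file. It assumes, as far as I understand the behaviour of the binary input format, that the name is the file path as given on the command line with every non-alphanumeric character replaced by an underscore; the helper name binary_symbols is invented for this example.

```python
import re


def binary_symbols(input_path):
    """Predict the symbols created when `input_path` is embedded via -b binary.

    Assumption: every character of the path that is not an ASCII letter or
    digit is turned into '_' before the _binary_ prefix and the
    _start/_end/_size suffixes are added.
    """
    stem = re.sub(r"[^0-9A-Za-z]", "_", input_path)
    return tuple(f"_binary_{stem}_{suffix}" for suffix in ("start", "end", "size"))
```

For the xv6 case, binary_symbols("initcode") yields the _binary_initcode_start and _binary_initcode_size names used in the C declaration, plus _binary_initcode_end. The prefix therefore comes from embedding a raw binary file, not from the file having been written in assembly.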
When a .global variable is defined in an assembly file, a C file can only reference it if both sides agree on the symbol name. On targets where the C compiler prepends a '_' to identifiers, the names must account for that prefix so the linker can 'link' the name in the C file with the name in the assembly file.

Re-export Shared Library Symbols from Other Library (OS X / POSIX)

My question is fairly OS X on x86-64 specific but a universal solution that works on other POSIX OSes is even more appreciated.
Given a list of symbol names of some shared library (called the original library in the following), I want my shared library to re-export these symbols. Re-export as in: if someone tries to resolve the symbol against my library, I either provide my version of this symbol or (if my library doesn't have this symbol) forward to the original library's symbol.
I don't know the types of the symbols, I only know whether they are functions (type T in nm output) or other symbols (type S in nm output).
For functions, I already have a solution: for every function I want to re-export, I generate an assembly stub that dynamically resolves the symbol (using dlsym()) and then jumps into the resolved function with the very same environment (registers rdi, rsi, rdx, rcx, r8, r9, stack pointer, ...). I'm basically generating universal proxy functions. Using some macro trickery, these can be generated fairly easily without writing code for each and every symbol.
For non-function symbols the problem seems to be harder: I cannot generate this universal proxy function, because the resolving party never calls a function.
Using a constructor function static void init(void) __attribute__((constructor)); I can execute code whenever someone loads my library, that would be a good point to resolve and re-export all non-function symbols if that's possible.
In other words, I'd like to write the symbol table of my library to point to the respective symbols of another shared library. Doing the rewriting at compile or run time is okay (run time preferred). Or put yet another way, the behaviour of DYLD_INSERT_LIBRARIES (LD_PRELOAD) is exactly what I need but I don't want to insert a new library, I want to replace one (in the file system). EDIT: The reason I don't want/can't use DYLD_INSERT_LIBRARIES or any other environment variable of the DYLD_* family is that they are ignored for code signed, restricted, ... binaries.
I'm aware of the -reexport-l, -reexport_library and -reexported_symbols_list linker flags but I could not get them to work, especially when my library is a "replacement" for frameworks that are part of umbrella frameworks (example: /System/Library/Frameworks/CoreServices.framework/Frameworks/SearchKit.framework/SearchKit) because ld forbids to link directly against parts of umbrella frameworks.
EDIT: Because I explained it somewhat ambiguously: I can't change the way the actual program is linked. The goal is to produce a shared library that is a replacement for the original library. (Apparently called filter library.)
I have now found a solution (OS X specific): clang -o replacement-lib.dylib ... -Xlinker -reexport_library PATH_TO_ORIGINAL_LIB does the trick. PATH_TO_ORIGINAL_LIB could for example be /System/Library/Frameworks/CoreServices.framework/Frameworks/SearchKit.framework/Versions/Current/SearchKit.
If PATH_TO_ORIGINAL_LIB is a library that is part of an umbrella framework (as in the example above), then replace PATH_TO_ORIGINAL_LIB with the path of some other library (I created a library empty.dylib for that) and as a second step run
install_name_tool -change /usr/local/lib/empty.dylib PATH_TO_ORIGINAL_LIB replacement-lib.dylib
To see if the actual reexporting worked use:
otool -l replacement-lib.dylib | grep -A2 LC_REEXPORT_DYLIB
The output should look like
cmd LC_REEXPORT_DYLIB
cmdsize XX
name empty.dylib (offset YY)
After running install_name_tool it should look like
cmd LC_REEXPORT_DYLIB
cmdsize XX
name /System/Library/Frameworks/CoreServices.framework/Frameworks/SearchKit.framework/Versions/Current/SearchKit (offset YY)
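If this check needs to be scripted, the otool -l text can be scanned for LC_REEXPORT_DYLIB load commands. The Python sketch below is an illustration only: the function name reexported_dylibs is hypothetical, and the parser assumes the "cmd ... / cmdsize ... / name ... (offset N)" layout shown above.

```python
def reexported_dylibs(otool_output):
    """Return the install names of all LC_REEXPORT_DYLIB load commands.

    `otool_output` is the text printed by `otool -l <library>`.
    """
    names = []
    in_reexport = False
    for raw in otool_output.splitlines():
        line = raw.strip()
        if line.startswith("cmd "):
            # A new load command starts; remember whether it is a re-export.
            in_reexport = (line == "cmd LC_REEXPORT_DYLIB")
        elif in_reexport and line.startswith("name "):
            # Drop the leading "name " and the trailing " (offset N)".
            name = line[len("name "):]
            names.append(name.rsplit(" (offset", 1)[0])
            in_reexport = False
    return names
```

An empty result means the library does not re-export anything, so either the -reexport_library flag did not take effect or the wrong file was inspected.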
You could link against both libraries and use the link order to make sure to link against the right symbols. This works on both OS X and Linux:
cc -o executable -lmylib -loriglib
Here origlib is the original library and mylib contains the symbols that are supposed to override symbols in origlib. The executable will then be linked against your symbols from mylib first, and all remaining unresolved symbols will be linked against origlib.
This works in the same way when linking against OS X frameworks. Just link against your library that replaces symbols first and against the framework after.
cc -o executable -lmylib -framework SomeFramework
Edit: If you just want to replace symbols at runtime, you can use LD_PRELOAD on Linux, or its OS X counterpart DYLD_INSERT_LIBRARIES, in the same way:
cc -o executable -framework SomeFramework
DYLD_INSERT_LIBRARIES=libmylib.dylib ./executable

Resources