Is there a way to unhide hidden-visibility symbols with GNU binutils?

I'm working on a script to make uClibc usable on an existing glibc-targeted gcc/binutils toolchain, and the one problem I'm left with is that pthread_cancel needs to dlopen libgcc_s.so.1. The version supplied with the host gcc is linked to depend on glibc, so I'm instead using ld's -u option to pull in the needed symbols (and their dependencies) from libgcc_eh.a to make a replacement libgcc_s.so.1:
gcc -specs uclibc.specs -Wl,-u,_Unwind_Resume -Wl,-u,__gcc_personality_v0 \
-Wl,-u,_Unwind_ForcedUnwind -Wl,-u,_Unwind_GetCFA -shared -o libgcc_s.so.1
In principle I would be done, but all the symbols in libgcc_eh.a have their visibility set to hidden, so in the output .so file, they all become local and don't get added to the .dynsym symbol table.
I'm looking for a way to use binutils (perhaps objcopy? or a linker script?) on either the .so file or the original .o files in libgcc_eh.a to un-hide these symbols. Is this possible?

objcopy doesn't seem to have this feature, but you can do it with the ELFkickers rebind tool:
rebind --visibility default file.o SYMBOLS...
This must be done on the original .o files. If you try to do it on the .so it'll be too late, because the hidden symbols will have been omitted from the .dynsym section.

I think you should be able to use --globalize-symbol in objcopy.
e.g.
$ nm /usr/lib/gcc/i686-redhat-linux/4.6.3/libgcc_eh.a | grep emutls_alloc
00000000 t emutls_alloc
$ objcopy --globalize-symbol=emutls_alloc /usr/lib/gcc/i686-redhat-linux/4.6.3/libgcc_eh.a /tmp/libgcc_eh.a
$ nm /tmp/libgcc_eh.a |grep emutls_alloc
00000000 T emutls_alloc
You can provide --globalize-symbol several times to objcopy, but you'll need to explicitly mention the full symbol name of all the symbols you want to globalize.
I was initially unsure what kind of breakage could occur when turning libgcc_eh.a into a shared object, since I assumed libgcc_eh.a was compiled without -fpic/-fPIC. It turns out libgcc_eh.a is compiled as position-independent code.
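When comparing the two answers above, it may help to remember that ELF stores a symbol's binding (local/global/weak) and its visibility (default/hidden/...) in separate fields of the symbol table entry: the binding is the high nibble of st_info and the visibility is the low two bits of st_other. The nm output above shows a binding change (t to T); whether a given tool also resets hidden visibility is worth checking with readelf -s. Below is a minimal sketch of how those two fields decode. The function names are made up for illustration, and the code only works on raw field values; it does not parse an object file.

```python
from __future__ import annotations

# ELF symbol binding and visibility constants (values from the ELF specification).
STB_LOCAL, STB_GLOBAL, STB_WEAK = 0, 1, 2
STV_DEFAULT, STV_INTERNAL, STV_HIDDEN, STV_PROTECTED = 0, 1, 2, 3

BINDINGS = {STB_LOCAL: "LOCAL", STB_GLOBAL: "GLOBAL", STB_WEAK: "WEAK"}
VISIBILITIES = {
    STV_DEFAULT: "DEFAULT",
    STV_INTERNAL: "INTERNAL",
    STV_HIDDEN: "HIDDEN",
    STV_PROTECTED: "PROTECTED",
}


def describe_symbol(st_info: int, st_other: int) -> tuple[str, str]:
    """Return (binding, visibility) names for one symbol table entry."""
    binding = st_info >> 4        # high nibble of st_info
    visibility = st_other & 0x3   # low two bits of st_other
    return BINDINGS.get(binding, "OTHER"), VISIBILITIES[visibility]


def with_default_visibility(st_other: int) -> int:
    """Clear the visibility bits, leaving any other st_other bits untouched."""
    return st_other & ~0x3


if __name__ == "__main__":
    # A global function symbol (st_info 0x12) with hidden visibility (st_other 0x02).
    print(describe_symbol(0x12, 0x02))
    # The same symbol after its visibility bits are cleared.
    print(describe_symbol(0x12, with_default_visibility(0x02)))
```

A hidden global symbol and a local symbol both end up absent from .dynsym after linking, but they are different states in the .o file, which is why "globalizing" and "un-hiding" are not necessarily the same operation.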

Related

GCC/ld not using shared object with -l

I've read this question: ld cannot find shared library even with -L specified, but I'm asking a follow-up: why does GCC do this?
This is something I ran into while building a binary with links to two in-house libraries.
gcc cannot find the symbols from one of the libraries with -l, but uses the other one just fine!
Originally, the command from my Makefile was gcc baz.o qux.o -lfoo -lbar, with the linker unable to find the symbols from libfoo, while finding the symbols from -lbar. The libraries are the exact same type of file in the same locations: headers in /usr/local/include and libraries in /usr/local/lib. In fact, libfoo depends on libbar.
Corrected, the command is now gcc baz.o qux.o -lbar /usr/local/lib/libfoo.so. I have determined this is not an ordering issue.
Why does gcc need the shared object instead of an -l? Is there a better way to do this, other than using the absolute path? This solution seems kludgy to me.
The exact code and output I'm using are as follows:
Output from using -lsandbox
The content of libsandbox.h
The Makefile I'm referencing, annotated.
Thanks!

How do I strip symbols only from dependent libraries?

I'd like to ship libfoo.a, which is composed of foo.o--which in turn depends on libVendorBar.a and libVendorZoo.a.
When I link and generate my libfoo.a I notice that symbols in libVendor*.a are still public and visible for potential client applications to link against.
Due to many reasons outside of my control, I absolutely do not want 3rd party clients to be able to directly link against the vendor libraries.
How do I force gcc to resolve all libVendor symbols for libfoo and discard them, so that only symbols from libfoo are visible?
I'm not using any LD_FLAGS currently and everything is statically linked.
Unfortunately, static libraries do not have an equivalent of the -fvisibility=hidden option used for shared libraries. You can achieve what you need with more work, though:
First, link all the necessary code into foo.o:
ld -r foo.o -Lpath/to/vendor/libs -lVendorBar -lVendorZoo -o foo_linked.o
This allows you to ship libfoo.a without the vendor libs (the vendor symbols are still present in it).
Unfortunately, you can't simply remove the vendor symbols from the library symtab (e.g. via objcopy -L and strip --strip-symbol), because the linker will need them for relocation processing during the final executable link. But you can at least rename them to something unreadable:
# SYMS_TO_HIDE is a placeholder: a space-separated list of the symbols you want to hide
for sym in $SYMS_TO_HIDE; do
id=$(echo "$sym" | md5sum | awk '{print $1}')
objcopy --redefine-sym "$sym=f_$id" foo_linked.o
done
Note, however, that this won't stop a motivated user from reverse engineering the vendor's code.
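The renaming loop above invokes objcopy once per symbol. As a rough alternative sketch, the same old-to-new mapping can be generated in one go and handed to objcopy's --redefine-syms=FILE option, which reads "old new" pairs, one per line. The script name mkmap.py, the map file name and the example symbol names are all hypothetical. The hash is taken over the symbol name plus a trailing newline so the result matches what echo piped into md5sum produces.

```python
import hashlib
import sys


def obfuscated_name(sym: str) -> str:
    """Mirror `echo $sym | md5sum`: echo appends a newline before hashing."""
    digest = hashlib.md5((sym + "\n").encode()).hexdigest()
    return "f_" + digest


def redefine_syms_map(symbols) -> str:
    """Build the text of a map file with one 'old new' pair per line."""
    return "".join(f"{sym} {obfuscated_name(sym)}\n" for sym in symbols)


if __name__ == "__main__":
    # Symbols to rename are taken from the command line.
    sys.stdout.write(redefine_syms_map(sys.argv[1:]))
```

Assuming the script is saved as mkmap.py, usage would look something like: python3 mkmap.py vendor_init vendor_free > syms.map, followed by objcopy --redefine-syms=syms.map foo_linked.o. Keeping the map file around also gives you a record for translating obfuscated names back when debugging.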

Compiling a static executable [duplicate]

I'm trying to compile an executable (ELF file) that does not use a dynamic loader. I built a cross compiler that targets MIPS from Linux, to be used on a simulator I made. I passed the -static-libgcc flag when compiling my hello.cpp file (a hello world program). Apparently this is not enough, though, because there is still a segment in my executable which contains the name/path of the dynamic loader. What flags do I use to generate an executable which contains EVERYTHING needed to run? Do I need to rebuild my cross compiler?
Use the following flags for linking
-static -static-libgcc -static-libstdc++
Use these three flags to link against the static versions of all dependencies (assuming gcc). Note that in certain situations you don't necessarily need all three flags, but they don't "hurt" either, so just turn on all three.
Check if it actually worked
Make sure that there is really no dynamic linkage
ldd yourexecutable
should return "not a dynamic executable" or something equivalent.
Make sure that there are no unresolved symbols left
nm yourexecutable | grep " U "
The list should be empty or should contain only some special kernel-space symbols like
U __tls_get_addr
Finally, check if you can actually execute your executable
Try using the -static flag?

Is there an option to GNU ld to omit -dynamic-linker (PT_INTERP) completely?

I'm experimenting with the concept of pure-static-linked PIE executables on Linux, but running into the problem that the GNU binutils linker insists on adding a PT_INTERP header to the output binary when -pie is used, even when also given -static. Is there any way to inhibit this behavior? That is, is there a way to tell GNU ld specifically not to write certain headers to the output file? Perhaps with a linker script?
(Please don't answer with claims that it won't work; I'm well aware that the program still needs relocation processing - load-address-relative relocations only due to my use of -Bsymbolic - and I have special startup code in place of the standard Scrt1.o to handle this. But I can't get it to be invoked without the dynamic linker already kicking in and doing the work unless I hex-edit the PT_INTERP header out of the binary.)
Maybe I'm being naïve, but... wouldn't it suffice to find the default linker script, edit it, and remove the line that links in the .interp section?
For example, on my machine the scripts are in /usr/lib/ldscripts and the line in question is .interp : { *(.interp) } in the SECTIONS section.
You can dump the default script by running the following command:
$ ld --verbose ${YOUR_LD_FLAGS} | \
gawk 'BEGIN { s = 0 } { if ($0 ~ /^=/) s = !s; else if (s == 1) print; }'
You can modify the gawk script slightly to remove the .interp line (or just use grep -v) and use the resulting script to link your program.
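As a sketch of the same filtering step without gawk, the following script keeps only the text between the lines of equals signs that ld --verbose prints around its built-in script, and can optionally drop the line that collects *(.interp). The function name and the script name are invented for this example, and it assumes the .interp output section sits on a single line, as it does in the default scripts mentioned above.

```python
import sys


def extract_default_script(verbose_output: str, drop_interp: bool = False) -> str:
    """Return the linker script embedded in `ld --verbose` output.

    Lines starting with '=' toggle whether we are inside the script,
    exactly like the gawk one-liner; the delimiter lines are not kept.
    """
    inside = False
    kept = []
    for line in verbose_output.splitlines():
        if line.startswith("="):
            inside = not inside
            continue
        if not inside:
            continue
        if drop_interp and "*(.interp)" in line:
            continue  # skip the output section that pulls in .interp
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Read the verbose output from stdin, write the edited script to stdout.
    sys.stdout.write(extract_default_script(sys.stdin.read(), drop_interp=True))
```

Assuming it is saved as noninterp.py, something like ld --verbose ${YOUR_LD_FLAGS} | python3 noninterp.py > custom.x would produce a script that can then be passed to the link with -Wl,-T,custom.x.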
I think I might have found a solution: simply using -shared instead of -pie to make pie binaries. You need a few extra linker options to patch up the behavior, but it seems to avoid the need for a custom linker script. Or in other words, the -shared linker script is already essentially correct for linking static pie binaries.
If I get it working with this, I'll update the answer with the exact command line I'm using.
Update: It works! Here's the command line:
gcc -shared -static-libgcc -Wl,-static -Wl,-Bsymbolic \
-nostartfiles -fPIE Zcrt1.s Zcrt2.c /usr/lib/crti.o hello.c /usr/lib/crtn.o
where Zcrt1.s is a modified version of Scrt1.s that calls a function in Zcrt2.c before doing its normal work, and the code in Zcrt2.c processes the aux vector just past the argv and environment arrays to find the DYNAMIC section, then loops over the relocation tables and applies all the relative-type relocations (the only ones that should exist).
Now all of this can (with a little work) be wrapped up into a script or gcc specfile...
Expanding on my earlier note as this doesn't fit in that puny box (and this is just as an idea or discussion, please do not feel obligated to accept or reward bounty), perhaps the easiest and cleanest way of doing this is to just add a post-build step to strip the PT_INTERP header from the resulting binary?
Even easier than manually editing the headers and potentially having to shift everything around is to just replace PT_INTERP with PT_NULL. I don't know whether you can find a way of simply patching the file via existing tools (some sort of scriptable hex find and replace) or if you'll have to write a small program to do that. I do know that libbfd (the GNU Binary File Descriptor library) might be your friend in the latter case, as it'll make that entire business a lot easier.
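To make the PT_INTERP to PT_NULL idea concrete, here is a rough sketch of such a small patching program, written without libbfd. It relies only on the standard ELF header layout: the program header table offset, entry size and entry count are read from the ELF header, and the first 4 bytes of each program header hold p_type. The function name is made up, and this has not been tried against the binaries discussed here, so treat it as a starting point rather than a finished tool.

```python
import struct
import sys

PT_NULL, PT_INTERP = 0, 3


def null_out_interp(image: bytearray) -> int:
    """Rewrite every PT_INTERP program header type to PT_NULL, in place.

    Returns the number of program headers that were changed.
    """
    if bytes(image[:4]) != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = image[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if image[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big
    if is_64:
        (phoff,) = struct.unpack_from(endian + "Q", image, 0x20)
        phentsize, phnum = struct.unpack_from(endian + "HH", image, 0x36)
    else:
        (phoff,) = struct.unpack_from(endian + "I", image, 0x1C)
        phentsize, phnum = struct.unpack_from(endian + "HH", image, 0x2A)

    changed = 0
    for i in range(phnum):
        off = phoff + i * phentsize
        (p_type,) = struct.unpack_from(endian + "I", image, off)
        if p_type == PT_INTERP:
            struct.pack_into(endian + "I", image, off, PT_NULL)
            changed += 1
    return changed


if __name__ == "__main__" and len(sys.argv) == 2:
    # Patch the file named on the command line and write it back.
    with open(sys.argv[1], "rb") as f:
        data = bytearray(f.read())
    count = null_out_interp(data)
    with open(sys.argv[1], "wb") as f:
        f.write(data)
    print(f"patched {count} PT_INTERP header(s)")
```

Because only the type field is overwritten, nothing in the file moves and no offsets need fixing up; the .interp section contents stay in the file but are no longer referenced by a program header. The result can be checked with readelf -l.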
I guess I just don't understand why it's important to have this performed via an ld option. If available, I can see why it would be preferable; but as some (admittedly light) Googling indicates there isn't such a feature, it might be less of a hassle to just do it separately and after-the-fact. (Perhaps adding the flag to ld is easier than scripting the replacement of PT_INTERP with PT_NULL, but convincing the devs to pull it upstream is a different matter.)
Apparently (and please correct me if this is something you've already seen) you can override the behavior of ld with regards to any of the ELF headers in your linker script with the PHDRS command, and using :none to specify that a particular header type should not be included in any segment. I'm not certain of the syntax, but I presume it would look something like this:
PHDRS
{
headers PT_PHDR PHDRS ;
interp PT_INTERP ;
text PT_LOAD FILEHDR PHDRS ;
data PT_LOAD ;
dynamic PT_DYNAMIC ;
}
SECTIONS
{
. = SIZEOF_HEADERS;
.interp : { } :none
...
}
From the ld docs you can override the linker script with --library-path:
--library-path=searchdir
Add path searchdir to the list of paths that ld will search for
archive libraries and ld control scripts. You may use this option any
number of times. The directories are searched in the order in which
they are specified on the command line. Directories specified on the
command line are searched before the default directories. All -L
options apply to all -l options, regardless of the order in which the
options appear. The default set of paths searched (without being
specified with `-L') depends on which emulation mode ld is using, and
in some cases also on how it was configured. See section Environment
Variables. The paths can also be specified in a link script with the
SEARCH_DIR command. Directories specified this way are searched at the
point in which the linker script appears in the command line.
Also, from the section on Implicit Linker Scripts:
If you specify a linker input file which the linker can not recognize
as an object file or an archive file, it will try to read the file as
a linker script. If the file can not be parsed as a linker script, the
linker will report an error.
Which would seem to imply values in user-defined linker scripts, in contrast with implicitly defined linker scripts, will replace values in the default scripts.
I'm not an expert in GNU ld, but I have found the following information in the documentation:
The special secname `/DISCARD/' may be used to discard input sections.
Any sections which are assigned to an output section named `/DISCARD/'
are not included in the final link output.
I hope this will help you.
UPDATE:
(This is the first version of the solution, which doesn't work because the .interp section is dropped along with the PT_INTERP header.)
main.c:
int main(int argc, char **argv)
{
return 0;
}
main.x:
SECTIONS {
/DISCARD/ : { *(.interp) }
}
build command:
$ gcc -nostdlib -pie -static -Wl,-T,main.x main.c
$ readelf -S a.out | grep .interp
build command without option -Wl,-T,main.x:
$ gcc -nostdlib -pie -static main.c
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000000218
$ readelf -S a.out | grep .interp
[ 1] .interp PROGBITS 00000134 000134 000013 00 A 0 0 1
UPDATE 2:
The idea of this solution is that the original .interp section is renamed to .interp1 in the linker script file. In other words, the entire contents of the section are placed in the .interp1 section. Therefore, we can safely remove the (now empty) .interp section without fear of losing the default linker script settings, and hence the PT_INTERP header will be removed too.
SECTIONS {
.interp1 : { *(.interp); } : NONE
/DISCARD/ : { *(.interp) }
}
In order to show that the contents of the .interp section are present in the file (as .interp1) but the PT_INTERP header has been removed, I use a combination of readelf and grep.
$ gcc -nostdlib -pie -Wl,-T,main.x main.c
$ readelf -l a.out | grep interp
00 .note.gnu.build-id .text .interp1 .dynstr .hash .gnu.hash .dynamic .got.plt
$ readelf -S a.out | grep interp
[ 3] .interp1 PROGBITS 0000002e 00102e 000013 00 A 0 0 1
The option -Wl,--no-dynamic-linker solves the issues with binutils 2.26 or later.

Tool for modifying dynamic section of an ELF binary

Is there a tool for modifying the shared library entries in the dynamic section of an ELF binary? I would like to explicitly modify the shared library dependencies in my binary (i.e. replace path to existing library with a custom path)
replace path to existing library with a custom path
If this is your own library, then you are probably linking it like this:
$ cc -o prog1 prog1.o /full/path/to/libABC.so
instead of the proper:
$ cc -o prog1 -L/full/path/to/ -lABC prog1.o
The first approach tells the linker that the application needs precisely that library, only that library, and that no override should be possible. The second approach tells it that the application needs a library installed somewhere on the system, either in the default library path or in one pointed to by $LD_LIBRARY_PATH (looked up at run time). -L is used only at link time.
Otherwise, instead of patching the ELF, first check whether you can substitute the library using a symlink. This is the usual trick: it is hard to modify an executable afterward, but it is very easy to change where the symlink points.
patchelf is what you want
$ patchelf --replace-needed LIB_ORIGIN LIB_NEW ELF_FILE
To see the effect
$ readelf -d ELF_FILE
Installing the tools is easy (readelf is part of the binutils package):
$ sudo apt-get install patchelf binutils
You may want to check the LD_LIBRARY_PATH environment variable.
If you look at the .dynsym section in Linux via readelf, you'll just see something like:
1: 0000000000000000 163 FUNC GLOBAL DEFAULT UND fseek@GLIBC_2.2.5 (2)
which just contains a symbolic name of the library. However, if you include the dynamic loader info, you get:
libc.so.6 => /lib/libc.so.6 (0x00002ba11da4a000)
/lib64/ld-linux-x86-64.so.2 (0x00002ba11d82a000)
So as mentioned, the absolute easiest thing to do (assuming you're doing this for debugging, and not forever) would just be to create a new session, export your custom path in front of the existing LD_LIBRARY_PATH, and go from there.
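If you would rather not touch your interactive session at all, the same prepend-and-run idea can be sketched as a small launcher. The function name and the example paths are hypothetical. One detail worth handling: an empty entry in LD_LIBRARY_PATH is treated as the current directory, so the sketch avoids leaving a trailing colon when the variable was previously unset.

```python
import os
import subprocess
import sys


def env_with_library_path(custom_dir: str, base_env=None) -> dict:
    """Return a copy of the environment with custom_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Only add the separator when there is something to append,
    # so no empty path component is introduced.
    env["LD_LIBRARY_PATH"] = custom_dir + (":" + existing if existing else "")
    return env


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Usage (hypothetical): python3 runwith.py /opt/mylibs ./prog1 [args...]
    result = subprocess.run(sys.argv[2:], env=env_with_library_path(sys.argv[1]))
    sys.exit(result.returncode)
```

Only the child process sees the modified variable, which keeps the change scoped to the one program being debugged.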
