I'm going to ship a static library to a customer.
To maximize the privacy of the library, I have restricted its symbols using the technique provided by @ypsu in "Symbol hiding in static libraries built with Xcode".
However, that method only prevents the user from calling the hidden functions; their names are still visible to "nm" or "strings".
The names of the hidden functions are very sensitive. How do I hide or remove this information from the static library?
I have lately approached the same problem. I decided to rename every global symbol that is not part of the interface to its md5sum, so that the names are not visible to the user, and to do the same for the filenames. The following example tries to demonstrate it:
cat >priv.c <<EOF
#include <stdio.h>
void priv() { printf("Hello, private function\n"); }
EOF
cat >interface.c <<EOF
void priv();
void interface() { priv(); }
EOF
cat >main.c <<EOF
void interface();
int main() {
interface();
}
EOF
cat >compile.sh <<'EOF3'
#!/bin/bash
namespace="namespace_"
# compile to object files
gcc -c -o priv.o priv.c
gcc -c -o interface.o interface.c
# rename the object file so the names of files are not visible
nofilename="$(echo "nofilenames" | md5sum | cut -d' ' -f1).o"
ld -relocatable priv.o interface.o -o "$nofilename"
# create the static library
ar rcs static.a "$nofilename"
# list of interface symbols
public_symbols=( interface )
# list of private symbols - all symbols except interface symbols
private_symbols=($(
nm static.a | sed '/^[0-9a-f]\+ T /!d; s///' |
sort | comm -13 <(printf "%s\n" "${public_symbols[@]}" | sort) -
))
# strip unused symbols, leave only interface symbols
strip_args=($(printf " -K %s " "${public_symbols[@]}"))
strip --strip-unneeded --strip-debug "${strip_args[@]}" static.a
# rename all private symbols with their md5sum
objcopy_args=($(
printf "%s\n" "${private_symbols[@]}" |
while IFS= read -r sym; do
new="${namespace}$(echo "$sym" | md5sum | cut -d' ' -f1)"
# replace the symbol with its md5sum
echo --redefine-sym "$sym=$new"
# make the symbol local
echo -L "$new"
done
))
objcopy "${objcopy_args[@]}" static.a
gcc main.c static.a
# testing
set -x
nm static.a
strings static.a
./a.out
EOF3
The ./compile.sh script would output:
+ nm static.a
b8c84a861a264dfcb24ebf32892484dd.o:
U _GLOBAL_OFFSET_TABLE_
0000000000000013 T interface
0000000000000000 t namespace_6b2f60f631c17ca910498adb47387adf
U puts
+ strings static.a
!<arch>
/ 0 0 0 0 18 `
interface
// 36 `
b8c84a861a264dfcb24ebf32892484dd.o/
/0 0 0 0 644 1744 `
Hello, private function
GCC: (Arch Linux 9.3.0-1) 9.3.0
GCC: (Arch Linux 9.3.0-1) 9.3.0
namespace_6b2f60f631c17ca910498adb47387adf
puts
interface
_GLOBAL_OFFSET_TABLE_
.symtab
.strtab
.shstrtab
.rela.text
.rodata
.rela.eh_frame
.data
.bss
.comment
.note.GNU-stack
+ ./a.out
Hello, private function
The priv symbol was renamed to namespace_6b2f60f631c17ca910498adb47387adf, and the source files were combined into a single b8c84a861a264dfcb24ebf32892484dd.o object file with ld -relocatable. I am open to suggestions on how to improve this script.
The template shown here grew into a bigger script, ar_hide_symbols.
It's possible to pass --export-dynamic to ld and this will export symbols in the program (so that they are available to any shared libraries loaded at run-time):
$ cat > test.c
void foo() {}
int main() { foo(); }
^D
$ gcc test.c
$ nm -D a.out | grep foo
...nothing. And now:
$ gcc -Wl,--export-dynamic test.c
$ nm -D a.out | grep foo
0000000000001129 T foo
...works.
This is documented in https://sourceware.org/binutils/docs-2.34/ld/Options.html#Options
Is it possible to just export symbols from one particular static library?
Given something like:
$ gcc myprogram.cc lib1.a lib2.a lib3.a
Say I just wanted to export symbols in the program from lib2.a, but not lib1.a or lib3.a?
I tried:
$ gcc myprogram.cc lib1.a -Wl,--export-dynamic lib2.a -Wl,--no-export-dynamic lib3.a
but it doesn't work; it looks like --export-dynamic is global.
(The documentation mentions --dynamic-list=listfile but I didn't understand the format of the file, or how to extract the symbol list from the static library?)
how to extract the symbol list from the static library?
nm -g --defined-only staticlib.a | awk 'NF == 3 {print $3}'
didn't understand the format of the file
Neither did I, but I found this post: https://sourceware.org/legacy-ml/binutils/2010-01/msg00416.html . The file should contain:
{
foo;
};
ld --export-dynamic for just one library?
Untested:
gcc myprogram.cc lib1.a lib2.a \
-Wl,--dynamic-list=<(echo '{'; nm -g --defined-only lib2.a | awk 'NF == 3 {print $3";"}'; echo '};')
I want to remove unused functions from my code at build time. I wrote some test code (main.c):
#include <stdio.h>
const char *get1();
int main()
{
puts( get1() );
}
and getall.c:
const char *get1()
{
return "s97symmqdn-1";
}
const char *get2()
{
return "s97symmqdn-2";
}
const char *get3()
{
return "s97symmqdn-3";
}
Makefile
test1 :
rm -f a.out *.o *.a
gcc -ffunction-sections -fdata-sections -c main.c getall.c
ar cr libgetall.a getall.o
gcc -Wl,--gc-sections main.o -L. -lgetall
After running make test1 && objdump --sym a.out | grep get, I find only the following two lines of output:
0000000000000000 l df *ABS* 0000000000000000 getall.c
0000000000400535 g F .text 000000000000000b get1
I guess get2 and get3 were removed. But when I open a.out in vim, I find that s97symmqdn-1, s97symmqdn-2 and s97symmqdn-3 all still exist.
Were the functions get2 and get3 really removed? How can I remove the strings s97symmqdn-2 and s97symmqdn-3? Thank you for your reply.
My system is CentOS 7 and the gcc version is 4.8.5.
The compilation options -ffunction-sections -fdata-sections and linkage option --gc-sections
are working correctly in your example. Your static library is superfluous, so it can
be simplified to:
$ gcc -ffunction-sections -fdata-sections -c main.c getall.c
$ gcc -Wl,--gc-sections main.o getall.o -Wl,-Map=mapfile
in which I'm also asking for the linker's mapfile.
The unused functions get2 and get3 are absent from the executable:
$ nm a.out | grep get
0000000000000657 T get1
and the mapfile shows that the unused function-sections .text.get2 and .text.get3 in which get2 and get3 are
respectively defined were discarded in the linkage:
mapfile (1)
...
Discarded input sections
...
.text.get2 0x0000000000000000 0xd getall.o
.text.get3 0x0000000000000000 0xd getall.o
...
Nevertheless, as you found, all three of the string literals "s97symmqdn-(1|2|3)"
are in the program:
$ strings a.out | egrep 's97symmqdn-(1|2|3)'
s97symmqdn-1
s97symmqdn-2
s97symmqdn-3
That is because -fdata-sections applies just to the same data objects that
__attribute__ ((__section__("name"))) applies to [1], i.e. to the definitions
of variables that have static storage duration. It is not applied to anonymous string literals like your
"s97symmqdn-(1|2|3)". They are all just placed in the .rodata section as usual,
and there we find them:
$ objdump -s -j .rodata a.out
a.out: file format elf64-x86-64
Contents of section .rodata:
06ed 73393773 796d6d71 646e2d31 00733937 s97symmqdn-1.s97
06fd 73796d6d 71646e2d 32007339 3773796d symmqdn-2.s97sym
070d 6d71646e 2d3300 mqdn-3.
--gc-sections does not allow the linker to discard .rodata from the program
because it is not an unused section: it contains "s97symmqdn-1", referenced
in the program by get1, as well as the unreferenced strings "s97symmqdn-2"
and "s97symmqdn-3".
Fix
To get these three string literals separated into distinct data sections, you
need to assign them to distinct named objects, e.g.
getall.c (2)
const char *get1()
{
static const char s[] = "s97symmqdn-1";
return s;
}
const char *get2()
{
static const char s[] = "s97symmqdn-2";
return s;
}
const char *get3()
{
static const char s[] = "s97symmqdn-3";
return s;
}
If we recompile and relink with that change, we see:
mapfile (2)
...
Discarded input sections
...
.text.get2 0x0000000000000000 0xd getall.o
.text.get3 0x0000000000000000 0xd getall.o
.rodata.s.1797
0x0000000000000000 0xd getall.o
.rodata.s.1800
0x0000000000000000 0xd getall.o
...
Now there are two new discarded data-sections, which contain
the two string literals we don't need, as we can see in the object file:
$ objdump -s -j .rodata.s.1797 getall.o
getall.o: file format elf64-x86-64
Contents of section .rodata.s.1797:
0000 73393773 796d6d71 646e2d32 00 s97symmqdn-2.
and:
$ objdump -s -j .rodata.s.1800 getall.o
getall.o: file format elf64-x86-64
Contents of section .rodata.s.1800:
0000 73393773 796d6d71 646e2d33 00 s97symmqdn-3.
Only the referenced string "s97symmqdn-1" now appears anywhere in the program:
$ strings a.out | egrep 's97symmqdn-(1|2|3)'
s97symmqdn-1
and it is the only string in the program's .rodata:
$ objdump -s -j .rodata a.out
a.out: file format elf64-x86-64
Contents of section .rodata:
06f0 73393773 796d6d71 646e2d31 00 s97symmqdn-1.
[1] Likewise, -ffunction-sections has the same effect as qualifying the
definition of every function foo with __attribute__ ((__section__(".text.foo"))).
This may just be an issue with the build system I am migrating to, but I'll include differences in the two systems and how I encountered the problem.
My old build system is a SLES 10 machine. The gcc/cpp/g++ version is 4.1.0
My new system is on SLES 11 SP4, and the gcc/cpp/g++ version is 4.3.4.
I am building a shared library; building and linking work fine on the new system. However, at load time on the new system, I get the following:
error ./mysharedlib.so: undefined symbol: stat
Since the stat() function is included from /usr/include/sys/stat.h, I looked at glibc on both systems. Old:
# rpm -q -f /usr/include/sys/stat.h
glibc-devel-2.4-31.2
and new:
# rpm -q -f /usr/include/sys/stat.h
glibc-devel-2.11.3-17.95.2
I also looked at objdump output related to stat() on the old system:
# objdump -T mysharedlib.so | grep stat
0000000000000000 D *UND* 0000000000000000 __xstat
# objdump -x mysharedlib.so | grep stat
00000000000e3f8a l F .text 0000000000000024 stat
0000000000000000 *UND* 0000000000000000 __xstat
And the new system:
# objdump -T mysharedlib.so | grep stat
0000000000000000 D *UND* 0000000000000000 stat
0000000000000000 D *UND* 0000000000000000 lstat
# objdump -x mysharedlib.so | grep stat
0000000000000000 *UND* 0000000000000000 stat
0000000000000000 *UND* 0000000000000000 lstat
This tells me that on the old system, stat() was defined as a local function in the .text section of my actual shared object. On the new system, stat is undefined in mysharedlib.so.
I did find some information on feature_test_macros and thought that might resolve the issue, so I included features.h before stat.h and updated my makefile to define _XOPEN_SOURCE:
cc -D_XOPEN_SOURCE=500
This didn't resolve the problem.
I also tried adding "-lc" to my ld flags to link in libc. This seemed like it should work, since that's where stat() is defined (I think), but it did not.
At this point, I found this StackOverflow question:
Why does -O to gcc cause "stat" to resolve?
So I tried adding -O to my makefile when invoking g++ on the file that calls stat(). This seems to resolve the problem. I probably don't know enough about resolving symbols; however, this seems a bit hack-ish to me. Am I way off base there? If not, what's the correct way to resolve the load time error on the new system?
The problem you are facing is most likely the result of building your shared library with ld. User-level code on UNIX systems should never use ld directly. You should use the compiler driver (g++ in your case) to perform the link instead.
Example:
// t.c
#include <sys/stat.h>
void fn(const char *p)
{
struct stat st;
stat(p, &st);
}
gcc -fPIC -c t.c
ld -shared -o t.so t.o
nm t.so | grep stat
U stat ## problem: this library is not linked correctly
Compare to correctly linked library:
gcc -shared -o t.so t.o
nm t.so | grep stat
0000000000000700 t stat
0000000000000700 t __stat
U __xstat@@GLIBC_2.2.5
To find where the above local stat symbol came from, you could do this:
gcc -shared -o t.so t.o -Wl,-y,stat
t.o: reference to stat
/usr/lib/x86_64-linux-gnu/libc_nonshared.a(stat.oS): definition of stat
Finally, the reason U stat disappears with optimization:
gcc -E t.c | grep -A2 ' stat '
extern int stat (const char *__restrict __file,
struct stat *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
gcc -E t.c -O | grep -A2 ' stat '
__attribute__ ((__nothrow__ , __leaf__)) stat (const char *__path, struct stat *__statbuf)
{
return __xstat (1, __path, __statbuf);
That's right: you get different preprocessed source depending on the optimization level.
I have a sample program like this:
#include <stdio.h>
#if 1
#define FOR_EXPORT __attribute__ ((visibility("hidden")))
#else
#define FOR_EXPORT
#endif
FOR_EXPORT void mylocalfunction1(void)
{
printf("function1\n");
}
void mylocalfunction2(void)
{
printf("function2\n");
}
void mylocalfunction3(void)
{
printf("function3\n");
}
void printMessage(void)
{
printf("Running the function exported from the shared library\n");
}
And compile it using
gcc -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
Now after compilation I do:
$ nm libdefaultvisibility.so
0000000000000eb0 t _mylocalfunction1
0000000000000ed0 t _mylocalfunction2
0000000000000ef0 t _mylocalfunction3
0000000000000f10 t _printMessage
U _printf
U dyld_stub_binder
Which means, as far as I can tell, that despite -fvisibility=hidden all symbols get exported. The book I was following claimed that only the function marked with FOR_EXPORT should be exported.
I looked up several other resources, but for the simple test I'm doing -fvisibility=hidden should be sufficient.
My clang version:
$ clang -v
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
You're misunderstanding the output of nm. Scroll through man nm and you'll
read that the t flag means the symbol is a local (static) symbol in
the text section. The linker can't see it. If it were global (external)
the flag would be T. So all four of your functions are local.
Contrast:
$ clang -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' t '
0000000000000570 t deregister_tm_clones
0000000000000600 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
0000000000000640 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000670 t mylocalfunction1
0000000000000690 t mylocalfunction2
00000000000006b0 t mylocalfunction3
00000000000006d0 t printMessage
00000000000005b0 t register_tm_clones
with what you get after dropping -fvisibility=hidden:
$ clang -shared -fPIC -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' t '
0000000000000600 t deregister_tm_clones
0000000000000690 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
00000000000006d0 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000700 t mylocalfunction1
0000000000000640 t register_tm_clones
$ nm libdefaultvisibility.so | grep ' T '
0000000000000780 T _fini
00000000000005b0 T _init
0000000000000720 T mylocalfunction2
0000000000000740 T mylocalfunction3
0000000000000760 T printMessage
Then only the explicitly hidden mylocalfunction1 remains local, and the
other three are now global.
You should not expect that a symbol marked with __attribute__ ((visibility("hidden")))
will be exported by a shared library in any circumstances. The attribute means precisely
that it will not be, whether it is applied explicitly to a symbol, as in this case,
or acquired by default in the presence of the compiler option -fvisibility=hidden.
If you want to export just that one function in the example by means of a visibility attribution
you would have:
#define FOR_EXPORT __attribute__ ((visibility("default")))
Then:
$ clang -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' T '
0000000000000720 T _fini
0000000000000550 T _init
00000000000006a0 T mylocalfunction1
It is global, because the explicit attribution overrides the command-line option,
and all your other functions are local. Perhaps confusingly, default visibility
is always public.
And you could accomplish this without resorting to visibility attributions, which are
not portable, simply by declaring all the functions that you don't want to export as static. Then the compiler
would not expose them to the linker in the first place:
foo.c
#include <stdio.h>
void mylocalfunction1(void)
{
printf("function1\n");
}
static void mylocalfunction2(void)
{
printf("function2\n");
}
static void mylocalfunction3(void)
{
printf("function3\n");
}
static void printMessage(void)
{
printf("Running the function exported from the shared library\n");
}
With which you again get:
$ clang -shared -fPIC -o libfoo.so foo.c
$ nm libfoo.so | grep ' T '
00000000000006c0 T _fini
0000000000000550 T _init
00000000000006a0 T mylocalfunction1
Although the distinction does not make itself felt in your example, you
should understand that while a local/static symbol is not seen by the linker and (therefore) is unavailable for dynamic linkage, a global/external symbol
may or may not be available for dynamic linkage. Visibility
controls the availability of global symbols for dynamic linkage only.
According to the GCC Wiki on Visibility, you should:
Use nm -C -D on the outputted DSO [Dynamic Shared Object] to compare before and after to see
the difference it makes.
As stated in the nm manual:
-D will display the dynamic symbols rather than the normal symbols
If I compile your code exactly as you did, I get the following dynamic symbols:
$ nm -C -D libdefaultvisibility.so
0000000000200a68 B __bss_start
w __cxa_finalize
0000000000200a68 D _edata
0000000000200a70 B _end
00000000000006c8 T _fini
w __gmon_start__
0000000000000518 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
U puts
And if I compile it without the -fvisibility=hidden option, I get these dynamic symbols:
$ nm -C -D libdefaultvisibility.so
0000000000200ae8 B __bss_start
w __cxa_finalize
0000000000200ae8 D _edata
0000000000200af0 B _end
0000000000000748 T _fini
w __gmon_start__
00000000000005a0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
0000000000000712 T mylocalfunction2
0000000000000724 T mylocalfunction3
0000000000000736 T printMessage
U puts
I have a large code base which is mainly built as a single binary. I have changed the Makefile to create a static library, and I now build the binary by linking against that library.
When I link against the static library, the program doesn't run, because a weak symbol is left undefined:
gcc test.c -L . -lasntc -ltc -lresolv -lnetlink -lutil -ltc -lm -o mytc
nm mytc | grep htb_qdisc_util
w htb_qdisc_util
I extracted the archives into their .o files and built a binary directly from those object files, and this works, as shown:
gcc tc.o tc_qdisc.o tc_class.o tc_filter.o tc_util.o tc_monitor.o \
m_police.o m_estimator.o m_action.o m_ematch.o emp_ematch.yacc.o \
emp_ematch.lex.o asn_tc.o asn_global.o q_fifo.o q_sfq.o q_red.o q_prio.o q_tbf.o \
q_cbq.o q_rr.o q_multiq.o q_netem.o f_rsvp.o f_u32.o f_route.o f_fw.o f_basic.o \
f_flow.o f_cgroup.o q_dsmark.o q_gred.o f_tcindex.o q_ingress.o q_hfsc.o q_htb.o \
q_drr.o q_qfq.o m_gact.o m_mirred.o m_nat.o m_pedit.o m_skbedit.o p_ip.o \
p_icmp.o p_tcp.o p_udp.o em_nbyte.o em_cmp.o em_u32.o em_meta.o q_mqprio.o static-syms.o tc_core.o tc_red.o tc_cbq.o tc_estimator.o tc_stab.o -lresolv -L. -lnetlink -lutil -L. -lm -o tc
nm tc | grep htb_qdisc_util
0000000000641bc0 D htb_qdisc_util
Looking at the symbol tables of the object files, I see the following:
nm *.o | grep htb_qdisc_util
0000000000000000 D htb_qdisc_util
w htb_qdisc_util
The weak symbols result from:
extern char hfsc_qdisc_util[] __attribute__((weak)); if (!strcmp(sym, "hfsc_qdisc_util")) return hfsc_qdisc_util;
extern char htb_qdisc_util[] __attribute__((weak)); if (!strcmp(sym, "htb_qdisc_util")) return htb_qdisc_util;
How do I create a static library that works here, and what is happening when I create the binary?