What exactly does -rdynamic (or --export-dynamic at the linker level) do and how does it relate to symbol visibility as defined by the -fvisibility* flags or visibility pragmas and __attribute__s?
For --export-dynamic, ld(1) mentions:
...
If you use "dlopen" to load a dynamic object which needs to refer back
to the symbols defined by the program, rather than some other dynamic
object, then you will probably need
to use this option when linking the program itself. ...
I'm not sure I completely understand this. Could you please provide an example that doesn't work without -rdynamic but does with it?
Edit:
I actually tried compiling a couple of dummy libraries (single file, multi-file, various -O levels, some inter-function calls, some hidden symbols, some visible), with and without -rdynamic, and so far I've been getting byte-identical outputs (when keeping all other flags constant of course), which is quite puzzling.
Here is a simple example project to illustrate the use of -rdynamic.
bar.c
extern void foo(void);
void bar(void)
{
foo();
}
main.c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
void foo(void)
{
puts("Hello world");
}
int main(void)
{
void * dlh = dlopen("./libbar.so", RTLD_NOW);
if (!dlh) {
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
void (*bar)(void) = dlsym(dlh,"bar");
if (!bar) {
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
bar();
return 0;
}
Makefile
.PHONY: all clean test

LDEXTRAFLAGS ?=

all: prog

bar.o: bar.c
	gcc -c -Wall -fpic -o $@ $<

libbar.so: bar.o
	gcc -shared -o $@ $<

main.o: main.c
	gcc -c -Wall -o $@ $<

prog: main.o | libbar.so
	gcc $(LDEXTRAFLAGS) -o $@ $< -L. -lbar -ldl

clean:
	rm -f *.o *.so prog

test: prog
	./$<
Here, bar.c becomes a shared library libbar.so and main.c becomes
a program that dlopens libbar and calls bar() from that library.
bar() calls foo(), which is external in bar.c and defined in main.c.
So, without -rdynamic:
$ make test
gcc -c -Wall -o main.o main.c
gcc -c -Wall -fpic -o bar.o bar.c
gcc -shared -o libbar.so bar.o
gcc -o prog main.o -L. -lbar -ldl
./prog
./libbar.so: undefined symbol: foo
Makefile:23: recipe for target 'test' failed
And with -rdynamic:
$ make clean
rm -f *.o *.so prog
$ make test LDEXTRAFLAGS=-rdynamic
gcc -c -Wall -o main.o main.c
gcc -c -Wall -fpic -o bar.o bar.c
gcc -shared -o libbar.so bar.o
gcc -rdynamic -o prog main.o -L. -lbar -ldl
./prog
Hello world
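The same difference can be probed from inside a program by asking the dynamic linker for a symbol through the handle that dlopen(NULL, ...) returns for the main program. The following is only a minimal sketch and not part of the example project; the helper name visible_to_dynamic_linker is made up for illustration.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not part of the example project): returns 1 if
   `name` can be resolved through the main program's handle, else 0. */
int visible_to_dynamic_linker(const char *name)
{
    /* dlopen(NULL, ...) returns a handle for the main program itself. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (!self)
        return 0;
    /* The lookup covers the program and the libraries loaded with it,
       but only symbols present in a dynamic symbol table can be found. */
    void *addr = dlsym(self, name);
    dlclose(self);
    return addr != NULL;
}
```

Called as visible_to_dynamic_linker("foo") from main() in prog, this would be expected to return 0 for the plain build and 1 for the -rdynamic build. On older glibc versions the program has to be linked with -ldl, as in the Makefile.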
-rdynamic exports the symbols of an executable. This mainly addresses scenarios like the one described in Mike Kinghan's answer, but it also helps, for example, Glibc's backtrace_symbols() to symbolize a backtrace.
Here is a small experiment (test program copied from here)
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
/* Obtain a backtrace and print it to stdout. */
void
print_trace (void)
{
void *array[10];
size_t size;
char **strings;
size_t i;
size = backtrace (array, 10);
strings = backtrace_symbols (array, size);
printf ("Obtained %zu stack frames.\n", size);
for (i = 0; i < size; i++)
printf ("%s\n", strings[i]);
free (strings);
}
/* A dummy function to make the backtrace more interesting. */
void
dummy_function (void)
{
print_trace ();
}
int
main (void)
{
dummy_function ();
return 0;
}
Compile the program with gcc main.c and run it. The output:
Obtained 5 stack frames.
./a.out() [0x4006ca]
./a.out() [0x400761]
./a.out() [0x40076d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f026597f830]
./a.out() [0x4005f9]
Now, compile with -rdynamic, i.e. gcc -rdynamic main.c, and run again:
Obtained 5 stack frames.
./a.out(print_trace+0x28) [0x40094a]
./a.out(dummy_function+0x9) [0x4009e1]
./a.out(main+0x9) [0x4009ed]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f85b23f2830]
./a.out(_start+0x29) [0x400879]
As you can see, we get a proper stack trace now!
Now, if we investigate ELF's symbol table entry (readelf --dyn-syms a.out):
without -rdynamic
Symbol table '.dynsym' contains 9 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND free@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (2)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND backtrace_symbols@GLIBC_2.2.5 (2)
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND backtrace@GLIBC_2.2.5 (2)
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __stack_chk_fail@GLIBC_2.4 (3)
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
8: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
with -rdynamic, we have more symbols, including the executable's:
Symbol table '.dynsym' contains 25 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND free@GLIBC_2.2.5 (2)
2: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (2)
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND backtrace_symbols@GLIBC_2.2.5 (2)
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND backtrace@GLIBC_2.2.5 (2)
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __stack_chk_fail@GLIBC_2.4 (3)
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)
8: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
9: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
10: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
11: 0000000000601060 0 NOTYPE GLOBAL DEFAULT 24 _edata
12: 0000000000601050 0 NOTYPE GLOBAL DEFAULT 24 __data_start
13: 0000000000601068 0 NOTYPE GLOBAL DEFAULT 25 _end
14: 00000000004009d8 12 FUNC GLOBAL DEFAULT 14 dummy_function
15: 0000000000601050 0 NOTYPE WEAK DEFAULT 24 data_start
16: 0000000000400a80 4 OBJECT GLOBAL DEFAULT 16 _IO_stdin_used
17: 0000000000400a00 101 FUNC GLOBAL DEFAULT 14 __libc_csu_init
18: 0000000000400850 42 FUNC GLOBAL DEFAULT 14 _start
19: 0000000000601060 0 NOTYPE GLOBAL DEFAULT 25 __bss_start
20: 00000000004009e4 16 FUNC GLOBAL DEFAULT 14 main
21: 00000000004007a0 0 FUNC GLOBAL DEFAULT 11 _init
22: 0000000000400a70 2 FUNC GLOBAL DEFAULT 14 __libc_csu_fini
23: 0000000000400a74 0 FUNC GLOBAL DEFAULT 15 _fini
24: 0000000000400922 182 FUNC GLOBAL DEFAULT 14 print_trace
I hope that helps!
I use -rdynamic to print out backtraces using Glibc's backtrace()/backtrace_symbols().
Without -rdynamic, you cannot get function names.
To learn more about backtrace(), read about it over here.
From The Linux Programming Interface:
42.1.6
Accessing Symbols in the Main Program
Suppose that we use dlopen() to dynamically load a shared library,
use dlsym() to obtain the address of a function x() from that
library, and then call x(). If x() in turn calls a function y(),
then y() would normally be sought in one of the shared libraries
loaded by the program.
Sometimes, it is desirable instead to have x() invoke an
implementation of y() in the main program. (This is similar to a
callback mechanism.) In order to do this, we must make the
(global-scope) symbols in the main program available to the dynamic
linker, by linking the program using the --export-dynamic linker
option:
$ gcc -Wl,--export-dynamic main.c (plus further options and
arguments)
Equivalently, we can write the following:
$ gcc -export-dynamic main.c
Using either of these options allows a dynamically loaded library to
access global symbols in the main program.
The gcc -rdynamic option and the gcc -Wl,-E option are further
synonyms for -Wl,--export-dynamic.
I guess this only works for dynamically loaded shared libraries, i.e. those opened with dlopen(). Correct me if I am wrong.
Related
I want to remove unused functions from my code when compiling, so I wrote some test code (main.c):
#include <stdio.h>
const char *get1();
int main()
{
puts( get1() );
}
and getall.c:
const char *get1()
{
return "s97symmqdn-1";
}
const char *get2()
{
return "s97symmqdn-2";
}
const char *get3()
{
return "s97symmqdn-3";
}
Makefile
test1 :
	rm -f a.out *.o *.a
	gcc -ffunction-sections -fdata-sections -c main.c getall.c
	ar cr libgetall.a getall.o
	gcc -Wl,--gc-sections main.o -L. -lgetall
After running make test1 && objdump --sym a.out | grep get, I find only these two lines of output:
0000000000000000 l df *ABS* 0000000000000000 getall.c
0000000000400535 g F .text 000000000000000b get1
I guess that get2 and get3 were removed. But when I open a.out in vim, I find that s97symmqdn-1, s97symmqdn-2 and s97symmqdn-3 all exist.
Are the functions get2 and get3 really removed? How can I remove the strings s97symmqdn-2 and s97symmqdn-3? Thank you for your reply.
My system is centos7 and gcc version is 4.8.5
The compilation options -ffunction-sections -fdata-sections and linkage option --gc-sections
are working correctly in your example. Your static library is superfluous, so it can
be simplified to:
$ gcc -ffunction-sections -fdata-sections -c main.c getall.c
$ gcc -Wl,--gc-sections main.o getall.o -Wl,-Map=mapfile
in which I'm also asking for the linker's mapfile.
The unused functions get2 and get3 are absent from the executable:
$ nm a.out | grep get
0000000000000657 T get1
and the mapfile shows that the unused function-sections .text.get2 and .text.get3 in which get2 and get3 are
respectively defined were discarded in the linkage:
mapfile (1)
...
Discarded input sections
...
.text.get2 0x0000000000000000 0xd getall.o
.text.get3 0x0000000000000000 0xd getall.o
...
Nevertheless, as you found, all three of the string literals "s97symmqdn-(1|2|3)"
are in the program:
$ strings a.out | egrep 's97symmqdn-(1|2|3)'
s97symmqdn-1
s97symmqdn-2
s97symmqdn-3
That is because -fdata-sections applies just to the same data objects that
__attribute__ ((__section__("name"))) applies to [1], i.e. to the definitions
of variables that have static storage duration. It is not applied to anonymous string literals like your
"s97symmqdn-(1|2|3)". They are all just placed in the .rodata section as usual,
and there we find them:
$ objdump -s -j .rodata a.out
a.out: file format elf64-x86-64
Contents of section .rodata:
06ed 73393773 796d6d71 646e2d31 00733937 s97symmqdn-1.s97
06fd 73796d6d 71646e2d 32007339 3773796d symmqdn-2.s97sym
070d 6d71646e 2d3300 mqdn-3.
--gc-sections does not allow the linker to discard .rodata from the program
because it is not an unused section: it contains "s97symmqdn-1", referenced
in the program by get1, as well as the unreferenced strings "s97symmqdn-2"
and "s97symmqdn-3".
Fix
To get these three string literals separated into distinct data sections, you
need to assign them to distinct named objects, e.g.
getall.c (2)
const char *get1()
{
static const char s[] = "s97symmqdn-1";
return s;
}
const char *get2()
{
static const char s[] = "s97symmqdn-2";
return s;
}
const char *get3()
{
static const char s[] = "s97symmqdn-3";
return s;
}
If we recompile and relink with that change, we see:
mapfile (2)
...
Discarded input sections
...
.text.get2 0x0000000000000000 0xd getall.o
.text.get3 0x0000000000000000 0xd getall.o
.rodata.s.1797
0x0000000000000000 0xd getall.o
.rodata.s.1800
0x0000000000000000 0xd getall.o
...
Now there are two new discarded data-sections, which contain
the two string literals we don't need, as we can see in the object file:
$ objdump -s -j .rodata.s.1797 getall.o
getall.o: file format elf64-x86-64
Contents of section .rodata.s.1797:
0000 73393773 796d6d71 646e2d32 00 s97symmqdn-2.
and:
$ objdump -s -j .rodata.s.1800 getall.o
getall.o: file format elf64-x86-64
Contents of section .rodata.s.1800:
0000 73393773 796d6d71 646e2d33 00 s97symmqdn-3.
Only the referenced string "s97symmqdn-1" now appears anywhere in the program:
$ strings a.out | egrep 's97symmqdn-(1|2|3)'
s97symmqdn-1
and it is the only string in the program's .rodata:
$ objdump -s -j .rodata a.out
a.out: file format elf64-x86-64
Contents of section .rodata:
06f0 73393773 796d6d71 646e2d31 00 s97symmqdn-1.
[1] Likewise, -ffunction-sections has the same effect as qualifying the
definition of every function foo with __attribute__ ((__section__(".text.foo"))).
Consider 3 C source files:
/* widgets.c */
void widgetTwiddle ( struct widget * w ) {
utilityTwiddle(&w->bits, 1);
}
and
/* wombats.c */
void wombatTwiddle ( struct wombat * w ) {
utilityTwiddle(&w->bits, 1);
}
and
/* utility.c */
void utilityTwiddle ( int * bitsPtr, int bits ) {
*bitsPtr ^= bits;
}
which get compiled and put in a library (say, either libww.a or libww.so).
Is there a way to make utilityTwiddle() visible and usable by the other two library members, but not visible to those who link to the library? That is, given this:
/* appl.c */
extern void utilityTwiddle ( int * bitsPtr, int bits );
int main ( void ) {
int bits;
utilityTwiddle(&bits, 1);
return 0;
}
and
cc -o appl appl.c -lww
it would fail to link because utilityTwiddle() is not visible to appl.c. And, consequently appl.c would be free to define its own utilityTwiddle function or variable.
[EDIT] And hopefully obviously, we would like this to work:
/* workingappl.c */
extern void wombatTwiddle ( struct wombat * wPtr );
int main ( void ) {
struct wombat w = { .bits = 0 };
wombatTwiddle(&w);
return 0;
}
The question "Limiting visibility of symbols when linking shared libraries" seems related, but it doesn't seem to address whether the suppressed symbols remain available to other library members.
[EDIT2] I have sort-of figured out a way to do it without modifying the C source. Add a map file:
/* utility.map */
{ local: *; };
and then do:
$ gcc -shared -o utility.so utility.c -fPIC -Wl,--version-script=utility.map
gives us a dynamic symbol table w/o utilityTwiddle:
$ nm -D utility.so
w _Jv_RegisterClasses
w __cxa_finalize
w __gmon_start__
but it's not clear to me how to effectively go from this to building a shared library with all three source files. If I put all three source files on the command line, the symbols from all three are hidden. If there is a way to incrementally build the shared library, I could have two simple map files (one to export nothing, one to export everything). Is this doable or is the only option something like this:
/* libww.map */
{ global: list; of; all; symbols; to; export; local: *; };
and
$ gcc -shared -o libww.so *.c -fPIC -Wl,--version-script=libww.map
[EDIT3]
Boy, it sure seems like this also ought to be possible without using shared libraries. If I do:
ld -r -o wboth.o widgets.o wombats.o utility.o
I can see that the linker has resolved the location of utilityTwiddle() where widgetTwiddle() and wombatTwiddle() call it:
$ objdump -d wboth.o
0000000000000000 <widgetTwiddle>:
0: be 01 00 00 00 mov $0x1,%esi
5: e9 00 00 00 00 jmpq a <widgetTwiddle+0xa>
0000000000000010 <wombatTwiddle>:
10: be 01 00 00 00 mov $0x1,%esi
15: e9 00 00 00 00 jmpq 1a <wombatTwiddle+0xa>
0000000000000020 <utilityTwiddle>:
20: 31 37 xor %esi,(%rdi)
22: c3 retq
but utilityTwiddle remains as a symbol:
$ nm wboth.o
U _GLOBAL_OFFSET_TABLE_
0000000000000020 T utilityTwiddle
0000000000000000 T widgetTwiddle
0000000000000010 T wombatTwiddle
So if you could find a way to remove that symbol, you could still successfully link against wboth.o. I tested this by binary-editing wboth.o to rename the symbol, and it still links and runs fine:
$ nm wboth.o
U _GLOBAL_OFFSET_TABLE_
0000000000000000 T widgetTwiddle
0000000000000010 T wombatTwiddle
0000000000000020 T xtilityTwiddle
You can't achieve what you want by creating a static library libww.a. If you
read static-libraries you
will see why. A static library can be used to offer a bunch N of object files
to the linker, from which it will extract k (possibly = 0) that it needs and link them. So you
can't achieve anything by linking with the static library that you can't achieve by
linking those k object files directly. For linkage purposes, static libraries don't really
exist.
But shared libraries really do exist for linkage purposes, and the global symbols exposed by a shared library
acquire an additional property, dynamic visibility, that exists precisely for your
purpose. The dynamically visible symbols are a subset of the global symbols: they are
the global symbols that are visible for dynamic linkage, i.e. for linking the shared library
with something else (a program or another shared library).
Dynamic visibility is not an attribute that source language standards say anything
about, because they don't say anything about dynamic linkage. So controlling the
dynamic visibility of symbols has to be done in an individual way by a toolchain that
does support dynamic linkage. GCC does it with the compiler-specific declaration
qualifier [1]:
__attribute__((visibility("default|hidden|protected|internal")))
and/or the compiler switch [2]:
-fvisibility=default|hidden|protected|internal
Here's a demo of how to build libww.so so that utilityTwiddle is hidden from
clients of the library while wombatTwiddle and widgetTwiddle are visible.
Your source code needs to be fleshed out a bit in one way or another to compile.
Here's a first cut:
ww.h (1)
#ifndef WW_H
#define WW_H
struct widget {
int bits;
};
struct wombat {
int bits;
};
extern void widgetTwiddle ( struct widget * w );
extern void wombatTwiddle ( struct wombat * w );
#endif
utility.h (1)
#ifndef UTILITY_H
#define UTILITY_H
extern void utilityTwiddle ( int * bitsPtr, int bits );
#endif
utility.c
#include "utility.h"
void utilityTwiddle ( int * bitsPtr, int bits ) {
*bitsPtr ^= bits;
}
wombats.c
#include "utility.h"
#include "ww.h"
void wombatTwiddle ( struct wombat * w ) {
utilityTwiddle(&w->bits, 1);
}
widgets.c
#include "utility.h"
#include "ww.h"
void widgetTwiddle ( struct widget * w ) {
utilityTwiddle(&w->bits, 1);
}
Compile all the *.c files to *.o files in the default manner:
$ gcc -Wall -Wextra -c widgets.c wombats.c utility.c
and link them into libww.so in the default manner:
$ gcc -shared -o libww.so widgets.o wombats.o utility.o
Here are the *Twiddle symbols in the global symbol table of libww.so:
$ nm libww.so | egrep '*Twiddle'
000000000000063a T utilityTwiddle
00000000000005fa T widgetTwiddle
000000000000061a T wombatTwiddle
This is just the sum of the global (extern) *Twiddle symbols that went into the linkage
of libww.so from the object files. They're all defined (T), as they'd have to be
if the library itself was to be linked without external *Twiddle dependencies.
Any ELF file (object file, shared library, program) has a global symbol table, but
a shared library also has a dynamic symbol table. Here are the *Twiddle symbols in the dynamic symbol table of libww.so:
$ nm -D libww.so | egrep '*Twiddle'
000000000000063a T utilityTwiddle
00000000000005fa T widgetTwiddle
000000000000061a T wombatTwiddle
They're exactly the same. That's what we want to change, so that utilityTwiddle
disappears.
Here's a second cut. We have to change the source code slightly.
utility.h (2)
#ifndef UTILITY_H
#define UTILITY_H
extern void utilityTwiddle ( int * bitsPtr, int bits ) __attribute__((visibility("hidden")));
#endif
Then recompile and relink, just as before:
$ gcc -Wall -Wextra -c widgets.c wombats.c utility.c
$ gcc -shared -o libww.so widgets.o wombats.o utility.o
Here are the *Twiddle symbols now in the global symbol table:
$ nm libww.so | egrep '*Twiddle'
00000000000005ea t utilityTwiddle
00000000000005aa T widgetTwiddle
00000000000005ca T wombatTwiddle
utilityTwiddle is still there, but it is now a local symbol (t) rather than a global one (T). And here are the *Twiddle symbols now in the dynamic symbol table:
$ nm -D libww.so | egrep '*Twiddle'
00000000000005aa T widgetTwiddle
00000000000005ca T wombatTwiddle
utilityTwiddle is gone.
Here's a third cut that achieves the same result differently. It's more long-winded
but illustrates how the -fvisibility compiler option plays. This time,
utility.h is again as per (1), but ww.h is:
ww.h (2)
#ifndef WW_H
#define WW_H
struct widget {
int bits;
};
struct wombat {
int bits;
};
extern void widgetTwiddle ( struct widget * w ) __attribute__((visibility("default")));
extern void wombatTwiddle ( struct wombat * w ) __attribute__((visibility("default")));
#endif
Now we recompile like so:
$ gcc -Wall -Wextra -fvisibility=hidden -c widgets.c wombats.c utility.c
We're telling the compiler to annotate every global symbol it generates with
__attribute__((visibility("hidden"))) unless there is a countervailing
__attribute__((visibility("..."))) explicitly in the source code.
Then relink the shared library just as previously. Again we see in the global symbol table:
$ nm libww.so | egrep '*Twiddle'
00000000000005ea t utilityTwiddle
00000000000005aa T widgetTwiddle
00000000000005ca T wombatTwiddle
and in the dynamic symbol table:
$ nm -D libww.so | egrep '*Twiddle'
00000000000005aa T widgetTwiddle
00000000000005ca T wombatTwiddle
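In practice the visibility attributes are often wrapped in macros, so that declarations stay readable and the code still compiles with a compiler that lacks the attribute. Here is a minimal sketch of that idea, collapsed into one file for brevity; the macro names WW_PUBLIC and WW_PRIVATE are hypothetical and not part of the demo above.

```c
/* Hypothetical wrapper macros around the GCC/Clang visibility attribute. */
#if defined(__GNUC__) && __GNUC__ >= 4
#  define WW_PUBLIC  __attribute__((visibility("default")))
#  define WW_PRIVATE __attribute__((visibility("hidden")))
#else
#  define WW_PUBLIC
#  define WW_PRIVATE
#endif

struct wombat {
    int bits;
};

/* Callable from every object file linked into the library, but kept out
   of the shared library's dynamic symbol table. */
WW_PRIVATE void utilityTwiddle(int *bitsPtr, int bits)
{
    *bitsPtr ^= bits;
}

/* Part of the library's public interface, even under -fvisibility=hidden. */
WW_PUBLIC void wombatTwiddle(struct wombat *w)
{
    utilityTwiddle(&w->bits, 1);
}
```

With macros like these, the choice between annotating the few hidden symbols (second cut) and annotating the few exported ones under -fvisibility=hidden (third cut) only changes which macro appears in the headers.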
Finally, let's show that removing utilityTwiddle from the dynamic symbol table
of libww.so in one of these ways really does hide it from clients linking with
libww.so. Here's a program that wants to call all the *Twiddles:
prog.c
#include <ww.h>
extern void utilityTwiddle ( int * bitsPtr, int bits );
int main()
{
struct widget wi = {1};
struct wombat wo = {2};
widgetTwiddle(&wi);
wombatTwiddle(&wo);
utilityTwiddle(&wi.bits,wi.bits);
return 0;
}
We have no problem building it like:
$ gcc -Wall -Wextra -I. -c prog.c
$ gcc -o prog prog.o utility.o widgets.o wombats.o
But nobody can build it like:
$ gcc -Wall -Wextra -I. -c prog.c
$ gcc -o prog prog.o -L. -lww
prog.o: In function `main':
prog.c:(.text+0x4a): undefined reference to `utilityTwiddle'
collect2: error: ld returned 1 exit status
Be clear that -fvisibility is a compilation option, not a linkage option.
You pass it to your compilation commands and not to your linkage commands,
because its effect is the same as sprinkling __attribute__((visibility("...")))
qualifiers over the declarations in your source code, which the compiler has
to honour by injecting linkage information into the object files that it generates. If
you care to see the evidence of that you can just repeat that last compilation
and request that the assembly files be saved:
$ gcc -Wall -Wextra -fvisibility=hidden -c widgets.c wombats.c utility.c -save-temps
Then compare say:
widgets.s
.file "widgets.c"
.text
.globl widgetTwiddle
.type widgetTwiddle, @function
widgetTwiddle:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movl $1, %esi
movq %rax, %rdi
call utilityTwiddle@PLT
nop
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size widgetTwiddle, .-widgetTwiddle
.ident "GCC: (Ubuntu 7.3.0-16ubuntu3) 7.3.0"
.section .note.GNU-stack,"",@progbits
with:
utility.s
.file "utility.c"
.text
.globl utilityTwiddle
.hidden utilityTwiddle
^^^^^^^^^^^^^^^^^^^^^^
.type utilityTwiddle, @function
utilityTwiddle:
...
...
[1] See the GCC manual:
6.31.1 Common Function Attributes
6.32.1 Common Variable Attributes
[2] See the GCC Manual, 3.16 Options for Code Generation Conventions.
I am facing a linking problem. I'll illustrate it:
a.c:
extern void b(void);
int main() {
a();
return 0;
}
void a() {
b();
}
b.S:
.extern a
b:
jmp a
Whether I link with
gcc a.o b.o -o c
or
gcc b.o a.o -o c
I get unresolved symbols. How do I link these files? I can't merge them. The example may be nonsensical, but it illustrates what I am trying to achieve.
Initial investigation:
a.c
extern void b(void);
void a(void);
int main() {
a();
return 0;
}
void a() {
b();
}
b.S
.extern a
b:
jmp a
b.c
void a(void);
void b(void)
{
a();
}
Output
$ gcc -c a.c
$ gcc -c b.c -o b_gcc.o
$ as b.S -o b_as.o
$ gcc a.o b_gcc.o -o test_gcc
$ gcc a.o b_as.o -o test_as
a.o: In function `a':
a.c:(.text+0x15): undefined reference to `b'
collect2: error: ld returned 1 exit status
So what gives? Why is it okay with GCC but not GAS?
$ objdump -t b_gcc.o > syms_gcc
$ objdump -t b_as.o > syms_as
$ diff syms_gcc syms_as
2c2
< b_gcc.o: file format elf64-x86-64
---
> b_as.o: file format elf64-x86-64
5d4
< 0000000000000000 l df *ABS* 0000000000000000 b.c
9,12c8
< 0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
< 0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
< 0000000000000000 l d .comment 0000000000000000 .comment
< 0000000000000000 g F .text 000000000000000b b
---
> 0000000000000000 l .text 0000000000000000 b
Okay, so gcc makes b a global symbol. Let's try .global b in b.S:
$ as b.S -o b_as2.o
$ gcc a.o b_as2.o
$
Success. So gcc/ld will do multi-pass symbol resolution for anything that is not in a static library. But it only looks for global symbols. Here's the final b.S:
.extern a
.global b
b:
jmp a
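The same local/global distinction exists on the C side: a function defined with static gets a local symbol, much like the label b: without .global b, while a non-static function gets a global symbol that other object files can link against. A minimal sketch, with made-up function names:

```c
/* Local symbol: if the compiler emits it at all (it may be inlined away),
   nm lists it with a lowercase 't', and other object files cannot
   resolve references to it. */
static int local_helper(int x)
{
    return x + 1;
}

/* Global symbol: nm lists it with an uppercase 'T', so the linker can
   resolve references to it from other object files. */
int global_entry(int x)
{
    return local_helper(x) * 2;
}
```

Compiling this with gcc -c and running nm on the object file should show the two kinds of symbol side by side.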
I have a sample program like this:
#include <stdio.h>
#if 1
#define FOR_EXPORT __attribute__ ((visibility("hidden")))
#else
#define FOR_EXPORT
#endif
FOR_EXPORT void mylocalfunction1(void)
{
printf("function1\n");
}
void mylocalfunction2(void)
{
printf("function2\n");
}
void mylocalfunction3(void)
{
printf("function3\n");
}
void printMessage(void)
{
printf("Running the function exported from the shared library\n");
}
And compile it using
gcc -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
Now after compilation I do:
$ nm libdefaultvisibility.so
0000000000000eb0 t _mylocalfunction1
0000000000000ed0 t _mylocalfunction2
0000000000000ef0 t _mylocalfunction3
0000000000000f10 t _printMessage
U _printf
U dyld_stub_binder
Which means as far as I can tell that despite -fvisibility=hidden all symbols get exported. The book I was following claimed that only the function marked with FOR_EXPORT should be exported.
I looked up several other resources, but for the simple test I'm doing -fvisibility=hidden should be sufficient.
My clang version:
$ clang -v
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
You're misunderstanding the output of nm. Scroll through man nm and you'll
read that the t flag means the symbol is a local (static) symbol in
the text section. The linker can't see it. If it were global (external)
the flag would be T. So all four of your functions are local.
Contrast:
$ clang -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' t '
0000000000000570 t deregister_tm_clones
0000000000000600 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
0000000000000640 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000670 t mylocalfunction1
0000000000000690 t mylocalfunction2
00000000000006b0 t mylocalfunction3
00000000000006d0 t printMessage
00000000000005b0 t register_tm_clones
with dropping the -fvisibility=hidden:
$ clang -shared -fPIC -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' t '
0000000000000600 t deregister_tm_clones
0000000000000690 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
00000000000006d0 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000700 t mylocalfunction1
0000000000000640 t register_tm_clones
$ nm libdefaultvisibility.so | grep ' T '
0000000000000780 T _fini
00000000000005b0 T _init
0000000000000720 T mylocalfunction2
0000000000000740 T mylocalfunction3
0000000000000760 T printMessage
Then only the explicitly hidden mylocalfunction1 remains local, and the
other three are now global.
You should not expect that a symbol marked with __attribute__ ((visibility("hidden")))
will be exported by a shared library in any circumstances. The attribute means precisely
that it will not be, whether it is applied explicitly to a symbol, as in this case,
or acquired by default in the presence of the compiler option -fvisibility=hidden.
If you want to export just that one function in the example by means of a visibility attribution
you would have:
#define FOR_EXPORT __attribute__ ((visibility("default")))
Then:
$ clang -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' T '
0000000000000720 T _fini
0000000000000550 T _init
00000000000006a0 T mylocalfunction1
It is global, because the explicit attribution overrides the command-line option,
and all your other functions are local. Perhaps confusingly, default visibility
is always public.
And you could accomplish this without resorting to visibility attributions - which are
not portable - simply by declaring all the functions that you don't want to export as static. Then the compiler
would not expose them to the linker in the first place:
foo.c
#include <stdio.h>
void mylocalfunction1(void)
{
printf("function1\n");
}
static void mylocalfunction2(void)
{
printf("function2\n");
}
static void mylocalfunction3(void)
{
printf("function3\n");
}
static void printMessage(void)
{
printf("Running the function exported from the shared library\n");
}
With which you again get:
$ clang -shared -fPIC -o libfoo.so foo.c
$ nm libfoo.so | grep ' T '
00000000000006c0 T _fini
0000000000000550 T _init
00000000000006a0 T mylocalfunction1
Although the distinction does not make itself felt in your example you
should understand that while a local/static symbol is not seen by the linker and (therefore) is unavailable for dynamic linkage, a global/external symbol
may or may not be available for dynamic linkage. visibility
controls the availability of global symbols for dynamic linkage, only.
According to GCC Wiki on Visibility, you should:
Use nm -C -D on the outputted DSO [Dynamic Shared Object] to compare before and after to see
the difference it makes.
As stated in the nm manual:
-D will display the dynamic symbols rather than the normal symbols
If I compile your code exactly as you did I get the following symbols:
$ nm -C -D libdefaultvisibility.so
0000000000200a68 B __bss_start
w __cxa_finalize
0000000000200a68 D _edata
0000000000200a70 B _end
00000000000006c8 T _fini
w __gmon_start__
0000000000000518 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
U puts
And if I compile it without the -fvisibility=hidden option I get the symbols:
$ nm -C -D libdefaultvisibility.so
0000000000200ae8 B __bss_start
w __cxa_finalize
0000000000200ae8 D _edata
0000000000200af0 B _end
0000000000000748 T _fini
w __gmon_start__
00000000000005a0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
0000000000000712 T mylocalfunction2
0000000000000724 T mylocalfunction3
0000000000000736 T printMessage
U puts
Wikipedia mentions that "the bss section typically includes all uninitialized variables declared at file scope." Given the following file:
int uninit;
int main() {
uninit = 1;
return 0;
}
When I compile this to an executable I see the bss segment filled properly:
$ gcc prog1.c -o prog1
$ size prog1
text data bss dec hex filename
1115 552 8 1675 68b prog1
However if I compile it as an object file I don't see the bss segment (I'd expect it to be 4):
$ gcc -c prog1.c
$ size prog1.o
text data bss dec hex filename
72 0 0 72 48 prog1.o
Is there something obvious I am missing?
I am using gcc version 4.8.1.
If we use readelf -s to look at the symbol table, we'll see:
$ readelf -s prog1.o
Symbol table '.symtab' contains 10 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS bss.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 6
6: 0000000000000000 0 SECTION LOCAL DEFAULT 7
7: 0000000000000000 0 SECTION LOCAL DEFAULT 5
8: 0000000000000004 4 OBJECT GLOBAL DEFAULT COM uninit <<<<
9: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 main
We see that your uninit symbol ("variable") is, at this stage, a "common" symbol. It has not yet been "assigned" to the BSS.
See this question for more information on "common" symbols: What does "COM" means in the Ndx column of the .symtab section?
Once your final executable is linked together, it will be put in the BSS as you expected.
You can bypass this behavior by passing the -fno-common flag to GCC:
$ gcc -fno-common -c bss.c
$ size bss.o
text data bss dec hex filename
72 0 4 76 4c bss.o
Alternatively, you could mark uninit as static. This way, the compiler will know that no other .o file can refer to it, so it will not be a "common" symbol. Instead, it will be placed into the BSS immediately, as you expected:
$ cat bss.c
static int uninit;
int main() {
uninit = 1;
return 0;
}
$ gcc -c bss.c
$ size bss.o
text data bss dec hex filename
72 0 4 76 4c bss.o
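To put the cases side by side, here is a small sketch of where file-scope variables typically end up in an object file. The variable and function names are made up for illustration, and the placements in the comments are the usual GCC behaviour on ELF targets rather than a guarantee. Note that GCC 10 and later default to -fno-common, so on a newer compiler the first case already lands in the BSS without any extra flag.

```c
/* No initializer (a tentative definition): a "common" symbol with
   -fcommon, placed in .bss with -fno-common. */
int tentative;

/* Explicit zero initializer: a full definition, normally placed in .bss. */
int zero_init = 0;

/* Non-zero initializer: placed in .data. */
int nonzero_init = 1;

/* Internal linkage: placed in .bss as a local symbol. */
static int file_local;

/* Reads every variable so that none of them is optimized away. */
int sum_all(void)
{
    return tentative + zero_init + nonzero_init + file_local;
}
```

Whichever section each variable ends up in, all of those without a non-zero initializer start out as 0 at run time; only the bookkeeping in the object file differs.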