Using GCC, how can I remove a symbol from a shared object after I've created the shared object? If I have three files in C manipulating symbol foo() like:
// a.c
int foo() { return 0xdead; }
int baz() { return 1; }
and
// b.c
int foo() { return 0xbeef; }
int bar() { return 0; }
and
// c.c
#include <stdio.h>
extern int foo();
extern int bar();
extern int baz();
int main() { printf("0x%x, 0x%x, 0x%x\n",foo(),bar(),baz()); return 0; }
Then I compile and run like:
% gcc a.c --shared -fPIC -o a.so
% gcc b.c --shared -fPIC -o b.so
% setenv LD_LIBRARY_PATH . # export LD_LIBRARY_PATH=. for bash systems
% gcc c.c a.so b.so -o c
% ./c
0xdead, 0x0, 0x1
How can I make it so that a.so no longer has symbol foo() after I've created a.so? I want the foo() defined in b.so to be used instead of a.so by deleting the foo() symbol from a.so. After foo() is deleted from a.so, rerunning c should generate a printout of:
0xbeef, 0x0, 0x1
In this toy example, I know I can simply reorder the library names when I compile c.c with a.so and b.so, but how can I actually delete the symbol from a.so? I imagine that after deleting foo() from a.so, this grep of the nm output would yield nothing:
nm -a a.so | grep foo
Whereas right now it returns:
000000000000063c T foo
You should be able to use the -N (--strip-symbol) option of objcopy to achieve what you want:
$ objcopy -N foo a.so
Symbol visibility is probably a better way to choose which functions are exported for runtime linking and which stay local to the library while keeping external linkage, so that the other .c files that make up the library can still call them. See http://gcc.gnu.org/wiki/Visibility .
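As a minimal sketch of the visibility approach, assuming GCC or Clang on an ELF platform, a.c could mark foo as hidden at build time instead of having it stripped afterwards. The LIB_LOCAL macro and the baz_plus_foo helper are hypothetical names used only for illustration:

```c
/* a.c (sketch): build with  gcc a.c -shared -fPIC -o a.so */

/* Hypothetical convenience macro; the attribute is a GCC/Clang extension. */
#define LIB_LOCAL __attribute__((visibility("hidden")))

/* Hidden: callable from other .c files linked into a.so,
   but not exported through the dynamic symbol table. */
LIB_LOCAL int foo(void) { return 0xdead; }

/* Default visibility: still exported from a.so. */
int baz(void) { return 1; }

/* Hypothetical helper: calls inside the library still reach the hidden foo. */
int baz_plus_foo(void) { return baz() + foo(); }
```

With a build like this, nm -D a.so should list baz but not foo, so c would have to take foo from another library such as b.so.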
Related
I have the following two files:
// t.c
#include<stdio.h>
extern int x;
int main(void)
{
printf("%d\n", x);
}
// tt.c
int x=4;
And then I compile it into two object files with:
$ gcc -c tt.c t.c
So now I have two object files, tt.o and t.o. When I do the following to build an executable:
$ gcc tt.o t.o -o out
How does the linker resolve the definition of x? Does it basically do a "two-pass" where it saves all global variables with external linkage first, and then does a lookup in each file that needs an external definition, or what's the process that happens to resolve those lookups?
I'm in a situation that's quite similar to the following. There's a libA.so that, depending on some compile-time flags, exhibits slightly different behaviour (it's an external lib, and I can't modify the source). Then I have libB.so, which depends on libA.so (compiled with, say, -DVALUE=1), and my executable depends both on libB.so and on libA.so, but compiled with -DVALUE=0. However, once I launch it, ld resolves all symbols against just one of the libA.so versions, so both my executable and libB.so end up using the same functions.
Is there any way to specify that I want to resolve the undefined symbols of libB.so using only its own dependencies? I've tried the -Wl,-Bgroup flag when building libB.so, but it didn't change anything. I know there's dlmopen, which can load the library in a new namespace, but I'd like to have it loaded automatically at startup.
I'm attaching a set of files that reproduce the behaviour:
libA.so:
#include <stdio.h>
#define _STR(x) #x
#define STR(x) _STR(x)
#ifndef VALUE
#define VALUE default
#endif
void func2() {
printf(STR(VALUE) "\n");
}
void func() {
func2();
}
libB.so:
#include <stdio.h>
extern void func(void);
void b_func() {
func();
}
executable:
#include <stdio.h>
extern void b_func(void);
extern void func(void);
int main() {
func(); // should print "default"
b_func(); // should print "other"
}
build commands:
gcc -fPIC -shared A.c -o libA.so
gcc -fPIC -shared -DVALUE=other A.c -o libA2.so
gcc -fPIC -shared B.c -L. -lA2 -o libB.so
gcc main.c -L. -lA -lB -o main
Curiously, it all works fine on OS X.
I am building a shared library in C that is dynamically loaded by a program that I do not have source access to. The target platform is a 64-bit Linux platform and we're using gcc to build. I was able to build a reproduction of the issue in ~100 lines, but it's still a bit to read. Hopefully it's illustrative.
The core issue is that I have two non-static functions (bar and baz) defined in my shared library. Both need to be non-static because we expect the caller to be able to dlsym them. Additionally, baz calls bar. The program that uses my library also has a function named bar. That wouldn't normally be an issue, but the calling program is compiled with -rdynamic because it has a function foo that needs to be called from my shared library. As a result, my shared library ends up bound to the calling program's version of bar at runtime, producing unintuitive results.
In an ideal world I would be able to include some command line switch when compiling my shared library that would prevent this from happening.
The current solution I have is to rename my non-static functions as funname_local and declare them static. I then define a new function:
funname() { return funname_local(); }, and change any references to funname in my shared library to funname_local. This works, but it feels cumbersome, and I'd much prefer to just tell the linker to prefer symbols defined in the local compilation unit.
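The workaround described above might look roughly like this for bar and baz. The return values are placeholders so the sketch is easy to check, and bar_local is the hypothetical name of the renamed function:

```c
/* shared.c (sketch of the static-plus-wrapper workaround) */

/* The real implementation is static, so calls to it are bound within this
   file and cannot be interposed by a bar defined in the main program. */
static int bar_local(void) { return 42; }   /* placeholder body */

/* Exported wrapper so callers can still dlsym(handle, "bar"). */
int bar(void) { return bar_local(); }

/* Internal callers use bar_local directly, never the exported bar. */
int baz(void) { return bar_local() + 1; }
```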
internal.c
#include <stdio.h>
#include "internal.h"
void
bar(void)
{
printf("I should only be callable from the main program\n");
}
internal.h
#if !defined(__INTERNAL__)
#define __INTERNAL__
void
bar(void);
#endif /* defined(__INTERNAL__) */
main.c
#include <dlfcn.h>
#include <stdio.h>
#include "internal.h"
void
foo(void)
{
printf("It's important that I am callable from both main and from any .so "
"that we dlopen, that's why we compile with -rdynamic\n");
}
int
main()
{
void *handle;
void (*fun1)(void);
void (*fun2)(void);
char *error;
if(NULL == (handle = dlopen("./shared.so", RTLD_NOW))) { /* Open library */
fprintf(stderr, "dlopen: %s\n", dlerror());
return 1;
}
dlerror(); /* Clear any existing error */
*(void **)(&fun1) = dlsym(handle, "baz"); /* Get function pointer */
if(NULL != (error = dlerror())) {
fprintf(stderr, "dlsym: %s\n", error);
dlclose(handle);
return 1;
}
*(void **)(&fun2) = dlsym(handle, "bar"); /* Get function pointer */
if(NULL != (error = dlerror())) {
fprintf(stderr, "dlsym: %s\n", error);
dlclose(handle);
return 1;
}
printf("main:\n");
foo();
bar();
fun1();
fun2();
dlclose(handle);
return 0;
}
main.h
#if !defined(__MAIN__)
#define __MAIN__
extern void
foo(void);
#endif /* defined(__MAIN__) */
shared.c
#include <stdio.h>
#include "main.h"
void
bar(void)
{
printf("bar:\n");
printf("It's important that I'm callable from a program that loads shared.so"
" as well as from other functions in shared.so\n");
}
void
baz(void)
{
printf("baz:\n");
foo();
bar();
return;
}
compile:
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -o main main.c internal.c -l dl -rdynamic
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -shared -fPIC -o shared.so shared.c
run:
$ ./main
main:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
baz:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
Have you tried the -Bsymbolic linker option (or -Bsymbolic-functions)? Quoting the ld man page:
-Bsymbolic
When creating a shared library, bind references to global symbols to the definition within the shared library, if any. Normally, it is possible for a program linked against a shared library to override the definition within the shared library. This option can also be used with the --export-dynamic option, when creating a position independent executable, to bind references to global symbols to the definition within the executable. This option is only meaningful on ELF platforms which support shared libraries and position independent executables.
It seems to solve the problem. Here is the output first without the option, then with shared.so rebuilt using -Wl,-Bsymbolic:
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -shared -fPIC -o shared.so shared.c
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -o main main.c internal.c -l dl -rdynamic
$ ./main
main:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
baz:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -shared -fPIC -Wl,-Bsymbolic -o shared.so shared.c
$ ./main
main:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
baz:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
A common solution to this problem is to not depend on a global symbol not being overridden. Instead, do the following:
Name the function bar in your library mylib_bar or something like that
Hide mylib_bar with __attribute__((visibility("hidden"))) or similar
Make bar a weak symbol, referring to mylib_bar like this:
#pragma weak bar = mylib_bar
Make your library call mylib_bar everywhere instead of bar
Now everything works as expected:
When your library calls mylib_bar, this always refers to the definition in the library as the visibility is hidden.
When other code calls bar, this calls mylib_bar by default.
If someone defines his own bar, this overrides bar but not mylib_bar, leaving your library untouched.
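The steps above can be sketched as follows, assuming GCC or Clang on an ELF platform. This sketch uses the attribute spelling of a weak alias rather than the #pragma weak form shown above, and the return values and the baz caller are placeholders for illustration:

```c
/* mylib.c (sketch): GCC or Clang on an ELF platform */

/* Real implementation under a library-private name. Hidden visibility
   keeps it out of the dynamic symbol table, so it cannot be interposed. */
__attribute__((visibility("hidden")))
int mylib_bar(void) { return 7; }           /* placeholder body */

/* Exported name bar, declared as a weak alias of mylib_bar.
   Same intent as:  #pragma weak bar = mylib_bar */
int bar(void) __attribute__((weak, alias("mylib_bar")));

/* Code inside the library calls mylib_bar, never bar. */
int baz(void) { return mylib_bar() + 1; }
```

If another module defines its own bar, outside callers see that definition, while baz keeps calling mylib_bar.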
We have a program that links in a number of static libraries, which may or may not define a number of symbols depending on compilation options. On OS X, we use dlsym(3) with a NULL handle to obtain the symbol addresses. However, on Linux, dlsym(3) always returns NULL.
Consider a trivial program (sources below) that links in a static library containing a function and a variable and tries to print their addresses. We can check that the program contains the symbols:
$ nm -C test | grep "test\(func\|var\)"
0000000000400715 T testFunc
0000000000601050 B testVar
However, when the program is run, neither can be located:
$ ./test
testVar: (nil)
testFunc: (nil)
Is what we are trying to do possible on Linux, using glibc's implementation of dlsym(3)?
Makefile
(Sorry about the spaces)
LDFLAGS=-L.
LDLIBS=-Wl,--whole-archive -ltest -Wl,--no-whole-archive -ldl
libtest.o: libtest.c libtest.h
libtest.a: libtest.o
test: test.o libtest.a
clean:
-rm -f test test.o libtest.o libtest.a
libtest.h
#pragma once
extern void *testVar;
extern int testFunc(int);
libtest.c
#include "libtest.h"
void *testVar;
int testFunc(int x) { return x + 42; }
test.c
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
int main(int argc, char *argv[]) {
void *handle = dlopen(NULL, 0);
void *symbol = dlsym(handle, "testVar");
printf("testVar: %p\n", symbol);
symbol = dlsym(handle, "testFunc");
printf("testFunc: %p\n", symbol);
return 0;
}
You should link your program with -rdynamic (or --export-dynamic when invoking ld(1) directly), so:
LDFLAGS += -rdynamic -L.
Then all the global symbols are placed in the dynamic symbol table, which is the one dlsym searches.
BTW, the visibility attribute could be of interest.
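A small sketch of the lookup side, assuming glibc on Linux. It opens the main program with dlopen(NULL, RTLD_NOW), since the dlopen man page asks for RTLD_LAZY or RTLD_NOW in the flags, and reports whether a name is reachable through dlsym. The function name is_dynamically_visible is hypothetical:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if `name` can be found through the main program's handle,
   0 otherwise. Older glibc versions need -ldl at link time. */
int is_dynamically_visible(const char *name)
{
    void *self = dlopen(NULL, RTLD_NOW);   /* handle for the main program */
    void *sym;

    if (self == NULL)
        return 0;

    sym = dlsym(self, name);               /* consults dynamic symbol tables only */
    dlclose(self);
    return sym != NULL;
}
```

Called as is_dynamically_visible("testFunc"), this would be expected to return 1 only when test is linked with -rdynamic. nm lists testFunc either way because by default it reads the regular symbol table, not the dynamic one.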
In C, declaring a variable static at the global level (outside any function) indicates that it is visible only to that linker object (typically, that .C file).
If the same .C file is part of multiple different libraries that are then linked together in a single executable, do conflicts arise?
For example:
MyFile.c
typedef struct {
[my important data];
} MyGlobalType;
static MyGlobalType globalData = { [...data...] };
Then if I have:
Plugin_Alpha.so: composed of MyFile.C + AlphaSource.C
Plugin_Beta.so: composed of MyFile.C + BetaSource.C
MainProgram.exe: composed of MainCode.C (which loads the two plugins)
Will Plugin_Alpha and Plugin_Beta have separate, isolated copies of globalData?
Or will they end up referring to the same structure?
Well, here's one way to find out:
File liba.c:
static int globalData;
int *GetGlobalData() { return &globalData; }
Compile into two separate shared libraries:
$ gcc liba.c -o liba.so -fPIC -shared
$ gcc liba.c -o libb.so -fPIC -shared
Main program:
#include <dlfcn.h>
#include <stdio.h>
int main(void)
{
// Error checking omitted for expository purposes
void *liba = dlopen("liba.so", RTLD_LAZY);
void *libb = dlopen("libb.so", RTLD_LAZY);
typedef int* (*FuncV_IP)(void);
FuncV_IP funca = (FuncV_IP)dlsym(liba, "GetGlobalData");
FuncV_IP funcb = (FuncV_IP)dlsym(libb, "GetGlobalData");
printf("Module A: GetGlobalData() ==> %p\n", (void *)funca());
printf("Module B: GetGlobalData() ==> %p\n", (void *)funcb());
dlclose(liba);
dlclose(libb);
return 0;
}
Compile and run it:
$ gcc main.c -ldl
$ LD_LIBRARY_PATH=. ./a.out
Output:
Module A: GetGlobalData() ==> 0x7fa97536d020
Module B: GetGlobalData() ==> 0x7fa97516b020
The two addresses differ, so each shared library gets its own copy of the static variable.