weak linking in shared object not working as expected

weak linking in shared object not working as expected - c

I'm trying to use the cmocka unit test framework which suggests to use weak linking to be able to select a user-defined implementation over the actual implementation of a function. In my environment I have a shared object which I want to unit test. I've implemented the unit tests in a separate file which I compile and link to the shared object. My problem is that calling a function bar in the shared object which in turn calls a function foo in that shared object always leads to the real implementation of foo and not the custom one. I've created a simplified implementation of the shared object and of the unit test.
The shared library, a.c:
#include <stdio.h>
void foo(void); __attribute__((weak))
void bar(void); __attribute__((weak))
void foo(void) {
printf("called real foo\n");
}
void bar(void) {
printf("called real bar calling\n");
foo();
}
The unit test, b.c:
#include <stdio.h>
#include <stdbool.h>
bool orig_foo;
bool orig_bar;
void __wrap_foo(void) {
printf("in foo wrapper\n");
if (orig_foo)
__real_foo();
else
printf("called wrapped foo\n");
}
void __wrap_bar() {
printf("in bar wrapper\n");
if (orig_bar)
__real_bar();
else
printf("called wrapped bar\n");
}
int main(void) {
orig_bar = true;
orig_foo = false;
printf("calling foo from main\n");
foo();
printf("\n");
printf("calling bar from main\n");
bar();
return 0;
}
And finally, the Makefile:
all: a.out
a.out: b.c a.so
gcc -Wall b.c a.so -Wl,--wrap=foo -Wl,--wrap=bar
a.so: a.c
gcc -Wall -c a.c -shared -o a.so
clean:
rm -f a.so a.out
Running a.out produces the following output:
# ./a.out
calling foo from main
in foo wrapper
called wrapped foo
calling bar from main
in bar wrapper
called real bar
called real foo
From main, the direct call to foo results in __wrap_foo being called, as expected.
Next, I call bar from main which correctly results in __wrap_bar being called, where I redirect the call to the real implementation of bar (__real_bar). bar then calls foo but the real implementation is used, not the wrapped one. Why isn't the wrapped implementation of foo called in this case? It looks like the issue is related to from where the function call originates.
In function bar, if I replaced the called to foo with __wrap_foo I do get the expected behaviour however I don't think that this is an elegant solution.
I've managed to bypass this problem using normal linking and dlopen(3) and friends however I'm curious as to why weak linking isn't working in my case.

Ld's --wrap works by replacing calls to real functions by calls to their wrappers only in files which are linked with this flag. Your library isn't, so it just does plain foo and bar calls. So they get resolved to where they are implemented (a.so).
As for weak symbols, dynamic linker ignores their weakness and treats them as normal symbols (unless you run with LD_DYNAMIC_WEAK which isn't recommended).
In your particular case (overriding symbols in shared library) you'd probably need to apply --wrap to a.so as well. Or - you could use standard symbol interposition. Say if library-under-test is in libsut.so and you want to replace some functions with custom implementations in libstub.so, link them to driver program in a proper order:
LDFLAGS += -lstub -lsut
Then definitions in libstub.so will prevail over libsut.so ones.

In bar() the linker does not store a reference to foo() but to an offset in the text section of the translation unit. Therefore the name is lost.
The "weak" attribute does not help here.
Also it does not help to use -ffunction_sections because the reference will be to an offset of the section of foo().
One straight forward way to get the desired result is to separate all functions in their own translation unit. You don't need separated source files for this, some conditional compilation will also help. But it makes the source ugly.
You might like to look into my answer to the question "Rename a function without changing its references", too.

One error is that attribute syntax is not correct. Correct ones:
void foo(void) __attribute__((weak));
void bar(void) __attribute__((weak));
An alternative solution would be the standard Linux function interposition:
// a.c
#include <stdio.h>
void foo(void) {
printf("called real foo\n");
}
void bar(void) {
printf("called real bar calling\n");
foo();
}
// b.c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdbool.h>
bool orig_foo;
bool orig_bar;
void foo(void) {
printf("in foo wrapper\n");
static void(*__real_foo)(void);
if(!__real_foo)
__real_foo = dlsym(RTLD_NEXT, "foo");
if (orig_foo)
__real_foo();
else
printf("called wrapped foo\n");
}
void bar() {
printf("in bar wrapper\n");
static void(*__real_bar)(void);
if(!__real_bar)
__real_bar = dlsym(RTLD_NEXT, "bar");
if (orig_bar)
__real_bar();
else
printf("called wrapped bar\n");
}
int main(void) {
orig_bar = true;
orig_foo = false;
printf("calling foo from main\n");
foo();
printf("\n");
printf("calling bar from main\n");
bar();
return 0;
}
$ gcc -o a.so -shared -Wall -Wextra -fPIC a.c
$ gcc -o b -Wall -Wextra b.c -L. -l:a.so -Wl,-rpath='$ORIGIN' -ldl
$ ./b
calling foo from main
in foo wrapper
called wrapped foo
calling bar from main
in bar wrapper
called real bar calling
in foo wrapper
called wrapped foo

Related

What's the difference between declaring a function static and not including it in a header? [duplicate]

This question already has answers here:
What is a "static" function in C?
(11 answers)
Closed 2 years ago.
What's the difference between the following scenarios?
// some_file.c
#include "some_file.h" // doesn't declare some_func
int some_func(int i) {
return i * 5;
}
// ...
and
// some_file.c
#include "some_file.h" // doesn't declare some_func
static int some_func(int i) {
return i * 5;
}
// ...
If all static does to functions is restrict their accessibility to their file, then don't both scenarios mean some_func(int i) can only be accessed from some_file.c since in neither scenario is some_func(int i) put in a header file?

A static function is "local" to the .c file where it is declared in. So you can have another function (static or nor) in another .c file without having a name collision.
If you have a non static function in a .c file that is not declared in any header file, you cannot call this function from another .c file, but you also cannot have another function with the same name in another .c file because this would cause a name collision.
Conclusion: all purely local functions (functions that are only used inside a .c function such as local helper functions) should be declared static in order to prevent the pollution of the name space.
Example of correct usage:
file1.c
static void LocalHelper()
{
}
...
file2.c
static void LocalHelper()
{
}
...
Example of semi correct usage
file1.c
static LocalHelper() // function is local to file1.c
{
}
...
file2.c
void LocalHelper() // global functio
{
}
...
file3.c
void Foo()
{
LocalHelper(); // will call LocalHelper from file2.c
}
...
In this case the program will link correctly, even if LocalHelpershould have been static in file2.c
Example of incorrect usage
file1.c
LocalHelper() // global function
{
}
...
file2.c
void LocalHelper() // global function
{
}
...
file3.c
void Foo()
{
LocalHelper(); // which LocalHelper should be called?
}
...
In this last case we hava a nema collition and the program wil not even link.

The difference is that with a non-static function it can still be declared in some other translation unit (header files are irrelevant to this point) and called. A static function is simply not visible from any other translation unit.
It is even legal to declare a function inside another function:
foo.c:
void foo()
{
void bar();
bar();
}
bar.c:
void bar()
{ ... }

What's the difference between declaring a function static and not including it in a header?
Read more about the C programming language (e.g. Modern C), the C11 standard n1570. Read also about linkers and loaders.
On POSIX systems, notably Linux, read about dlopen(3), dlsym(3), etc...
Practically speaking:
If you declare a function static it stays invisible to other translation units (concretely your *.c source files, with the way you compile them: you could -at least in principle, even if it is confusing- compile foo.c twice with GCC on Linux: once as gcc -c -O -DOPTION=1 foo.c -o foo1.o and another time with gcc -c -O -DOPTION=2 -DWITHOUT_MAIN foo.c -o foo2.o and in some cases be able to link both foo1.o and foo2.o object files into a single executable foo using gcc foo1.o foo2.o -o foo)
If you don't declare it static, you could (even if it is poor taste) code in some other translation unit something like:
if (x > 2) {
extern int some_func(int); // extern is here for readability
return some_func(x);
}
In practice, I have the habit of naming all my C functions (including static ones) with a unique name, and I generally have a naming convention related to them (e.g. naming them with a common prefix, like GTK does). This makes debugging the program (with GDB) easier (since GDB has autocompletion function names).
At last, a good optimizing compiler could, and practically would often, inline calls to static functions whose body is known. With a recent GCC, compile your code with gcc -O2 -Wall. If you want to check how inlining happened, look into the produced assembler (using gcc -O2 -S -fverbose-asm).
On Linux, you can obtain using dlsym the address of a function which is not declared static.

Is it possible to declare a weak function by passing an argument to gcc?

We can declare weak function by using __attribute__((weak)) in C code files. I wonder if there exists a way to declare this during compile time from gcc and not write anything in the code files?
For e.g.
File: foo.h
int foo();
File: foo.c
#include<stdio.h>
int foo(){
printf("foo called from file\n");
return 1;
}
File: main.c
#include<stdio.h>
#include"foo.h"
int foo(){
printf("foo called from main");
return 1;
}
int main(){
foo();
return 0;
}
Is it possible to compile above code and export foo as weak from command line?
E.g. gcc --weak=foo.c:foo foo.c main.c
./a.out produces foo called from main.
I know that writing__attribute__((weak)) above foo() declaration in foo.c will call foo() in main.
The blog:
blog.microjoe.org/2017/unit-tests-c-cmocka-coverage-cmake.html
says that it is possible to do so....
There are two ways of declaring a weak symbol:
By passing an argument to GCC, telling it to export the symbol of this function as a weak symbol.
By putting a attribute((weak)) annotation before the function implementation.

I would say no, there is no such option.
This sounds a little bit like an XY-problem, perhaps you should state more clearly what problem you are trying to solve, instead of which solution you want to make work.
As an aside, your example would not work, since you're providing two definitions and not saying which one should be considered weak. It would have to be
# Remember this doesn't really work!
$ gcc --weak=foo.c:foo foo.c main.c
or something, i.e. you need to indicate in which file the weak definition resides.

How can I have the symbols in a shared library override existing symbols?

I want to load a shared library with dlopen and have the symbols in it available without having to individually grab function pointers to them with dlsym. The man page says that the RTLD_DEEPBIND flag will place lookup of symbols in the library ahead of global scope, but evidently this does not mean it overrides existing symbols because this does not work. Consider this example:
main.c:
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
int is_loaded(){return 0;}
int main(){
void *h = dlopen("./libimplementation.so", RTLD_NOW | RTLD_DEEPBIND);
if(!h){
printf("Could not load implementation: %s\n", dlerror());
return 1;
}
puts(is_loaded() ? "Implementation loaded" : "Implementation not loaded");
dlclose(h);
}
implementation.c:
int is_loaded(){return 1;}
Makefile:
all: main libimplementation.so
main: main.c
gcc -Wall -std=c99 -o $# $^ -ldl
lib%.so: %.c
gcc -Wall -std=c99 -o $# $^ -shared
clean:
-rm main *.so
When I build and run with make and ./main, I expect the test() function from libimplementation.so to override the test() function from main but it doesn't. I know I could also move all of the code in main() into another shared library run and then have main() dlopen libimplementation.so with RTLD_GLOBAL and then have librun.so refer to the symbols from libimplementation.so without having them defined so it loads them:
modified main.c:
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
int main(){
void *impl_h = dlopen("./libimplementation.so", RTLD_LAZY | RTLD_GLOBAL);
if(!impl_h){
printf("Could not load implementation: %s\n", dlerror());
return 1;
}
void *run_h = dlopen("./librun.so", RTLD_LAZY);
if(!run_h){
printf("Could not load run: %s\n", dlerror());
dlclose(impl_h);
return 1;
}
void (*run)(void);
*(void**)&run = dlsym(run_h, "run");
if(!*(void**)&run){
printf("Could not find entry point in run: %s\n", dlerror());
dlclose(impl_h);
dlclose(run_h);
return 1;
}
run();
dlclose(impl_h);
dlclose(run_h);
}
run.c:
#include <stdio.h>
int is_loaded(void);
void run(void){
puts(is_loaded() ? "Implementation loaded" : "Implementation not loaded");
}
and the Makefile gets librun.so added as a prerequisite for all.
Is there a way to get the symbols from the shared library available all at once without dlsym or putting the actual code in another shared library like with librun.so?

There is fundamentally no way to do what you're asking for. Imagine the main program had something like:
static char *myptr = array_in_lib1;
Later, at the time you dlopen, myptr has some other value. Has the program just changed the variable to point to a different object? Or has it been incremented to point to some element later in the array - in which case, would you want it adjusted to account for the redefinition of array_in_lib1 with a new definition from the newly-opened library? Or is it just a random integer cast to char *? Deciding how to treat it is impossible without understanding programmer intent and full process history of how it arrived in the current state.
The above is a particulrly egregious sort of example I've constructed, but the idea of symbols changing definition at runtime is fundamentally inconsistent in all sorts of ways. Even RTLD_DEEPBIND, in what it already does, is arguably inconsitent and buggy. Whatever you're trying to do, you should find another way to do it.

How can a shared library (.so) call a function that is implemented in its loader code?

I have a shared library that I implemented and want the .so to call a function that's implemented in the main program which loads the library.
Let's say I have main.c (executable) which contains:
void inmain_function(void*);
dlopen("libmy.so");
In the my.c (the code for the libmy.so) I want to call inmain_function:
inmain_function(NULL);
How can the shared library call inmain_function regardless the fact inmain_function is defined in the main program.
Note: I want to call a symbol in main.c from my.c not vice versa which is the common usage.

You have two options, from which you can choose:
Option 1: export all symbols from your executable.
This is simple option, just when building executable, add a flag -Wl,--export-dynamic. This would make all functions available to library calls.
Option 2: create an export symbol file with list of functions, and use -Wl,--dynamic-list=exported.txt. This requires some maintenance, but more accurate.
To demonstrate: simple executable and dynamically loaded library.
#include <stdio.h>
#include <dlfcn.h>
void exported_callback() /*< Function we want to export */
{
printf("Hello from callback!\n");
}
void unexported_callback() /*< Function we don't want to export */
{
printf("Hello from unexported callback!\n");
}
typedef void (*lib_func)();
int call_library()
{
void *handle = NULL;
lib_func func = NULL;
handle = dlopen("./libprog.so", RTLD_NOW | RTLD_GLOBAL);
if (handle == NULL)
{
fprintf(stderr, "Unable to open lib: %s\n", dlerror());
return -1;
}
func = dlsym(handle, "library_function");
if (func == NULL) {
fprintf(stderr, "Unable to get symbol\n");
return -1;
}
func();
return 0;
}
int main(int argc, const char *argv[])
{
printf("Hello from main!\n");
call_library();
return 0;
}
Library code (lib.c):
#include <stdio.h>
int exported_callback();
int library_function()
{
printf("Hello from library!\n");
exported_callback();
/* unexported_callback(); */ /*< This one will not be exported in the second case */
return 0;
}
So, first build the library (this step doesn't differ):
gcc -shared -fPIC lib.c -o libprog.so
Now build executable with all symbols exported:
gcc -Wl,--export-dynamic main.c -o prog.exe -ldl
Run example:
$ ./prog.exe
Hello from main!
Hello from library!
Hello from callback!
Symbols exported:
$ objdump -e prog.exe -T | grep callback
00000000004009f4 g DF .text 0000000000000015 Base exported_callback
0000000000400a09 g DF .text 0000000000000015 Base unexported_callback
Now with the exported list (exported.txt):
{
extern "C"
{
exported_callback;
};
};
Build & check visible symbols:
$ gcc -Wl,--dynamic-list=./exported.txt main.c -o prog.exe -ldl
$ objdump -e prog.exe -T | grep callback
0000000000400774 g DF .text 0000000000000015 Base exported_callback

You'll need make a register function in your .so so that the executable can give a function pointer to your .so for it's later used.
Like this:
void in_main_func () {
// this is the function that need to be called from a .so
}
void (*register_function)(void(*)());
void *handle = dlopen("libmylib.so");
register_function = dlsym(handle, "register_function");
register_function(in_main_func);
the register_function needs to store the function pointer in a variable in the .so where the other function in the .so can find it.
Your mylib.c would the need to look something like this:
void (*callback)() = NULL;
void register_function( void (*in_main_func)())
{
callback = in_main_func;
}
void function_needing_callback()
{
callback();
}

Put your main function's prototype in a .h file and include it in both your main and dynamic library code.
With GCC, simply compile your main program with the -rdynamic flag.
Once loaded, your library will be able to call the function from the main program.
A little further explanation is that once compiled, your dynamic library will have an undefined symbol in it for the function that is in the main code. Upon having your main app load the library, the symbol will be resolved by the main program's symbol table. I've used the above pattern numerous times and it works like a charm.

The following can be used to load a dynamic library and call it from the loading call (in case somebody came here after looking for how to load and call a function in an .so library):
void* func_handle = dlopen ("my.so", RTLD_LAZY); /* open a handle to your library */
void (*ptr)() = dlsym (func_handle, "my_function"); /* get the address of the function you want to call */
ptr(); /* call it */
dlclose (func_handle); /* close the handle */
Don't forget to put #include <dlfcn.h> and link with the –ldl option.
You might also want to add some logic that checks if NULL is returned. If it is the case you can call dlerror and it should give you some meaningful messages describing the problem.
Other posters have however provided more suitable answers for your problem.

Allowing dynamically loaded libraries in C to "publish" functions for use

I'm writing a program in C which allows users to implement custom "functions" to be run by an interpreter of sorts. I also want to allow users to write these custom functions in plain C, and then be loaded in dynamically. In order to do this, I created two structs, one for interpreted functions, and one for native ones.
Here's a simplified example:
struct func_lang {
bool is_native;
char* identifier;
// various other properties
}
typedef void (*func_native_ptr)(int);
struct func_native {
bool is_native;
char* identifier;
func_native_ptr func_ptr;
}
I then use the identifier property of each struct to put them in a hashtable, which I can then use to execute them at runtime.
My problem is actually from the other end, the loaded libraries. I'd like to be able to allow the loaded libraries to "publish" which functions they want inserted into the list. For example, maybe with a function like this:
void register_func_native(char* identifer, func_native_ptr func_ptr);
Then, when I would call an init function from the library, they could call this function to insert the functions into the hashtable.
Will this work? I'm a little confused about how the register_func_native function would be linked, since it's needed by the loaded library, but would have to be defined by the loader itself. Does my loader function need to be implemented in another shared library, which could then be linked at runtime?

This will depend on the platform, but on Linux and all the Unixes I've seen, this will work. Example:
$ cat dll.c
void print(char const *);
void foo()
{
print("Hello!");
}
$ gcc -Wall -shared -fPIC -o dll.so dll.c
$ cat main.c
#include <dlfcn.h>
#include <stdio.h>
void print(char const *msg)
{
puts(msg);
}
int main()
{
void *lib = dlopen("./dll.so", RTLD_NOW);
void (*foo)() = (void (*)())dlsym(lib, "foo");
foo();
return 0;
}
$ cc -fPIC -rdynamic main.c -ldl
$ ./a.out
Hello!

It will work. The shared library sees all the global symbols you have in your main program.
In the main program you'll need dlopen() and dlsym() to call the initialization function.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

weak linking in shared object not working as expected - c

Related

What's the difference between declaring a function static and not including it in a header? [duplicate]

Is it possible to declare a weak function by passing an argument to gcc?

How can I have the symbols in a shared library override existing symbols?

How can a shared library (.so) call a function that is implemented in its loader code?

Allowing dynamically loaded libraries in C to "publish" functions for use

Categories

Resources