Building C dynamic shared library gives undefined symbol

Building C dynamic shared library gives undefined symbol - c

Why building static library (.a) doesn't give any error and works correctly:
$(LIBRARY): assertion.o
$(AR) $(OUTPUT_STATIC_LIB_DIR)/$(LIBRARY) $(OUTPUT_DIR)/assertion.o
Simultaneously, when building shared library (.so) gives me such error:
$(SHARED_LIBRARY): assertion.o
$(CC) $(CFLAGS) -shared -o $(OUTPUT_LIB_DIR)/$(SHARED_LIBRARY) $(OUTPUT_DIR)/assertion.o
Error message:
Undefined symbols for architecture x86_64:
"_float_cmp_func", referenced from:

The code of your library does refer to "_float_cmp_func", which needs to be found at run-time.
But static library is not expected to be a sufficient binary module, it is just collection of object code that is designed to be included into a later build/link steps (together with other object code and libraries).
In contrast, shared library is "ready-to-use" binary module, so its dependencies should be resolved at link stage. So in this case you should add into your link step some module(s) where "_float_cmp_func" is implemented

Related

On linking of shared libraries, are they really final, and if so, why?

I am trying to understand more about linking and shared library.
Ultimately, I wonder if it's possible to add a method to a shared library. For instance, suppose one has a source file a.c, and a library lib.so (without the source file). Let's furthermore assume, for simplicity, that a.c declares a single method, whose name is not present in lib.so. I thought maybe it might be possible to, at linking time, link a.o to lib.so while instructing to create newLib.so, and forcing the linker to export all methods/variable in lib.so to that the newLib.so is now basically lib.so with the added method from a.so.
More generally, if one has some source file depending on a shared library, can one create a single output file (library or executable) that is not dependent on the shared library anymore ? (That is, all the relevant methods/variable from the library would have been exported/linked/inlined to the new executable, hence making the dependency void). If that's not possible, what is technically preventing it ?
A somehow similar question has been asked here: Merge multiple .so shared libraries.
One of the reply includes the following text: "If you have access to either source or object files for both libraries, it is straightforward to compile/link a combined SO from them.: without explaining the technical details. Was it a mistake or does it hold ? If so, how to do it ?

Once you have a shared library libfoo.so the only ways you can use it
in the linkage of anything else are:-
Link a program that dynamically depends on it, e.g.
$ gcc -o prog bar.o ... -lfoo
Or, link another shared library that dynamically depends on it, e.g.
$ gcc -shared -o libbar.so bar.o ... -lfoo
In either case the product of the linkage, prog or libbar.so
acquires a dynamic dependency on libfoo.so. This means that prog|libfoo.so
has information inscribed in it by the linker that instructs the
OS loader, at runtime, to find libfoo.so, load it into the
address space of the current process and bind the program's references to libfoo's exported symbols to
the addresses of their definitions.
So libfoo.so must continue to exist as well as prog|libbar.so.
It is not possible to link libfoo.so with prog|libbar.so in
such a way that libfoo.so is physically merged into prog|libbar.so
and is no longer a runtime dependency.
It doesn't matter whether or not you have the source code of the
other linkage input files - bar.o ... - that depend on libfoo.so. The
only kind of linkage you can do with a shared library is dynamic linkage.
This is in complete contrast with the linkage of a static library
You wonder about the statement in this this answer where it says:
If you have access to either source or object files for both libraries, it is straightforward to compile/link a combined SO from them.
The author is just observing that if I have source files
foo_a.c foo_b.c... bar_a.c bar_b.c
which I compile to the corresponding object files:
foo_a.o foo_b.o... bar_a.o bar_b.o...
or if I simply have those object files. Then as well as - or instead of - linking them into two shared libraries:
$ gcc -shared -o libfoo.so foo_a.o foo_b.o...
$ gcc -shared -o libbar.so bar_a.o bar_b.o...
I could link them into one:
$ gcc -shared -o libfoobar.so foo_a.o foo_b.o... bar_a.o bar_b.o...
which would have no dependency on libfoo.so or libbar.so even if they exist.
And although that could be straightforward it could also be false. If there is
any symbol name that is globally defined in any of foo_a.o foo_b.o... and
also globally defined in any of bar_a.o bar_b.o... then it will not matter
to the linkage of either libfoo.so or libbar.so (and it need not be dynamically
exported by either of them). But the linkage of libfoobar.so will fail for
multiple definition of name.
If we build a shared library libbar.so that depends on libfoo.so and has
itself been linked with libfoo.so:
$ gcc -shared -o libbar.so bar.o ... -lfoo
and we then want to link a program with libbar.so, we can do that in such a way
that we don't need to mention its dependency libfoo.so:
$ gcc -o prog main.o ... -lbar -Wl,-rpath=<path/to/libfoo.so>
See this answer to follow that up. But
this doesn't change the fact that libbar.so has a runtime dependency on libfoo.so.
If that's not possible, what is technically preventing it?
What technically prevents linking a shared library with some program
or shared library targ in a way that physically merges it into targ is that a
shared library (like a program) is not the sort of thing that a linker knows
how to physically merge into its output file.
Input files that the linker can physically merge into targ need to
have structural properties that guide the linker in doing that merging. That is the structure of object files.
They consist of named input sections of object code or data that are tagged with various attributes.
Roughly speaking, the linker cuts up the object files into their sections and distributes them into
output sections of the output file according to their attributes, and makes
binary modifications to the merged result to resolve static symbol references
or enable the OS loader to resolve dynamic ones at runtime.
This is not a reversible process. The linker can't consume a program or
shared library and reconstruct the object files from which it was made to
merge them again into something else.
But that's really beside the point. When input files are physically
merged into targ, that is called static linkage.
When input files are just externally referenced in targ to
make the OS loader map them into a process it has launched for targ,
that is called dynamic linkage. Technical development has given us
a file-format solution to each of these needs: object files for static linkage, shared libraries
for dynamic linkage. Neither can be used for the purpose of the other.

Linking a dynamic library that links in symbols from a static library: macOS vs Linux

I am porting a Linux application to macOS and there is a difference in the linking behavior that took some of my time to reveal itself. The project uses a two-stage CMake-based build process: one CMake tree creates a dynamic library that links to static library that is created in the second tree that is created later. The static library does not exist yet when the dynamic library is created. This works on Linux: dynamic library gets created with symbols from static library and they are forward-declared. When the second tree is built, the dynamic library gets linked to an executable which also links the static library and this way everything works fine. This does not work on macOS because in the first CMake tree, the compiler fails at the dynamic library's linking step because the static library from the second tree does not exist yet.
I have reduced my application to a minimal example (the code can be found at the end of my question).
The setup is as follows:
Minimal C program with a main() function
Dynamic library with one function
Static library with one function
The program calls a function from the dynamic library. The dynamic library is of course linked dynamically to the program.
The dynamic library calls a function from the static library. The static library is linked statically to the dynamic library.
If we stop linking the static library to dynamic library:
# This target_link_libraries is intentionally commented out.
#target_link_libraries(dynamic_lib static_lib)
we, of course, get errors when building a program. But the errors are different on macOS and Linux:
macOS/clang fails earlier at the step when the dynamic library gets linked vs
Linux/gcc fails later at the step when the program gets linked
I do recognize that the difference can be in the fact that I am using clang on macOS and gcc on Linux but this does not explain the issue to me.
I would like to know:
Why does this difference exist?
Can I get the Linux behavior on macOS by tweaking the compiler/linker flags?
The example has been published to Github: Linking a dynamic library that links in symbols from a static library (macOS vs Linux).
These are the key files from my example on Github:
CMakeLists.txt
cmake_minimum_required(VERSION 3.14)
project(untitled1 C)
add_compile_options("-fPIC")
set(CMAKE_C_STANDARD 99)
add_library(static_lib static_lib.c)
add_library(dynamic_lib SHARED dynamic_lib.c)
# THE ISSUE IS HERE:
# This target_link_libraries is intentionally commented out.
# on macOS the build process fails when linking dynamic_lib
# on Linux the build process fails when linking the
# 'untitled1' program.
#target_link_libraries(dynamic_lib static_lib)
add_executable(untitled1 main.c)
target_link_libraries(untitled1 dynamic_lib)
dynamic_lib.c
#include "dynamic_lib.h"
#include "static_lib.h"
void dynamic_lib_func() {
static_lib_func();
}
static_lib.c
#include "static_lib.h"
void static_lib_func() {}
macOS output
[ 25%] Building C object CMakeFiles/dynamic_lib.dir/dynamic_lib.c.o
/Library/Developer/CommandLineTools/usr/bin/cc -Ddynamic_lib_EXPORTS -g -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk -fPIC -fPIC -std=gnu99 -o CMakeFiles/dynamic_lib.dir/dynamic_lib.c.o -c /Users/stanislaw/workspace/code/Examples/untitled1/dynamic_lib.c
[ 50%] Linking C shared library libdynamic_lib.dylib
/Applications/CLion.app/Contents/bin/cmake/mac/bin/cmake -E cmake_link_script CMakeFiles/dynamic_lib.dir/link.txt --verbose=1
/Library/Developer/CommandLineTools/usr/bin/cc -g -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk -dynamiclib -Wl,-headerpad_max_install_names -o libdynamic_lib.dylib -install_name #rpath/libdynamic_lib.dylib CMakeFiles/dynamic_lib.dir/dynamic_lib.c.o
Undefined symbols for architecture x86_64:
"_static_lib_func", referenced from:
_dynamic_lib_func in dynamic_lib.c.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Linux output
[ 16%] Linking C shared library libdynamic_lib.so
[ 33%] Built target dynamic_lib
Scanning dependencies of target untitled1
[ 50%] Linking C executable untitled1
libdynamic_lib.so: undefined reference to `static_lib_func'
collect2: error: ld returned 1 exit status

I have managed to find a solution to the related issue that I also encountered while porting the same project from Linux to macOS: How to share a global variable between a main process and a dynamic library via a static library (macOS)?.
It turns out that this kind of "forward declaration" of the library symbols is possible on macOS:
Adding the -undefined dynamic_lookup flag makes macOS to pass the original error.
Adding this to my example's CMakeLists.txt file solves the issue:
target_link_options(dynamic_lib PRIVATE -undefined dynamic_lookup)

Link static library directly into an executable using ld.gold

I have an libfoo.a which contains _start and all required symbols for an executable. ld.bfd -o foo libfoo.a works smoothly in my case. However, ld.gold -o foo libfoo.a fails silently generating an executable with no symbols from libfoo.a. Creating an empty a.o and link it with ld.gold -o foo a.o libfoo.a works.
I was wondering is there any way to directly link a static library into an executable using ld.gold without creating a redundant empty object files?

You can specify the entry symbol explicitly with the -e _start option, and the linker will use that to decide that it needs to load the object that defines it.
Unfortunately, gold will not use the implicit start symbol to load an object from the archive library.

Creating a dylib which gets linked at runtime

I am trying to create a dynamic library which is meant to be linked and loaded into a host environment at runtime (e.g. similar to how class loading works in Java). As such, I want the dynamic library to be left with a few "dangling" references, which I expect it to pick up from its host environment when it is loaded into that environment.
My problem is that I cannot figure out how to create the dynamic library without explicitly linking it to existing symbols. I am hoping to produce a dynamic library that does not depend on a specific host executable (or host library), rather one that is able to be loaded (e.g. by dlopen) in any host as long as the host makes a couple symbols available for use.
Right now, any linking command I've tried results in a complaint of missing symbols. I'd like it to allow symbols to be missing (ideally, just particularly specified symbols).
For example, here's a transcript with the error on OS X:
$ cat frotz.c
void blort(void);
void run(void) {
blort();
}
$ cc -c -o frotz.o frotz.c
$ cc -dynamiclib -o libfrotz.dylib frotz.o
Undefined symbols for architecture x86_64:
"_blort", referenced from:
_run in frotz.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
If I do the same thing using a GNU toolchain (on Linux), it helpfully tells me:
$ gcc -shared -o libfrotz.so frotz.o
/usr/bin/ld: frotz.o: relocation R_X86_64_PC32 against undefined symbol `blort'
can not be used when making a shared object; recompile with -fPIC
and indeed, adding -fPIC to the C compile command seems to fix the problem in that environment. However, it doesn't seem to have any effect in OS X.
All the other dynamic-linking questions I could find on SO seem to be about the more usual arrangement of libraries, where a library is being built to be linked into an executable before that executable runs, rather than the other way around. The closest related question I found was this:
Can an executable be linked to a dynamic library after its built?
which unfortunately has very little info, none of it relevant to the question I'm asking here.
UPDATE: I distilled the info from the answer along with everything else I'd figured
out, and put together this example:
https://github.com/danfuzz/dl-example

As far as my knowledge goes, you want to use weak linkage:
// mark function as weakly-linked
extern void foo() __attribute__((weak));
// inform the linker about that too
clang -dynamiclib -o bar.dylib bar.o -flat_namespace -undefined dynamic_lookup
If a weak function can be resolved at runtime, it will then be resolved. If it can't, it will be NULL, instead of generating a runtime (or, obviously, link-time) error.

Using LD_PRELOAD to overload call to a C function of a shared library

I'm following this answer to override a call to a C function of a C library.
I think I did everything correctly, but it doesn't work:
I want to override the "DibOpen"-function. This is my code of the library which I pass to LD_PRELOAD environment-variable when running my application:
DIBSTATUS DibOpen(void **ctx, enum Board b)
{
printf("look at me, I wrapped\n");
static DIBSTATUS (*func)(void **, enum Board) = NULL;
if(!func)
func = dlsym(RTLD_NEXT, "DibOpen");
printf("Overridden!\n");
return func(pContextAddr, BoardType, BoardHdl);
}
The output of nm lib.so | grep DibOpen shows
000000000001d711 T DibOpen
When I run my program like this
LD_PRELOAD=libPreload.so ./program
I link my program with -ldl but ldd program does not show a link to libdl.so
It stops with
symbol lookup error: libPreload.so: undefined symbol: dlsym
. What can I do to debug this further? Where is my mistake?

When you create a shared library (whether or not it will be used in LD_PRELOAD), you need to name all of the libraries that it needs to resolve its dependencies. (Under some circumstances, a dlopened shared object can rely on the executable to provide symbols for it, but it is best practice not to rely on this.) In this case, you need to link libPreload.so against libdl. In Makefile-ese:
libPreload.so: x.o y.o z.o
$(CC) -shared -Wl,-z,defs -Wl,--as-needed -o $# $^ -ldl
The option -Wl,-z,defs tells the linker that it should issue an error if a shared library has unresolved undefined symbols, so future problems of this type will be caught earlier. The option -Wl,--as-needed tells the linker not to record a dependency on libraries that don't actually satisfy any undefined symbols. Both of these should be on by default, but for historical reasons, they aren't.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight