Static library loaded twice

I have a shared object A.so which statically links libssl.a, and another shared object B.so which also statically links libssl.a.
Both A.so and B.so have symbols from libssl.a in GLOBAL scope. I checked this with readelf -s A.so.
I have an executable a.out which loads A.so and B.so. When a.out terminates I get a
double free error in one of the symbols from libssl.a in A.so.
Even though libssl.a is statically linked into each shared object, since the symbols are exposed
globally, is it possible that the same symbol is shared instead of each library picking its local copy?
What is the workaround for this? How do I make the symbols local here?
Please help

This is indeed expected. One instance of libssl.a interposes (likely a subset of) the other, and the results are not pretty. You can use a version script (--version-script to ld, with -Wl, for cc) to control what is exported from A.so and B.so. If something is not exported, it cannot be interposed either.
Alternatively, you could compile libssl.a with visibility flags like -fvisibility=hidden. These flags only affect the dynamic linker and not static linking. You likely needed to compile it yourself anyway because shipped .a files tend to contain position-dependent code, meant for linking into executables. Only some platforms such as 32-bit x86 let you get away with linking such code into shared objects and only at the cost of text relocations.
Using dlopen with RTLD_LOCAL, as suggested in a comment, should also work, but it seems hackish to use dlopen for this purpose.
Another option is to use the same shared libssl.so in both libraries.
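As a minimal sketch of the visibility approach, assuming GCC or Clang on an ELF platform: the file below marks one function as exported and one as hidden. The names mylib_checksum, mix_byte, MYLIB_PUBLIC and MYLIB_HIDDEN are hypothetical and only stand in for the libssl symbols discussed above.

```c
/* mylib.c: hypothetical library source. Build, for example, with
 *   gcc -fPIC -shared -o libmylib.so mylib.c
 * All names here are illustrative and not taken from libssl. */
#include <stddef.h>

#if defined(__GNUC__)
#  define MYLIB_PUBLIC __attribute__((visibility("default")))
#  define MYLIB_HIDDEN __attribute__((visibility("hidden")))
#else
#  define MYLIB_PUBLIC
#  define MYLIB_HIDDEN
#endif

/* Hidden: callable from other files of this library, but not placed in
 * the dynamic symbol table, so another library cannot interpose it. */
MYLIB_HIDDEN unsigned mix_byte(unsigned acc, unsigned char b)
{
    return acc * 31u + b;
}

/* Public: the only symbol this file contributes to the exported API. */
MYLIB_PUBLIC unsigned mylib_checksum(const unsigned char *data, size_t len)
{
    unsigned acc = 0;
    for (size_t i = 0; i < len; i++)
        acc = mix_byte(acc, data[i]);
    return acc;
}
```

After building the shared object, readelf --dyn-syms libmylib.so should list mylib_checksum but not mix_byte. Compiling with -fvisibility=hidden gives the same effect for every symbol that is not explicitly marked as default.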

Related

On linking of shared libraries, are they really final, and if so, why?

I am trying to understand more about linking and shared library.
Ultimately, I wonder if it's possible to add a method to a shared library. For instance, suppose one has a source file a.c, and a library lib.so (without the source file). Let's furthermore assume, for simplicity, that a.c declares a single method, whose name is not present in lib.so. I thought it might be possible, at link time, to link a.o to lib.so while instructing the linker to create newLib.so, forcing it to export all methods/variables in lib.so so that newLib.so is now basically lib.so with the added method from a.o.
More generally, if one has some source file depending on a shared library, can one create a single output file (library or executable) that is not dependent on the shared library anymore? (That is, all the relevant methods/variables from the library would have been exported/linked/inlined into the new executable, hence making the dependency void.) If that's not possible, what is technically preventing it?
A somewhat similar question has been asked here: Merge multiple .so shared libraries.
One of the replies includes the following text: "If you have access to either source or object files for both libraries, it is straightforward to compile/link a combined SO from them." without explaining the technical details. Was it a mistake or does it hold? If so, how does one do it?

Once you have a shared library libfoo.so the only ways you can use it
in the linkage of anything else are:-
Link a program that dynamically depends on it, e.g.
$ gcc -o prog bar.o ... -lfoo
Or, link another shared library that dynamically depends on it, e.g.
$ gcc -shared -o libbar.so bar.o ... -lfoo
In either case the product of the linkage, prog or libbar.so,
acquires a dynamic dependency on libfoo.so. This means that prog|libbar.so
has information inscribed in it by the linker that instructs the
OS loader, at runtime, to find libfoo.so, load it into the
address space of the current process and bind the program's references to libfoo's exported symbols to
the addresses of their definitions.
So libfoo.so must continue to exist as well as prog|libbar.so.
It is not possible to link libfoo.so with prog|libbar.so in
such a way that libfoo.so is physically merged into prog|libbar.so
and is no longer a runtime dependency.
It doesn't matter whether or not you have the source code of the
other linkage input files - bar.o ... - that depend on libfoo.so. The
only kind of linkage you can do with a shared library is dynamic linkage.
This is in complete contrast with the linkage of a static library.
You wonder about the statement in this answer where it says:
If you have access to either source or object files for both libraries, it is straightforward to compile/link a combined SO from them.
The author is just observing that if I have source files
foo_a.c foo_b.c... bar_a.c bar_b.c
which I compile to the corresponding object files:
foo_a.o foo_b.o... bar_a.o bar_b.o...
or if I simply have those object files, then as well as - or instead of - linking them into two shared libraries:
$ gcc -shared -o libfoo.so foo_a.o foo_b.o...
$ gcc -shared -o libbar.so bar_a.o bar_b.o...
I could link them into one:
$ gcc -shared -o libfoobar.so foo_a.o foo_b.o... bar_a.o bar_b.o...
which would have no dependency on libfoo.so or libbar.so even if they exist.
And although that could be straightforward it could also be false. If there is
any symbol name that is globally defined in any of foo_a.o foo_b.o... and
also globally defined in any of bar_a.o bar_b.o... then it will not matter
to the linkage of either libfoo.so or libbar.so (and it need not be dynamically
exported by either of them). But the linkage of libfoobar.so will fail for
multiple definition of name.
If we build a shared library libbar.so that depends on libfoo.so and has
itself been linked with libfoo.so:
$ gcc -shared -o libbar.so bar.o ... -lfoo
and we then want to link a program with libbar.so, we can do that in such a way
that we don't need to mention its dependency libfoo.so:
$ gcc -o prog main.o ... -lbar -Wl,-rpath=<path/to/libfoo.so>
See this answer to follow that up. But
this doesn't change the fact that libbar.so has a runtime dependency on libfoo.so.
If that's not possible, what is technically preventing it?
What technically prevents linking a shared library with some program
or shared library targ in a way that physically merges it into targ is that a
shared library (like a program) is not the sort of thing that a linker knows
how to physically merge into its output file.
Input files that the linker can physically merge into targ need to
have structural properties that guide the linker in doing that merging. That is the structure of object files.
They consist of named input sections of object code or data that are tagged with various attributes.
Roughly speaking, the linker cuts up the object files into their sections and distributes them into
output sections of the output file according to their attributes, and makes
binary modifications to the merged result to resolve static symbol references
or enable the OS loader to resolve dynamic ones at runtime.
This is not a reversible process. The linker can't consume a program or
shared library and reconstruct the object files from which it was made to
merge them again into something else.
But that's really beside the point. When input files are physically
merged into targ, that is called static linkage.
When input files are just externally referenced in targ to
make the OS loader map them into a process it has launched for targ,
that is called dynamic linkage. Technical development has given us
a file-format solution to each of these needs: object files for static linkage, shared libraries
for dynamic linkage. Neither can be used for the purpose of the other.

Why doesn't ld search rpaths from a DSO itself at link time

I have libA.so, libB.so, and an executable 'foo'. 'foo' needs libB.so which itself needs libA.so. During linking foo explicitly links with libB because it directly uses symbols from it. 'foo' does not directly use symbols from libA. When linking 'foo', ld wants to check it can resolve symbol references from libB in libA but it can't find libA. I can make it find libA by using -Wl,-rpath-link=, or I can have the linker ignore libA using -Wl,--allow-shlib-undefined.
The problem is I shouldn't have to set either of these options because libB.so contains an rpath that tells the linker where to find libA.so and the linker uses this rpath at runtime to successfully find libA. So why doesn't it use it at link time? Forcing foo's build configuration to know where libA is seems completely unnecessary in this case?

I shouldn't have to set either of these options because libB.so contains an rpath that tells the linker where to find libA.so and the linker uses this rpath at runtime
You are mixing up the static linker ld and the runtime linker (aka loader) ld.so.
On Linux, these come from binutils and GLIBC respectively. They are completely different programs, maintained by different sets of people.
It would be possible for ld to implement the search path that the current version of ld.so uses, but this is
a nontrivial amount of code that would need to be written from scratch, and
it would break as soon as the ld.so search mechanism is changed.
Update:
isn't the search of the rpath executed by the dynamic linker 'the standard'
There is no standard that defines this (that I know of).
In addition, on Linux and Solaris the search path that ld.so uses could contain dynamic tokens like $ORIGIN and $PLATFORM, which are unknown at (static) link time.
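To illustrate why such tokens cannot be resolved once and for all at link time, here is a simplified sketch of $ORIGIN expansion. The function expand_origin is hypothetical and is not how ld.so is implemented; it only shows that the result depends on where the object ends up being installed.

```c
#include <stddef.h>
#include <string.h>

/* Expand every "$ORIGIN" in an rpath entry to origin_dir, the directory
 * that holds the object carrying the rpath. Returns 0 on success and -1
 * if the result does not fit in out. Simplified on purpose: it ignores
 * "${ORIGIN}", $LIB and $PLATFORM. */
int expand_origin(const char *entry, const char *origin_dir,
                  char *out, size_t out_size)
{
    static const char token[] = "$ORIGIN";
    const size_t token_len = sizeof token - 1;
    const size_t origin_len = strlen(origin_dir);
    size_t used = 0;

    if (out_size == 0)
        return -1;

    while (*entry != '\0') {
        if (strncmp(entry, token, token_len) == 0) {
            /* Keep one byte free for the terminating NUL. */
            if (used + origin_len >= out_size)
                return -1;
            memcpy(out + used, origin_dir, origin_len);
            used += origin_len;
            entry += token_len;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *entry++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

The same entry "$ORIGIN/../lib" yields a different directory for every location libB.so is copied to, so a path computed when 'foo' was linked would not necessarily match what the loader computes later.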

Static vs Dynamic Linking

I'm trying to understand what the ELF looks like for a statically vs. a dynamically linked program.
I understand that this is how static linking works:
In my case, I have two files, foo.c and bar.c.
I also have their object files; foo.o and bar.o.
With the objdump command, I can see the relocations in each file.
How do I statically link the foo.o and bar.o?
How do I dynamically link the foo.o and bar.o?
How can I see the difference in the output files?

Linking dynamically is the default mode of most linkers these days. If you want to link statically you have to use the -static flag when linking. To clarify, when I say "linking dynamically" versus "linking statically" I mean the linking with external libraries, and not generating a library that in turn can be linked (dynamically or statically).
The difference can't be seen in the object files you pass to the linker, as it has nothing to do with the compiler and object-file generation. The result can only be seen in the executable program after linking, and the biggest difference is that the executable will most likely be larger.
The fully linked executable will be larger because all the libraries (for which there are static libraries) will be linked into the executable program quite literally. It's basically including the libraries' object files together with your own object files. In fact, on POSIX platforms static libraries are just archives of object files.
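Besides size, one structural difference that can be inspected is whether the executable asks for a dynamic loader through a PT_INTERP program header. The following is a rough sketch, assuming Linux with <elf.h>, a 64-bit ELF file, and a file byte order that matches the host; the function name elf64_requests_loader is made up for this example.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the 64-bit ELF file at path has a PT_INTERP program
 * header (it asks for a dynamic loader such as ld.so), 0 if it has
 * none, and -1 if the file cannot be read or is not 64-bit ELF. */
int elf64_requests_loader(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr eh;
    int result = -1;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64) {
        result = 0;
        /* Walk the program header table looking for PT_INTERP. */
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
            if (fseek(f, off, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;
                break;
            }
            if (ph.p_type == PT_INTERP) {
                result = 1;
                break;
            }
        }
    }
    fclose(f);
    return result;
}
```

An executable linked from foo.o and bar.o in the default mode would typically make this return 1, while one linked with -static would typically return 0. The same information is visible in the program header listing printed by readelf -l.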

How do I combine a shared object (.so) and a static library (.a) into a new shared object?

I have a library; call it libdog.so.
I do not have the source to libdog.so.
I do not have the .o files which went into libdog.so.
ldd libdog.so
libdogfood.so.1 => not found
libdog depends on libdogfood.
I have a static dogfood library, libdogfood.a and libdogfood.la.
I want to create a new library, libcompletedog.so, which has no
dependency on libdogfood.
I want libcompletedog to include all symbols from libdogfood.

Most UNIX systems (AIX is the exception) consider .so libraries a "final" product of the link, that can not be relinked into something else.
If your libdogfood.a is a 32-bit library, you might be able to link it into libdogfood.so.1, and thus satisfy the missing dependency:
gcc -shared -o libdogfood.so.1 \
-Wl,--whole-archive libdogfood.a -Wl,--no-whole-archive
If libdogfood.a contains 64-bit objects, above may still work (if the objects were compiled with -fPIC), but that's somewhat unlikely.

Basically you cannot do that, because libdog.so was compiled with -fPIC while libdogfood.a wasn't. Shared libraries need (in practice) to contain only position independent code
(otherwise there is too much relocation information inside them)

Restricting symbols in a Linux static library

I'm looking for ways to restrict the number of C symbols exported to a Linux static library (archive). I'd like to limit these to only those symbols that are part of the official API for the library. I already use 'static' to declare most functions as static, but this restricts them to file scope. I'm looking for a way to restrict the scope to the library.
I can do this for shared libraries using the techniques in Ulrich Drepper's How to Write Shared Libraries, but I can't apply these techniques to static archives. In his earlier Good Practices in Library Design paper, he writes:
The only possibility is to combine all object files which need
certain internal resources into one using 'ld -r' and then restrict the symbols
which are exported by this combined object file. The GNU linker has options to
do just this.
Could anyone help me discover what these options might be? I've had some success with 'strip -w -K prefix_*', but this feels brutish. Ideally, I'd like a solution that will work with both GCC 3 and 4.
Thanks!

I don't believe GNU ld has any such options; Ulrich must have meant objcopy, which has many such options: --localize-hidden, --localize-symbol=symbolname, --localize-symbols=filename.
The --localize-hidden in particular allows one to have a very fine control over which symbols are exposed. Consider:
int foo() { return 42; }
int __attribute__((visibility("hidden"))) bar() { return 24; }
gcc -c foo.c
nm foo.o
000000000000000b T bar
0000000000000000 T foo
objcopy --localize-hidden foo.o bar.o
nm bar.o
000000000000000b t bar
0000000000000000 T foo
So bar() is no longer exported from the object (even though it is still present and usable for debugging). You could also remove the bar symbol altogether with objcopy --strip-unneeded.

Static libraries can not do what you want for code compiled with either GCC 3.x or 4.x.
If you can use shared objects (libraries), the GNU linker does what you need with a feature called a version script. This is usually used to provide version-specific entry points, but the degenerate case just distinguishes between public and private symbols without any versioning. A version script is specified with the --version-script= command line option to ld.
The contents of a version script that makes the entry points foo and bar public and hides all other interfaces:
{ global: foo; bar; local: *; };
See the ld doc at: http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION
I'm a big advocate of shared libraries, and this ability to limit the visibility of globals is one of their great virtues.
A document that provides more of the advantages of shared objects, but written for Solaris (by Greg Nakhimovsky of happy memory), is at http://developers.sun.com/solaris/articles/linker_mapfiles.html
I hope this helps.

The merits of this answer will depend on why you're using static libraries. If it's to allow the linker to drop unused objects later then I have little to add. If it's for the purpose of organisation - minimising the number of objects that have to be passed around to link applications - this extension of Employed Russian's answer may be of use.
At compile time, the visibility of all symbols within a compilation unit can be set using:
-fvisibility=hidden
-fvisibility=default
This implies one can compile a single file "interface.c" with default visibility and a larger number of implementation files with hidden visibility, without annotating the source. A relocatable link will then produce a single object file where the non-api functions are "hidden":
ld -r interface.o implementation0.o implementation1.o -o relocatable.o
The combined object file can now be subjected to objcopy:
objcopy --localize-hidden relocatable.o mylibrary.o
Thus we have a single object file "library" or "module" which exposes only the intended API.
The above strategy interacts moderately well with link time optimisation. Compile with -flto and perform the relocatable link by passing -r to the linker via the compiler:
gcc -fuse-linker-plugin -flto -nostdlib -Wl,-r {objects} -o relocatable.o
Use objcopy to localise the hidden symbols as before, then call the linker a final time to strip the local symbols and whatever other dead code it can find in the post-lto object. Sadly, relocatable.o is unlikely to have retained any lto related information:
gcc -nostdlib -Wl,-r,--discard-all relocatable.o mylibrary.o
Current implementations of lto appear to be active during the relocatable link stage. With lto on, the hidden=>local symbols were stripped by the final relocatable link. Without lto, the hidden=>local symbols survived the final relocatable link.
Future implementations of lto seem likely to preserve the required metadata through the relocatable link stage, but at present the outcome of the relocatable link appears to be a plain old object file.

This is a refinement of the answers from EmployedRussian and JonChesterfield, which may be helpful if you're generating both dynamic and static libraries.
Start with the standard mechanism for hiding symbols in DSOs (the dynamic version of your lib). Compile all files with -fvisibility=hidden. In the header file which defines your API, change the declarations of the classes and functions you want to make public:
#define DLL_PUBLIC __attribute__ ((visibility ("default")))
extern DLL_PUBLIC int my_api_func(int);
See here for details. This works for both C and C++. This is sufficient for DSOs, but you'll need to add these build steps for static libraries:
ld -r obj1.o obj2.o ... objn.o -o static1.o
objcopy --localize-hidden static1.o static2.o
ar -rcs mylib.a static2.o
The ar step is optional - you can just link against static2.o.

My way of doing it is to mark everything that is not to be exported with INTERNAL,
include guard all .h files, compile dev builds with -DINTERNAL= and compile release builds with a single .c file that includes all other library .c files with -DINTERNAL=static.
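A small sketch of how this INTERNAL scheme might look in the single-file release build. The file name mylib_all.c and the functions clamp_to_byte and mylib_scale are hypothetical, and the fallback #define is an addition for this example so that the file compiles without any -D flag.

```c
/* mylib_all.c: hypothetical single-file release build.
 * Dev build:     gcc -c -DINTERNAL= each_file.c    (helpers stay global)
 * Release build: gcc -c -DINTERNAL=static mylib_all.c */
#ifndef INTERNAL
#define INTERNAL static   /* assumed default: the release behaviour */
#endif

/* In a real project these definitions would arrive through lines such as
 *   #include "clamp.c"
 *   #include "api.c"
 * They are written out here so the sketch compiles on its own. */

/* Not part of the API: file-local once INTERNAL expands to static. */
INTERNAL int clamp_to_byte(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Part of the API: never marked INTERNAL, so it keeps external linkage. */
int mylib_scale(int v, int factor)
{
    return clamp_to_byte(v * factor);
}
```

With INTERNAL expanding to static, running nm on the resulting object should show clamp_to_byte as a local symbol (lowercase t), if the compiler has not inlined it away, and mylib_scale as a global one.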
