In C on Linux:
Do I need static libraries to link statically, or do the shared ones I have suffice?
If not, why not? (Don't they contain the same data?)
Yes, you need static libraries to build a statically linked executable.
Static libraries are bundles of compiled objects. When you statically link against a library, it is effectively the same as taking the compilation results of that library, unpacking them in your current project, and using them as if they were your own objects.
Dynamic libraries are already linked. This means that some information, such as relocations, has already been fixed up and thrown out.
Additionally, dynamic libraries must be compiled as position-independent code. This is not a restriction on static libraries, and results in a significant difference in performance on some common platforms (like x86).
There exist tools like ELF Statifier which attempt to bundle a dynamically linked executable together with its shared libraries into one self-contained file, but it is very difficult to generate a correctly working result in all circumstances.
There is no such thing as static compilation, only static linking. And for that, you need static libraries. The difference between static and dynamic linking is that with the former, names are resolved at link time (just after compile time), whereas with the latter, they are resolved just as the program starts running.
Static and dynamic libraries may or may not contain the same information, depending on lots of factors. The decision on whether to statically or dynamically link your code is an important one, and will often influence application architecture.
All libraries you link into a statically linked program must be the static variant. While the dynamic (libfoo.so) and static (libfoo.a) libraries have the same functions in them, they are different format files and so you need the matching type for your program.
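To make the "different format files" point concrete, here is a minimal sketch that tells the two apart by their leading bytes. It relies on two facts: an ar archive (libfoo.a) starts with the 8-byte string "!<arch>\n", and an ELF file (libfoo.so) starts with 0x7f 'E' 'L' 'F'. The names lib_kind, classify_magic and classify_library are made up for this illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Kinds of library file this sketch can recognise. */
enum lib_kind { LIB_UNKNOWN, LIB_AR_ARCHIVE, LIB_ELF };

/* Classify a file from its leading bytes.
 * "!<arch>\n" opens an ar archive; "\177ELF" opens an ELF file. */
enum lib_kind classify_magic(const unsigned char *buf, size_t len)
{
    if (len >= 8 && memcmp(buf, "!<arch>\n", 8) == 0)
        return LIB_AR_ARCHIVE;
    if (len >= 4 && memcmp(buf, "\177ELF", 4) == 0)
        return LIB_ELF;
    return LIB_UNKNOWN;
}

/* Read the first bytes of `path` and classify them.
 * Returns LIB_UNKNOWN if the file cannot be opened. */
enum lib_kind classify_library(const char *path)
{
    unsigned char buf[8];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return LIB_UNKNOWN;
    size_t n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return classify_magic(buf, n);
}
```

Calling classify_library on a libfoo.a should give LIB_AR_ARCHIVE and on a libfoo.so LIB_ELF. Note that the individual .o members inside the archive are ELF files too, but relocatable ones rather than shared objects, so this check only distinguishes the containers.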
Another option is Ermine (http://magicErmine.com)
It's like statifier, but able to deal with memory randomization.
Related
I have two dynamically loadable libraries, lib_smtp.so and libpop.so, among others. Both have a global variable named protocol, which is initialized to "SMTP" and "POP" respectively. I have another, static library, libhttp.a, where protocol is initialized to "HTTP".
Now, for some reason, I need to build all the dynamically linkable and loadable libraries as static libraries and include them in the executable. When I do so, I get a "multiple definition of symbol" error while linking the static libraries.
I am curious to know how the linker resolves duplicate symbols during dynamic linking, where all three of the libraries mentioned are linked in.
Is there some way to do the same thing statically, i.e. to add all the static libraries that share symbol names to the executable without any conflict? If not, why is the process different for statically linked libraries?
Dynamic linking in modern Linux and several other operating systems is based on the ELF binary format. The (ELF) dynamic libraries on which an executable or other shared library relies are prioritized. To resolve a given symbol, the dynamic linker checks each library in priority order until it finds one that defines the symbol.
That can be dicey when multiple dynamic objects define the same symbol and also multiple dynamic objects use that symbol. It can then be the case that the symbol is resolved differently in different dynamic objects.
Full details are out of scope for SO, but I don't know a better technical explanation than the one in Ulrich Drepper's paper "How to Write Shared Libraries".
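For the static case, one way to get a similar "one definition wins" behaviour is a weak definition. This is only a sketch and assumes GCC or Clang, since __attribute__((weak)) is a compiler extension; current_protocol is a name invented for the example.

```c
/* Weak definition: if another object file in the same link provides a
 * strong (ordinary) definition of `protocol`, the static linker uses that
 * one instead of reporting "multiple definition". With no strong
 * definition present, this value is used. */
__attribute__((weak)) const char *protocol = "HTTP";

/* Accessor so callers do not care which definition won. */
const char *current_protocol(void)
{
    return protocol;
}
```

Be aware that after such a link every library sees the same single protocol variable, which is also what happens when the dynamic linker resolves all references to the first definition it finds. If each library is supposed to keep its own value, declaring the variable static (internal linkage) in each library is the more likely fix.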
In dynamic linking, a facility called "symbol visibility" kicks in. Essentially, this lets you expose only certain symbols across the object's boundaries (object in the sense of shared object). It is good style to compile and link shared objects with symbols hidden by default and to expose explicitly only those that callers require.
Symbol visibility is applied during linking and is so far only implemented in dynamic linkers. It's certainly possible to have it in static linkage as well: Apple's GCC variant implements so-called Mach-O relocatable object files, which can be statically linked with visibility applied. But I don't know whether vanilla GCC, binutils ld or the gold linker can do this for plain old ELF.
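As a rough illustration of hiding by default and exporting explicitly, here is how one source file of a shared library might be annotated with the GCC/Clang visibility attribute. The macro names SMTP_EXPORT and SMTP_LOCAL and the function smtp_protocol_name are hypothetical.

```c
/* Hypothetical export macros; the names are illustrative only. */
#define SMTP_EXPORT __attribute__((visibility("default")))
#define SMTP_LOCAL  __attribute__((visibility("hidden")))

/* Hidden: usable from every object file inside the shared library, but
 * left out of its dynamic symbol table, so the dynamic linker never
 * matches it against another library's `protocol`. */
SMTP_LOCAL const char *protocol = "SMTP";

/* Exported: the only symbol from this file other modules can bind to. */
SMTP_EXPORT const char *smtp_protocol_name(void)
{
    return protocol;
}
```

Compiling with -fvisibility=hidden makes hidden the default, so only the explicitly exported symbols need an annotation. As far as I can tell the attribute by itself does not prevent a "multiple definition" error when the same object files are put into one static link, because the hidden symbols still have global binding inside the object files.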
There are related posts here and here.
According to my understanding, static linking directly inserts code (what code? machine code?) from the library into the executable, whereas dynamic linking only inserts a reference (a pointer?) to somewhere in the library.
So I am wondering why we need two separate versions of a library with the same functionality. For example, Intel MKL ships both libmkl_sequential.a and libmkl_sequential.so. Static linking must use the static library and dynamic linking must use the dynamic library. Why can't dynamic linking simply point to the static library?
What is the real difference between the contents of the .so and the .a of the same functionality?
Code which you want to execute needs to be loaded in memory. A function linked statically becomes a part of your program and so they are both loaded together when the program starts.
Why can't dynamic linking simply point to the static library? A static library is a file on disk; how would you point inside it? There must be a mechanism (a loader and binder) which inspects the executable as it starts, works out which functions it wants to use, and loads the corresponding libraries into memory.
Yes, the actual code (the instructions) in "libmkl_sequential.a" and "libmkl_sequential.so" may be identical, but static and dynamic libraries require different auxiliary metadata, dictated by the respective library format.
According to this expert,
Dynamic loading refers to mapping (or less often copying) an executable or library into a process's memory after it has started. Dynamic linking refers to resolving symbols - associating their names with addresses or offsets - after compile time.
Hence, correspondingly: static loading refers to mapping an executable or library into memory before it has started, and static linking refers to resolving symbols at compile time.
Now, when you do static loading and static linking of a library, the library's binary code is appended to your binary code, and the (function and variable) references your binary code makes to the library are patched (not sure if that's the correct term) so that they point to the correct positions.
This means that before static linking a call to a function
foo()
would give you (in x86 ASM), among others, an instruction like:
call 0x00000000
and after static linking you have something like:
call 0x00001043
where 0x00001043 is the entry point of the function foo in the binary code that is output by the linker.
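To show what that patching step might look like, here is a small sketch for a 32-bit x86 near call (opcode 0xE8). One assumption worth flagging: the instruction does not store the absolute address the disassembler prints; it stores a 32-bit displacement relative to the next instruction. The function name patch_call_rel32 and the addresses in the usage note are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Patch the 32-bit displacement of an x86 `call rel32` instruction.
 *   code        - buffer holding the machine code
 *   code_len    - size of that buffer in bytes
 *   call_offset - offset of the 0xE8 opcode inside `code`
 *   load_addr   - address at which code[0] will live (hypothetical)
 *   target_addr - final address of the callee, e.g. foo
 * Returns 0 on success, -1 if there is no call opcode at call_offset. */
int patch_call_rel32(uint8_t *code, size_t code_len, size_t call_offset,
                     uint32_t load_addr, uint32_t target_addr)
{
    if (call_offset + 5 > code_len || code[call_offset] != 0xE8)
        return -1;

    /* The displacement is relative to the instruction after the call. */
    uint32_t next_insn = load_addr + (uint32_t)call_offset + 5;
    uint32_t disp = target_addr - next_insn;

    /* x86 stores the displacement in little-endian byte order. */
    for (int i = 0; i < 4; i++)
        code[call_offset + 1 + i] = (uint8_t)(disp >> (8 * i));
    return 0;
}
```

With the call placed at address 0x1000 and foo at 0x00001043, the placeholder bytes E8 00 00 00 00 become E8 3E 00 00 00, because 0x1043 - 0x1005 = 0x3E. A disassembler would still display this as a call to 0x1043.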
Now, when you do dynamic loading and dynamic linking, you will call a library function by way of a function pointer:
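#include <dlfcn.h>
typedef int (*fun_ptr)(void);
void *library = dlopen("mylib.so", RTLD_LAZY);  /* the flags argument is mandatory */
fun_ptr foo = (fun_ptr)dlsym(library, "foo");   /* look up foo by name at run time */
foo();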
This mechanism is also how C++ virtual methods work. The address of the method to be called is resolved at runtime by making a function pointer to the method part of the instance (stored in the so-called vtable).
My question is this:
When you do static loading and dynamic linking of a shared library (for context, let's say a .so in linux), does this linking patch my binary's references like in the static loading & linking scenario, or does it work by way of function pointers like in the case of dynamic loading & linking and C++ virtual methods?
When you do static loading and dynamic linking of a shared library
You don't do 'static loading' of a shared library.
Even though it looks to you as an end-user that e.g. libc.so.6 is 'static loaded' at process startup, it is not in fact the case. Rather, the kernel 'static loads' the main binary and ld-linux.so, and then ld-linux dynamic loads all the other shared libraries.
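A process can check for itself whether the kernel mapped a dynamic loader alongside the main binary. The sketch below assumes Linux with glibc 2.16 or later, where getauxval is available; AT_BASE is the auxiliary vector entry holding the interpreter's base address. The function name has_dynamic_loader is invented for the example.

```c
#include <elf.h>        /* AT_BASE */
#include <sys/auxv.h>   /* getauxval (Linux, glibc 2.16+) */

/* Returns 1 if the kernel mapped a program interpreter (the dynamic
 * loader, e.g. ld-linux.so) for this process, 0 if the auxiliary
 * vector reports none. */
int has_dynamic_loader(void)
{
    /* AT_BASE is the interpreter's base address; 0 means no interpreter. */
    return getauxval(AT_BASE) != 0;
}
```

I would expect this to return 1 in an ordinary dynamically linked program and 0 in one built with -static, since a fully static executable requests no interpreter.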
does this linking patch my binary's references like in the static loading & linking scenario, or does it work by way of function pointers like in the case of dynamic loading
It depends.
Usually shared libraries are linked from position-independent code (PIC), and work the way of function pointers (the pointers are stored in the GOT -- global offset table).
But sometimes shared libraries are linked from non-PIC code and require "text relocations", which work similarly to static linking: the loader patches the code itself.
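The function-pointer route can be imitated in plain C. This is a toy model only, not how ld-linux.so is actually implemented: got_foo stands in for a GOT slot, foo_resolver for the lazy-binding stub, and symtab for the library's dynamic symbol table. All names are made up.

```c
#include <stddef.h>
#include <string.h>

typedef int (*fun_ptr)(void);

/* Stand-in for a function that lives in a shared library. */
static int foo_impl(void) { return 42; }

/* Toy "dynamic symbol table" mapping names to addresses. */
static const struct { const char *name; fun_ptr addr; } symtab[] = {
    { "foo", foo_impl },
};

static int foo_resolver(void);

/* Toy GOT slot for foo: starts out pointing at the resolver stub. */
static fun_ptr got_foo = foo_resolver;
static int resolve_count = 0;   /* how many times the lookup ran */

/* Look the name up, patch the slot, then forward the call. */
static int foo_resolver(void)
{
    for (size_t i = 0; i < sizeof symtab / sizeof symtab[0]; i++) {
        if (strcmp(symtab[i].name, "foo") == 0) {
            got_foo = symtab[i].addr;
            resolve_count++;
            return got_foo();
        }
    }
    return -1;  /* unresolved symbol */
}

/* Every call site goes through the slot, never directly to foo_impl. */
int call_foo(void) { return got_foo(); }

/* Exposed so a caller can observe that resolution happened only once. */
int foo_resolve_count(void) { return resolve_count; }
```

The first call_foo() goes through the resolver, which patches the slot; later calls jump straight to foo_impl through the pointer. The calling code is never modified, only the data slot, which is why this approach works with position-independent code.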
I have been able to produce an example of runtime static linking using libbfd (which is the library sitting beneath GNU binutils' ld linker): https://github.com/bloff/runtime-static-linking-with-libbfd
What compiler options and other mechanisms are there for reducing static library size?
OS : VxWorks
Compiler : GCC
Language : C
Use -Os to optimise for smaller code size, and leave out -g and any other debug options.
If you're really concerned with the executable size after linking a static library then you should also put only one function in each source file (and hence object file). Linkers usually pull entire object files out of a static library during linking.
Are you sure you need to include the static libs in your final image? The static libs are linked into the executable at link time, so unless you are going to build a system with a working compiler/linker, you can safely remove the static libraries. Dynamic libs are another story ...
If you need to reduce the size of the static libraries, use "strip" with the right options. "strip mylib.a" without any options should do the right thing, but you might get a smaller library with a few extra options. Be careful not to remove the symbol table from the library, since the linker needs this table to do its "magic".
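You can use -ffunction-sections and -fdata-sections (single leading dash), which tell gcc to put each function and each global data variable in a separate section inside the object file. Linking with --gc-sections (for example -Wl,--gc-sections when linking through gcc) then lets the linker discard the sections that nothing references. You don't have to modify your source files.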