Why we need separate library for static and dynamic linking? - linker

There are related post here and here.
According to my understanding, static linking directly insert code(what code?machine code?) from library into executables. However, dynamic linking only insert reference(pointer?) point to somewhere in the library.
Then I am wondering why we need two separate version of library of same functionality? For example, for intel MKL, we have libmkl_sequential.a and libmkl_sequential.so. And static linking must link static library, dynamic linking must link dynamic library. Why dynamic linking can not just simply point to static library?
What is the real difference between content of .so and .a of same functionaly?

Code which you want to execute needs to be loaded in memory. A function linked statically becomes a part of your program and so they are both loaded together when the program starts.
Why dynamic linking can not just simply point to static library? Static library is a disk file, how would you want to point inside this? There must be a mechanism (loader & binder) which investigates the starting executable program, asks which functions it wants to use, and loads the corresponding libraries into memory.
Yes, the netto code (instructions) in both versions "libmkl_sequential.a" and "libmkl_sequential.so" may be identical, but static and dynamic types of libraries require different auxilliary metainformation dictated by the library format creator.

Related

Why aren't LIBs and DLLs interchangeable?

LIB files are static libraries that must be included at compile time, whereas DLL files can be "dynamically" accessed by a program during runtime. (DLLs must be linked, however, either implicitly before runtime with an import library (LIB) or explicitly via LoadLibrary).
My question is: why differentiate between these file types at all? Why can't LIB files be treated as DLLs and vice versa? It seems like an unnecessary distinction.
Some related questions:
DLL and LIB files - what and why?
Why are LIB files beasts of such a duplicitous nature?
DLL and LIB files
What exactly are DLL files, and how do they work?
You must differentiate between shareable objects and static libraries simply because they are really different objects.
A shareable object file, as a DLL or a SO, contains structures used by the loader to allow the dynamic link to and from other executable images (i.e. export tables).
A DLL is at all effects an executable image that, as an executable, can be loaded in memory and relocated (if not position independent code), but not only import symbols, as the executable do, but also expose exported symbols.
Exported symbols can be used by the loader to interlink different executable modules in memory.
A static library, on the other hand, is simply a collection of object modules to be linked in a single executable, or even a DLL.
An object module contains instruction bytecode and placeholder for external symbols which are referenced through the relocation table.
The linker collect object modules one by one each time they are referenced, i.e. a function call, and add the object to the linking code stream, than examine the relocation table of the object module, and replace the occurrence of each external symbol, with a displacement of the symbol inside the linked code. Eventually adding more object modules as new references are discovered. This is a recursive process that will end when no more undefined references remain.
At the end of linking process you have an image of your executable code in the memory. This image will be read and placed in memory by the loader, an OS component, that will fix some minor references and fills the import table with the addresses of symbols imported from DLL's.
Moreover if it is true that you can extract each single object module you need from an archive (library file), you can't extract single parts from a DLL because it is a merge of all modules without any reference for the start and the end of each.
It should be clear now that while an object module, the .obj file, or a collection of them, .lib file, is quite different from a DLL. Raw code the first, a fully linked and 'ready to run' piece of code the second.
The very reason for existence of shareable objects and static libraries is related to efficiency and resource rationalization.
When you statically link library modules you replicate same code for each executable you create using that static library, implying larger executable files that will take longer time to load wasting kernel execution time and memory space.
When you use shareable objects you load the code only the first time, then for all subsequent executables you need only to map the space where the DLL code lays in the new process memory space and create a new data segment (this must be unique for each process to avoid conflicts), effectively optimizing memory and system usage (for lighter loader workload).
So how we have to choose between the two?
Static linking is convenient when your code is used by a limited number of programs, in which case the effort to load a separate DLL module isn't worth.
Static linking also allows to easily reference to process defined global variables or other process local data. This is not possible, or not so easy, with a DLL because being a complete executable can't have undefined references so you must define any global inside the DLL, and this reference will be common for all processes accessing the DLL code.
Dynamic linking is convenient when the code is used by many programs making more efficient the loader work, and reducing the memory usage. Example of this are the system libraries, which are used by almost all programs, or compiler runtime.

What is the application of dynamic loading in c programming? [duplicate]

Routine is not loaded until it is called. All routines are kept on disk in a re-locatable load format. The main program is loaded into memory & is executed. This is called Dynamic Linking.
Why this is called Dynamic Linking? Shouldn't it be Dynamic Loading because Routine is not loaded until it is called in dynamic loading where as in dynamic linking, Linking postponed until execution time.
This answer assumes that you know basic Linux command.
In Linux, there are two types of libraries: static or shared.
In order to call functions in a static library you need to statically link the library into your executable, resulting in a static binary.
While to call functions in a shared library, you have two options.
First option is dynamic linking, which is commonly used - when compiling your executable you must specify the shared library your program uses, otherwise it won't even compile. When your program starts it's the system's job to open these libraries, which can be listed using the ldd command.
The other option is dynamic loading - when your program runs, it's the program's job to open that library. Such programs are usually linked with libdl, which provides the ability to open a shared library.
Excerpt from Wikipedia:
Dynamic loading is a mechanism by which a computer program can, at run
time, load a library (or other binary) into memory, retrieve the
addresses of functions and variables contained in the library, execute
those functions or access those variables, and unload the library from
memory. It is one of the 3 mechanisms by which a computer program can
use some other software; the other two are static linking and dynamic
linking. Unlike static linking and dynamic linking, dynamic loading
allows a computer program to start up in the absence of these
libraries, to discover available libraries, and to potentially gain
additional functionality.
If you are still in confusion, first read this awesome article: Anatomy of Linux dynamic libraries and build the dynamic loading example to get a feel of it, then come back to this answer.
Here is my output of ldd ./dl:
linux-vdso.so.1 => (0x00007fffe6b94000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f400f1e0000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f400ee10000)
/lib64/ld-linux-x86-64.so.2 (0x00007f400f400000)
As you can see, dl is a dynamic executable that depends on libdl, which is dynamically linked by ld.so, the Linux dynamic linker when you run dl. Same is true for the other 3 libraries in the list.
libm doesn't show in this list, because it is used as a dynamically loaded library. It isn't loaded until ld is asked to load it.
Dynamic loading means loading the library (or any other binary for that matter) into the memory during load or run-time.
Dynamic loading can be imagined to be similar to plugins , that is an exe can actually execute before the dynamic loading happens(The dynamic loading for example can be created using LoadLibrary call in C or C++)
Dynamic linking refers to the linking that is done during load or run-time and not when the exe is created.
In case of dynamic linking the linker while creating the exe does minimal work.For the dynamic linker to work it actually has to load the libraries too.Hence it's also called linking loader.
Hence the sentences you refer may make sense but they are still quite ambiguous as we cannot infer the context in which it is referring in.Can you inform us where did you find these lines and at what context is the author talking about?
Dynamic loading refers to mapping (or less often copying) an executable or library into a process's memory after it has started. Dynamic linking refers to resolving symbols - associating their names with addresses or offsets - after compile time.
Here is the link to the full answer by Jeff Darcy at quora
http://www.quora.com/Systems-Programming/What-is-the-exact-difference-between-Dynamic-loading-and-dynamic-linking/answer/Jeff-Darcy
I am also reading the "dinosaur book" and was confused with the loading and linking concept. Here is my understanding:
Both dynamic loading and linking happen at runtime, and load whatever they need into memory.
The key difference is that dynamic loading checks if the routine was loaded by the loader while dynamic linking checks if the routine is in the memory.
Therefore, for dynamic linking, there is only one copy of the library code in the memory, which may be not true for dynamic loading. That's why dynamic linking needs OS support to check the memory of other processes. This feature is very important for language subroutine libraries, which are shared by many programs.
Dynamic linker is a run time program that loads and binds all of the dynamic dependencies of a program before starting to execute that program. Dynamic linker will find what dynamic libraries a program requires, what libraries those libraries require (and so on), then it will load all those libraries and make sure that all references to functions then correctly point to the right place. For example, even the most basic “hello world” program will usually require the C library to display the output and so the dynamic linker will load the C library before loading the hello world program and will make sure that any calls to printf() go to the right code.
Dynamic Loading: Load routine in main memory on call.
Dynamic Linking: Load routine in main memory during execution time,if call happens before execution time it is postponed till execution time.
Dynamic loading does not require special support from Operating system, it is the responsibility of the programmer to check whether the routine that is to be loaded does not exist in main memory.
Dynamic Linking requires special support from operating system, the routine loaded through dynamic linking can be shared across various processes.
Routine is not loaded until it is called. All routines are kept on disk in a re-locatable load format. The main program is loaded into memory & is executed. This is called Dynamic Linking.
The statement is incomplete."The main program is loaded into main memory & is executed." does not specify when the program is loaded.
If we consider that it is loaded on call as 1st statement specifies then its Dynamic Loading
We use dynamic loading to achieve better space utilization
With dynamic loading a program is not loaded until it is called.All routines are kept on a disk in a relocatable load format.The main program is loaded into memory and is executed.
When a routine needs to call another routine, the calling routine first checks to see whether has been loaded.If not , the relocatable linking loader is called to load the desired routine into memory and update program's address tables to reflect this change.Then control is passed to newly loaded routine
Advantages
An unused routine is never loaded .This is most useful when the program code
is large where infrequently occurring cases are needed to handle such as
error routines.In this case although the program code is large ,used code
will be small.
Dynamic loading doesn't need special support from O.S.It is the
responsibility of user to design their program to take advantage of
method.However, O.S can provide libraries to help the programmer
There are two types of Linking Static And Dynamic ,when output file is executed without any dependencies(files=Library) at run time this type of linking is called Static where as Dynamic is of Two types 1.Dynamic Loading Linking 2.Dynamic Runtime Linking.These are Described Below
Dynamic linking refers to linking while runtime where library files are brought to primary memory and linked ..(Irrespective of Function call these are linked).
Dynamic Runtime Linking refers to linking when required,that means whenever there is a function call happening at that time linking During runtime..Not all Functions are linked and this differs in Code writing .

Is the static loading of shared libraries linked like dynamic loading or static linking?

According to this expert,
Dynamic loading refers to mapping (or less often copying) an executable or library into a process's memory after is has started. Dynamic linking refers to resolving symbols - associating their names with addresses or offsets - after compile time.
Hence, correspondingly: static loading refers to mapping an executable or libary into memory before it has started, and static linking refers to resolving symbols at compile time.
Now, when you do static loading and static linking of a library, the library's binary code is appended to your binary code, and the (function and variable) references your binary code makes to the library are patched (not sure if that's the correct term) so that they point to the correct positions.
This means that before static linking a call to a function
foo()
would give you (in x86 ASM), among others, an instruction like:
call 0x00000000
and after static linking you have something like:
call 0x00001043
where 0x00001043 is the entry point of the function foo in the binary code that is output by the linker.
Now, when you do dynamic loading and dynamic linking, you will call a library function by way of a function pointer:
typedef int (*fun_ptr)(void);
library = dlopen("mylib.so");
fun_ptr foo = dlsym(library, "foo");
foo();
This mechanism is also how C++ virtual methods work. The address of the method to be called is resolved at runtime by making a function pointer to the method part of the instance (stored in the so-called vtable).
My question is this:
When you do static loading and dynamic linking of a shared library (for context, let's say a .so in linux), does this linking patch my binary's references like in the static loading & linking scenario, or does it work by way of function pointers like in the case of dynamic loading & linking and C++ virtual methods?
When you do static loading and dynamic linking of a shared library
You don't do 'static loading' of a shared library.
Even though it looks to you as an end-user that e.g. libc.so.6 is 'static loaded' at process startup, it is not in fact the case. Rather, the kernel 'static loads' the main binary and ld-linux.so, and then ld-linux dynamic loads all the other shared libraries.
does this linking patch my binary's references like in the static loading & linking scenario, or does it work by way of function pointers like in the case of dynamic loading
It depends.
Usually shared libraries are linked from position-independent code (PIC), and work the way of function pointers (the pointers are stored in the GOT -- global offset table).
But sometimes shared libraries are linked from non-PIC code, and require "text relocations", which work similar to "static linking".
I have been able to produce an example of runtime static linking using libbfd (which is the library sitting beneath GNU binutils' ld linker): https://github.com/bloff/runtime-static-linking-with-libbfd

Dynamic library (.so) : Link time and run time

What is the main difference between using a .so file a link time or using it at run time (dlopen() etc) ?
What kind of validation is performed when used at link time ?
What is the role of the header file that lists the methods exposed out of .so and used in the target binary ?
How does the address space looks in both the cases ?
At link time the compiler will only check that the symbols are available in the library and will specify which library to link against later (in case you are linking against multiple libraries providing the same symbol)
The header file will only tell the compiler of what function prototypes are available and (depending on the programming language used) how to translate them into symbols. e.g. C++ extern "C".
If you are linking against a library then the linker will create a global address translation table in the executable which will be populated with the symbol addresses at runtime when the library is loaded. If you are opening the library with dlopen then you are responsible of creating variables holding the symbol pointers yourself via dlsym, but it allows you more flexibility like e.g. changing those during runtime, loading plugins or other functions that are unavailable at compile time.

Do I need static libraries to statically link?

On 'C', Linux,
Do I need static libraries to statically link, or the shared ones I have suffice?
If not, why not? (Don't they contain the same data?)
Yes, you need static libraries to build a statically linked executable.
Static libraries are bundles of compiled objects. When you statically link with to library, it is effectively the same as taking the compilation results of that library, unpacking them in your current project, and using them as if they were your own objects.
Dynamic libraries are already linked. This means that some information like relocations have already been fixed up and thrown out.
Additionally, dynamic libraries must be compiled as position-independent code. This is not a restriction on static libraries, and results in a significant difference in performance on some common platforms (like x86).
There exist tools like ELF Statifier which attempt to bundle dynamically-linked libraries into a dynamically-linked executable, but it is very difficult to generate a correctly-working result in all circumstances.
There is no such thing as static compilation, only static linking. And for that, you need static libraries. The difference between static and dynamic linking is that with the former, names are resolved at link-time (just after compile-time), wheras with the latter, they are resolved just as the program starts running.
Static and dynamic libraries may or may not contain the same information, depending on lots of factors. The decision on whether to statically or dynamically link your code is an important one, and will often influence application architecture.
All libraries you link into a statically linked program must be the static variant. While the dynamic (libfoo.so) and static (libfoo.a) libraries have the same functions in them, they are different format files and so you need the matching type for your program.
Another option is Ermine (http://magicErmine.com)
It's like statifier, but able to deal with memory randomization.

Resources