I am creating a shared object library which will be LD_PRELOADed with my program. In that shared library, I also want to use some variables from my program. How should such variables be declared? Note that the shared object library is compiled separately from my program.
Yes. In the library, declare the variables with extern. You must then link your program with --export-dynamic (with GCC, -rdynamic or -Wl,--export-dynamic) to make the program's symbol table accessible to the libraries it loads. If you are linking with libtool and want to control exactly which symbols are available, you can use parameters like -export-symbols-regex. If the symbols required by the library are not available when the program loads, loading fails with an undefined symbol error. Some platforms (especially Windows) require slightly different link flags; consider using libtool to make this easier if you are not already.
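As a rough sketch of the library side, assuming a hypothetical program variable named app_counter (the name and the helper function are made up for illustration): the variable is declared extern, not defined. The weak attribute is a GCC/Clang extension that the answer above does not mention; it is added here only so the library can still load when the program does not export the symbol.

```c
/* preload_lib.c: sketch of a preloaded library reading a program variable.
 * Assumption: the program defines `int app_counter;` at file scope and is
 * linked with -rdynamic so the symbol is visible to shared libraries. */
#include <stddef.h>

/* Declared, not defined: the dynamic linker binds this to the program's copy.
 * `weak` is a GCC/Clang extension; without it, a missing symbol is a load error. */
extern int app_counter __attribute__((weak));

/* Return the program's counter, or `fallback` if the symbol was not exported. */
int read_app_counter(int fallback)
{
    if (&app_counter == NULL)
        return fallback;
    return app_counter;
}
```

Such a file would typically be built with something like gcc -shared -fPIC preload_lib.c -o libpreload.so and used as LD_PRELOAD=./libpreload.so ./program. Dropping the weak attribute gives the plain behaviour described above: a missing symbol makes loading fail.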
I'm building a shared library. I need only one function in it to be public.
The shared library is built from a few object files and several static libraries. The linker complains that everything should be built with -fPIC. All the object files and most static libraries were built without this option.
This makes me ask a number of questions:
Do I have to rebuild every object file and every static library I need for this dynamic lib with -fPIC? Is it the only way?
The linker must be able to relocate object files statically, during linking. Correct? Otherwise if object files used hardcoded constant addresses they could overlap with each other. Shouldn't this mean that the linker has all the information necessary to create the global offset table for each object file and everything else needed to create a shared library?
Should I always use -fPIC for everything in the future as a default option, just in case something may be needed by a dynamic library some day?
I'm working on Linux on x86_64 currently, but I'm interested in answers about any platform.
You did not say which platform you use but on Linux it's a requirement to compile object files that go into your library as position independent code (PIC). This includes static libraries at least in practice.
Yes. See "Load-time relocation of shared libraries" and "Position Independent Code (PIC) in shared libraries".
I only use -fPIC when compiling object files that go into libraries, to avoid unnecessary overhead.
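For the "only one public function" part of the question, one possible approach (not something the answer above prescribes) is GCC/Clang symbol visibility. The sketch below uses made-up names (mylib_compute, mylib_internal_offset) and assumes the build line shown in the comment.

```c
/* mylib.c: sketch of a shared library exposing a single function.
 * Assumed build: gcc -fPIC -fvisibility=hidden -shared mylib.c -o libmylib.so */

/* File-local helper: `static` limits it to this translation unit. */
static int square(int x)
{
    return x * x;
}

/* Hidden: callable from other object files of the same library,
 * but not exported from the resulting .so. */
__attribute__((visibility("hidden"))) int mylib_internal_offset(void)
{
    return 1;
}

/* The one public entry point of the library. */
__attribute__((visibility("default"))) int mylib_compute(int x)
{
    return square(x) + mylib_internal_offset();
}
```

Note that visibility settings only affect code compiled with them; symbols pulled in from static libraries built separately keep whatever visibility they were compiled with.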
My application needs to load one or more algorithms at run time and I use .so files for this. The thing is that these libraries are not used by any process other than my application, so there is no need to share the .text section with others. Some parts of the .so come from other static libraries that I compile beforehand.
In this case, do I still have to use -fpic flag for the static files?
EDIT
I found this article. On page 7 it states: "So, if performance is important for a library or dynamically loadable module, you can compile it as non-PIC code. The primary downside to compiling the module as non-PIC is that loading time increases because the dynamic linker must make a large number of code patches when binding symbols."
Yes you do. Anything that will be loaded with dlopen must be compiled using -fpic (or -fPIC).
This is not about sharing the text segment, but about the different rules for accessing global data (including things that you might not realize are global data, such as the "procedure linkage table" trampolines used to call between global functions) in the main executable versus in shared libraries.
So I'm trying to wrap my head around static and dynamic linking. There are many resources on SO and on the web. I think I pretty much get it, but there's still one thing that seems to bother me. Also, please correct me if my overall understanding is wrong.
I think I understand static linking:
The linker unpacks the linked libraries, and actually includes the libraries' object files inside the produced executable. The unresolved-stubs in the application object files are then replaced by actual function-calling code, which calls functions in addresses known at build time.
Dynamic linking on the other hand is what puzzles me more: I understand that in dynamic linking, the stubs in the object-code which reference yet-unresolved names, are going to stay as stubs until runtime.
Then at runtime, the dynamic loader of the OS would look through precompiled libraries stored at standard filesystem locations. It would look in the object-files of the libraries, inside their symbol tables (?) and try to find a matching function definition for each unresolved-stub. It would then load the matching object-files into memory, and replace the stubs to point to the function definitions.
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory? Or does it only look in object-files specified somewhere in the application-executable file? Is this the reason why at compile time we must specify all dynamic dependencies of our program? Also, is it true dynamic libraries expose a symbol-table too?
So the part I'm missing is this: where does the OS dynamic loader look
- does it look in the symbol tables for all object-files in the system-libraries directory?
No dynamic linker I'm aware of does this.
Or does it only look in object-files
specified somewhere in the application-executable file?
Nor exactly this, either.
Details vary, but generally, a dynamic linker looks for specific shared libraries by name in various directories. The directories searched may be built into the linker, specified by the operating system, specified in the object being linked, or a combination. The linker does not (generally) examine libraries' symbol tables until after it locates them by name and selects them for linking.
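To see which objects the dynamic linker actually located and mapped for a running process, glibc provides dl_iterate_phdr. The following is a minimal sketch assuming Linux with glibc; the function names print_object and count_loaded_objects are made up for illustration.

```c
/* list_objects.c: list the objects the dynamic linker mapped into this process.
 * Assumes Linux with glibc; dl_iterate_phdr is a GNU extension. */
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Called once per loaded object; `data` points at a running count. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;
    /* The main program is reported with an empty name. */
    if (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
        printf("%s\n", info->dlpi_name);
    else
        printf("(main program)\n");
    ++*count;
    return 0; /* zero means: keep iterating */
}

/* Print every loaded object and return how many there are. */
int count_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

The names printed are the libraries selected by name at load time; their symbol tables are consulted only after this selection.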
Is this the
reason why at compile time we must specify all dynamic dependencies of
our program?
Yes, though under some circumstances we do not need to specify all dynamic dependencies at compile time. Some dynamic linkers support on-demand dynamic loading as directed by the program itself. This can be used to implement plugin systems, among other purposes.
Also, is it true dynamic libraries expose a symbol-table
too?
Yes. Dynamic libraries have their own symbol tables because:
1. The dynamic linker uses them to do its work, and
2. Dynamic libraries can have their own dynamic linking requirements, which are not necessarily reflected in the main program's.
In the normal usage, "dynamic linking" is performed by the loader. "Static linking" is performed by the linker.
Generally, linkers can create either executable files or shared libraries. The linker output for both is an instruction stream that tells the loaders how to place the executable or library in memory.
Dynamic linking on the other hand is what puzzles me more: I understand that in dynamic linking, the stubs in the object-code which reference yet-unresolved names, are going to stay as stubs until runtime
That is not [usually] correct. The linker will locate the shared library in which the symbol exists. The executable will have an instruction to find the symbol in that shared library. Linkers generally puke if they cannot find all the symbols that need to be resolved.
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory?
This is a system-specific question. In well-designed operating systems, the shared libraries are designated by the system manager. The loader uses the library specified by the system. Poorly designed systems frequently use some kind of search path to find the shared libraries (which creates a massive security hole).
I'm reading a book about gcc and the following paragraph puzzles me now:
Furthermore, shared libraries make it possible to update a library without recompiling the programs which use it (provided the interface to the library does not change).
This only refers to the programs which are not yet linked, right?
I mean, in C isn't executable code completely independent from the compiler? In which case any alteration to the library, whether its interface or implementation is irrelevant to the executable code?
A shared library is not linked until the program is executed, so the library can be upgraded/changed without recompiling (nor relinking).
For example, on Linux, one might have
/bin/myprogram
depending upon
/usr/lib64/mylibrary.so
Replacing mylibrary.so with a different version (as long as the functions/symbols that it exports are the same/compatible) will affect myprogram the next time that myprogram is started. On Linux, this is handled by the dynamic loader, /lib64/ld-linux-x86-64.so.2 or similar, which the system runs automatically when the program is started.
Contrast with a static library, which is linked at compile-time. Changes to static libraries require the application to be re-linked.
As an added benefit, if two programs share the same shared library, the memory footprint can be smaller, as the kernel can “tell” that it's the same code, and not copy it into RAM twice. With static libraries, this is not the case.
No, this is talking about code that is linked. If you link to a static library, and change the library, the executable will not pick up the changes because it contains its own copy of the original version of the library.
If you link to a shared library, also known as dynamic linking, the executable does not contain a copy of the library. When the program is run, it loads the current version of the library into memory. This allows you to fix the library, and the fixes will be picked up by all users of the library without needing to be relinked.
Libraries provide interfaces (API) to the outside world. Applications using the libraries (a .DLL for example) bind to the interface (meaning they call functions from the API). The library author is free to modify the library and redistribute a newer version as long as they don't modify the interface.
If the library authors were to modify the interface, they could potentially break all of the applications that depend on that function!
I have a dll which I'd like to use in a C program.
Do you think it is efficient to have a dll (lots of common functions) and then create a program that will eventually use them, or is it better to have all the source code?
To include the dll, What syntax must be followed?
Do you think it is efficient to have a dll (lots of common functions) and then create a program that will eventually use them, or is it better to have all the source code?
For memory and disk space, it is more efficient to use a shared library (a DLL is the Windows implementation of shared libraries), assuming that at least two programs use this component. If only one program will ever use this component, then there is no memory or disk space savings to be had.
Shared libraries can be slightly slower than statically linking the code; however, this is likely to be incredibly minor, and shared libraries carry a number of benefits that make it more than worthwhile (such as the ability to load and handle symbols dynamically, which allows for plugin-like architectures). That said, there are also some disadvantages (if you are not careful about where your DLLs live, how they are versioned, and who can update them, then you can get into DLL hell).
To include the dll, What syntax must be followed?
This depends. There are two ways that shared libraries can be used. In the first way, you tell the linker to reference the shared library, and the shared library will automatically be loaded on program startup, and you would basically reference the code like normal (include the various headers and just use the name of the symbol when you want to reference it). The second way is to dynamically load the shared library (on Windows this is done via LoadLibrary while it is done on UNIX with dlopen). This second way makes it possible to change the behavior of the program based on the presence or absence of symbols in the shared library and to inspect the available set of symbols. For the second way, you would use GetProcAddress (Windows) or dlsym (UNIX) to obtain a pointer to a function defined in the library, and you would pass around function pointers to reference the functions that were loaded.
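The second way can be sketched on UNIX roughly as follows. The helper name call_plugin and the plugin contract (an exported function taking and returning an int) are assumptions made for illustration; on Windows the corresponding calls would be LoadLibrary, GetProcAddress and FreeLibrary.

```c
/* plugin_call.c: load a shared library at run time and call one function.
 * Hypothetical contract: the library exports a function `int name(int)`.
 * On older glibc versions, link with -ldl. */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*plugin_fn)(int);

/* Returns 0 and stores the result in *out on success, -1 on any failure. */
int call_plugin(const char *path, const char *symbol, int arg, int *out)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    plugin_fn fn = (plugin_fn)dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || fn == NULL) {
        fprintf(stderr, "dlsym: %s\n", err != NULL ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    *out = fn(arg);   /* call through the function pointer */
    dlclose(handle);
    return 0;
}
```

A caller would pass the library path and symbol name, for example call_plugin("./libalgo.so", "run", 42, &result) with a hypothetical libalgo.so, and could fall back to other behaviour when the call returns -1 because the library or symbol is absent.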
You can either put your functions into a static library (a .lib), which is merged into your application at link time and is basically the same as putting the .c files in the project.
Or you can use a dll, where the functions are included at run time. The advantage of a dll is that two programs which use the same functions can use the same dll (saving disk space), and you can upgrade the dll without changing the program. Neither of these probably matters for you.
The dll is automatically loaded when your program runs; there is nothing special you need to do to include it. (You can also load a dll explicitly in your code; there are sometimes special reasons to do this.)
Edit - if you need to create a stub lib for an existing dll see http://support.microsoft.com/kb/131313