To use or not to use -fpic

My application needs to load one or more algorithms at run time, and I use .so files for this. The thing is that these libraries are not used by any process other than my application, so there is no need to share the .text section with others. Some parts of the .so come from static libraries that I compile beforehand.
In this case, do I still have to use the -fpic flag for the files that come from the static libraries?
EDIT
I found this article. On page 7 it states: "So, if performance is important for a library or dynamically loadable module, you can compile it as non-PIC code. The primary downside to compiling the module as non-PIC is that loading time increases because the dynamic linker must make a large number of code patches when binding symbols."

Yes you do. Anything that will be loaded with dlopen must be compiled using -fpic (or -fPIC).
This is not about sharing the text segment, but about the different rules for accessing global data (including things that you might not realize are global data, such as the "procedure linkage table" trampolines used to call between global functions) in the main executable versus in shared libraries.
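For a concrete picture, here is a minimal sketch of that setup; the file and function names (plugin.so, algorithm_run) are invented for illustration. The plugin is compiled as PIC and the host loads it with dlopen/dlsym:

    /* plugin.c -- hypothetical algorithm module, built as position-independent code:
     *   gcc -fPIC -shared -o plugin.so plugin.c
     */
    int algorithm_run(int x)
    {
        return x * 2;   /* stand-in for the real algorithm */
    }

    /* main.c -- host application:
     *   gcc -o app main.c -ldl
     */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *handle = dlopen("./plugin.so", RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }

        /* Look up the algorithm entry point by name. */
        int (*run)(int) = (int (*)(int))dlsym(handle, "algorithm_run");
        if (!run) {
            fprintf(stderr, "dlsym failed: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }

        printf("result: %d\n", run(21));
        dlclose(handle);
        return 0;
    }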

Related

Does everything that may end up in a shared library always need to be compiled with -fPIC?

I'm building a shared library. I need only one function in it to be public.
The shared library is built from a few object files and several static libraries. The linker complains that everything should be built with -fPIC. All the object files and most static libraries were built without this option.
This makes me ask a number of questions:
Do I have to rebuild every object file and every static library I need for this dynamic lib with -fPIC? Is it the only way?
The linker must be able to relocate object files statically, during linking. Correct? Otherwise if object files used hardcoded constant addresses they could overlap with each other. Shouldn't this mean that the linker has all the information necessary to create the global offset table for each object file and everything else needed to create a shared library?
Should I always use -fPIC for everything in the future as a default option, just in case something may be needed by a dynamic library some day?
I'm working on Linux on x86_64 currently, but I'm interested in answers about any platform.
You did not say which platform you use, but on Linux it is a requirement to compile the object files that go into your shared library as position-independent code (PIC). In practice this includes the object files that come from static libraries.
Yes. See Load-time relocation of shared libraries and Position Independent Code (PIC) in shared libraries.
To avoid unnecessary overhead, I only use -fPIC when compiling object files that go into shared libraries.
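As a rough sketch of what that rebuild looks like (the file names here are invented), the objects that came from the static library have to be recompiled with -fPIC before they can be folded into the shared library:

    /* algo.c -- code that currently lives in a static library (libalgo.a).
     * To link it into a shared library on Linux/x86-64, rebuild it as PIC:
     *
     *   gcc -c -fPIC algo.c -o algo.o          # recompile the object as PIC
     *   ar rcs libalgo.a algo.o                # optionally repack the archive
     *   gcc -shared -o libplugin.so plugin.o -L. -lalgo
     *
     * Without -fPIC on algo.o, the final link typically fails with a message
     * along the lines of "relocation R_X86_64_32 ... can not be used when
     * making a shared object; recompile with -fPIC".
     */
    int algo_sum(const int *v, int n)
    {
        int s = 0;
        for (int i = 0; i < n; ++i)
            s += v[i];
        return s;
    }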

Why plugin dynamic library can't find the symbol in the application?

I have an application that has several libraries statically linked into it.
This application loads a plugin through dlopen when it runs.
But it seems that the plugin can't resolve symbols in the application, even though I can find them with "nm".
So what can I do? Recompile the libraries as shared libraries and link them to the plugin?
You have to use the gcc flag -rdynamic when linking your application, which exports the symbols of the application for dynamic linkage with shared libraries.
From the gcc documentation:
Pass the flag -export-dynamic to the ELF linker, on targets that support it. This instructs the linker to add all symbols, not only used ones, to the dynamic symbol table. This option is needed for some uses of dlopen or to allow obtaining backtraces from within a program.
This should eliminate your problem.
The usual suggestion to add -rdynamic is too heavyweight in practice, as it causes the linker to export all functions in the executable. This will slow down program startup (due to increased time for relocation processing) and, more importantly, will eventually make the interface between your application and its plugins too wide to maintain (e.g. you won't be able to remove any function from your application for fear that it may be used by some unknown external plugin). Normally you should strive to expose a minimal, well-defined API to plugin authors.
I thus recommend providing an explicit exports file via -Wl,--dynamic-list when linking (see example usage in the Clang sources).
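As a hedged illustration of the difference (the symbol name app_log_message and the file exports.list are invented for this example), the host executable can either export everything with -rdynamic or export only the plugin-facing API with --dynamic-list:

    /* main.c -- host application that should expose exactly one function
     * to its dlopen'ed plugins.
     *
     * Heavyweight: export every global symbol in the executable:
     *   gcc -rdynamic -o app main.c -ldl
     *
     * Narrower: export only what the list file names:
     *   gcc -Wl,--dynamic-list=exports.list -o app main.c -ldl
     *
     * where exports.list contains:
     *   { app_log_message; };
     */
    #include <stdio.h>

    /* The one symbol plugins are allowed to resolve against the executable. */
    void app_log_message(const char *msg)
    {
        fprintf(stderr, "plugin: %s\n", msg);
    }

    int main(void)
    {
        /* ... dlopen() plugins here; they can call app_log_message() ... */
        return 0;
    }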

At dynamic linking, does the dynamic loader look at all object files for definitions, or only at those specified by the executable?

So I'm trying to wrap my head around static and dynamic linking. There are many resources on SO and on the web. I think I pretty much get it, but there's still one thing that seems to bother me. Also, please correct me if my overall understanding is wrong.
I think I understand static linking:
The linker unpacks the linked libraries, and actually includes the libraries' object files inside the produced executable. The unresolved stubs in the application object files are then replaced by actual function-calling code, which calls functions at addresses known at build time.
Dynamic linking on the other hand is what puzzles me more: I understand that in dynamic linking, the stubs in the object-code which reference yet-unresolved names, are going to stay as stubs until runtime.
Then at runtime, the dynamic loader of the OS would look through precompiled libraries stored at standard filesystem locations. It would look in the object-files of the libraries, inside their symbol tables (?) and try to find a matching function definition for each unresolved-stub. It would then load the matching object-files into memory, and replace the stubs to point to the function definitions.
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory? Or does it only look in object-files specified somewhere in the application-executable file? Is this the reason why at compile time we must specify all dynamic dependencies of our program? Also, is it true dynamic libraries expose a symbol-table too?
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory?
No dynamic linker I'm aware of does this.
Or does it only look in object-files specified somewhere in the application-executable file?
Nor exactly this, either.
Details vary, but generally, a dynamic linker looks for specific shared libraries by name in various directories. The directories searched may be built into the linker, specified by the operating system, specified in the object being linked, or a combination. The linker does not (generally) examine libraries' symbol tables until after it locates them by name and selects them for linking.
Is this the reason why at compile time we must specify all dynamic dependencies of our program?
Yes, though under some circumstances we do not need to specify all dynamic dependencies at compile time. Some dynamic linkers support on-demand dynamic loading as directed by the program itself. This can be used to implement plugin systems, among other purposes.
Also, is it true dynamic libraries expose a symbol-table too?
Yes. Dynamic libraries have their own symbol tables because
The dynamic linker uses them to do its work, and
Dynamic libraries can have their own dynamic linking requirements, which are not necessarily reflected in the main program's.
In the normal usage, "dynamic linking" is performed by the loader. "Static linking" is performed by the linker.
Generally, linkers can create either executable files or shared libraries. The linker output for both is an instruction stream that tells the loaders how to place the executable or library in memory.
Dynamic linking on the other hand is what puzzles me more: I understand that in dynamic linking, the stubs in the object-code which reference yet-unresolved names, are going to stay as stubs until runtime
That is not [usually] correct. The linker will locate the shared library in which the symbol exists. The executable will have an instruction to find the symbol in that shared library. Linkers generally puke if they cannot find all the symbols that need to be resolved.
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory?
This is a system-specific question. In well designed operating systems, the shared libraries are designated by the system manager. The loader uses the library specified by the system. Poorly designed systems frequently use some kind of search path to find the shared libraries (which creates a massive security hole).

linking, loading, and virtual memory

I know these questions have been asked before - but I still can't reconcile everything together into an overall picture.
static vs dynamic library
static libraries have their code copied and linked into the resulting executable
static libraries only have the required modules copied and linked into the executable, not the entire library implementation
static libraries don't need to be compiled as PIC as they are a part of the resulting executable
dynamic libraries copy and link in stubs that describe how to load/link (?) the function implementation at runtime
dynamic libraries can be PIC or relocatable
why are there separate static and dynamic libraries? All of the above seems to be the job of the static or dynamic linker. Why do I need 2 libraries that implement scanf?
(bonus #1) what does a shared library refer to? I've heard it being used as (1) the overall umbrella term, synonymous to library, (2) directly to a dynamic library, (3) using virtual memory to map the same physical memory of a library to multiple address spaces. Can you do this only with dynamic libraries? (4) having different versions of the same dynamic library in memory.
(bonus #2) are the standard libraries (libc, libc++, stdlibc++, ..) linked dynamically or statically by default? I never need to dlopen()..
static vs dynamic linking
how is this any different from static vs dynamic libraries? I don't understand why there isn't just 1 library, and we use either a static or dynamic linker (other than the PIC issue). Instead of talking about static vs dynamic libraries, should we instead be discussing the more general static vs dynamic linking?
is symbol resolution still performed at compile-time for both?
static vs dynamic loading
Static loading means copying the full executable into MM before executing it
Dynamic loading means that only the executable header is copied into MM before executing; additional functionality is loaded into MM when requested. How is this any different from paging?
If the executable is dynamically linked, why would it not be dynamically loaded?
both static loading and dynamic loading may or may not perform relocation
I know there are a lot of things I'm confused about here - and I'm not necessary looking for someone to address each issue. I'm hoping by listing out everything that is confusing me, that someone that understands this will see where a lapse in my understanding is at a broad level, and be able to paint a larger picture about how these things cooperate together..
why 2 types of lib loading
dynamic saves space (you don't have hundreds of copies of the same code in every binary that uses foo.lib)
dynamic allows the foo.lib vendor to ship a new version of the library, and existing code takes advantage of it
static makes dependency management easier - in theory a binary can be one file
What is 'shared library'
Unix name for a dynamic library. Windows calls it a DLL.
Are standard libraries static or dynamic
Depends on the platform. On some you can choose, on others it's chosen for you. For example, on Windows there are compiler switches to say whether you want static or dynamic runtimes. Note: don't confuse dynamic library usage with dlopen - see later.
'why we talk about 2 different types of library'
Typically a static library is in a different format from a dynamic one. Typically a static library is input to the linker just like any other compile unit. A dynamic library is typically output by the linker. They are used differently even though they both deliver the same chunk of code to your app
Symbol resolution is finalized at load time for a DLL
Full dynamic loading. This is the realm of dlopen. This is where you want to call entry points in a library that might not have even existed when you compiled. Use cases:
plugins that conform to a well known interface but there can be many implementations (PAM and NSS are good examples). The app chooses to load one or more implementations from specified files at run time
an app needs to load a library and call an arbitrary function. Imagine, for example, how a scripting language can load and call an arbitrary method
To use a .so on Unix you don't need to use dlopen; you can have it loaded for you (same on Windows). To truly dynamically load a shared lib / DLL you need dlopen or LoadLibrary
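For example, the implicit case looks like this (libfoo and foo_compute are made-up names): the executable simply declares the function and names the library at link time, and the loader maps the .so before main runs, with no dlopen anywhere:

    /* app.c -- uses a shared library without dlopen; the dynamic loader maps
     * libfoo.so automatically at program start-up.
     *
     *   gcc -o app app.c -L. -lfoo
     *   ./app        # ld.so finds and loads libfoo.so before main() runs
     */
    extern int foo_compute(int x);   /* defined in libfoo.so */

    int main(void)
    {
        return foo_compute(7);
    }
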
Note that statically linked libraries load faster, since there is less disk searching for all the runtime library files. If the libraries are small and very unusual, it is probably better to link statically. If there are serious version dependencies / functional differences, as with MFC, the DLLs need different names.

Difference between "dynamically loading a library file" and "specifying .so path in Makefile"?

I recently came across some code which loads a .so file with dlopen() and works with dlsym() etc. I understand that dlopen() would load the dynamic library file. What is the difference between dynamically loading a library file and specifying the .so path in the Makefile?
Another question is that if I want to dynamically load a library file, do I need to compile it with -rdynamic option?
Aren't both these compiled with -fPIC flag?
Dynamically loading a library file is frequently used in implementing software plugins.
Unlike specifying the .so path in the Makefile or static linking, dynamic loading allows a program to start up in the absence of these libraries, to discover available libraries, and to potentially gain additional functionality.
If you link against the .so file at build time in your Makefile, then you won't be able to build the application unless it is present. This has the advantage of no nasty surprises at run time.
When creating a shared object, assuming you are using gcc, -fpic only means the code can be relocated at run time; you need -shared as well. I don't know the -rdynamic option, but compilers differ.
Loading the module at run time allows the module load to be optional. For example, say you have a huge application with 300 modules, each representing different functionality. Does it make sense to map all 300 when a user might only use 10% of them? (Code is loaded on demand anyway.) It can also be used to load versions from different libraries at runtime, giving flexibility. The downside is that you can end up loading incompatible versions.
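A small sketch of that optional-module idea (the library and symbol names are invented): the application tries to dlopen an extra feature and simply falls back when the module is absent:

    /* feature.c -- optional functionality loaded only if its module exists.
     *   gcc -o app feature.c -ldl
     */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *mod = dlopen("libextra.so", RTLD_NOW);
        if (!mod) {
            /* Module not installed: the program still starts and runs. */
            printf("extra feature unavailable, using built-in behaviour\n");
            return 0;
        }

        void (*extra)(void) = (void (*)(void))dlsym(mod, "extra_feature");
        if (extra)
            extra();

        dlclose(mod);
        return 0;
    }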

Resources