Program location in the memory and static/shared libraries - c

When I run a program (in Linux), does it all get loaded into physical memory? If so, does using shared libraries instead of static libraries help in terms of caching? In general, when should I use shared libraries and when should I use static libraries? My code is written in either C or C++, if that matters.

This article covers some decent ground on what you want. This article goes much deeper into the advantages of shared libraries.
SO has also covered this topic in depth:
Difference between static and shared libraries?
When to use dynamic vs. static libraries
Almost all of the above-mentioned articles are biased toward shared libraries. Wikipedia tries to rescue static libraries :)
From wiki,
There are several advantages to statically linking libraries with an
executable instead of dynamically linking them. The most significant
is that the application can be certain that all its libraries are
present and that they are the correct version. This avoids dependency
problems. Usually, static linking will result in a significant
performance improvement.
Static linking can also allow the application
to be contained in a single executable file, simplifying distribution
and installation.
With static linking, it is enough to include those
parts of the library that are directly and indirectly referenced by
the target executable (or target library).
With dynamic libraries, the
entire library is loaded, as it is not known in advance which
functions will be invoked by applications. Whether this advantage is
significant in practice depends on the structure of the library.

Shared libraries are used mostly when you have functionality that could be used and "shared" across different programs. In that case, you will have a single point where all the programs will get their methods. However, this creates a dependency problem since now your compiled programs are dependent on that specific version of the library.
Static libraries are used mostly when you don't want to have dependency issues and don't want your program to care which X or Y libraries are installed on your target system.
So, which one should you use? To decide, answer the following questions:
Will your program be used on different platforms or Linux distributions? (e.g. Red Hat, Debian, SLES11-SP1)
Do you have replicated code that is being used by different binaries?
Do you envision that in the future other programs could benefit from your work?
I think this is a case by case decision, and it is not a one size fits all kind of answer.

Related

Static vs. Shared library for sharing RTOS library with third party (so without source code)

I know similar questions have been asked, but it's still unclear to me.
I have written a library with multiple drivers and modules for Zephyr RTOS. Now I would like to share part of that library with a company, but not as source code. The idea is to compile the relevant source code for the specific hardware they have, and then share it. This way I can control for which products it's used, and of course I don't want to share my source code with them.
At first I tried just sharing a static library, but it didn't compile for them. Shared libraries are not yet supported by Zephyr's CMake extensions, so I haven't tried that yet. If it's the way to go, I will dive into it.
What are my options? Shared library vs. static library (+ object files?)? What is recommended?
More info
Zephyr uses Device Trees. Hence, the drivers / modules I provide are compiled for a specific hardware target. I would like the company to provide me with the relevant hardware definitions so that I can provide them a pre-compiled library of my drivers/modules that works for their specific target. This library might have to be updated sometimes to include bugfixes / new functionality.
As the binaries are compiled with application + library, what would the trade-off be for Static vs. Shared library?
I think shared libraries plus header files are the way to go, as they offer advantages such as smaller binary size and the flexibility to update. You can find a nice description here:
https://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
This C++ tutorial also recommends shared libraries, because they provide flexibility on Linux and Linux-like systems:
https://www.bogotobogo.com/cplusplus/libraries.php

linking, loading, and virtual memory

I know these questions have been asked before - but I still can't reconcile everything together into an overall picture.
static vs dynamic library
static libraries have their code copied and linked into the resulting executable
static libraries only copy and link the required modules into the executable, not the entire library implementation
static libraries don't need to be compiled as PIC, as they are a part of the resulting executable
dynamic libraries copy and link in stubs that describe how to load/link (?) the function implementation at runtime
dynamic libraries can be PIC or relocatable
why are there separate static and dynamic libraries? All of the above seems to be the job of the static or dynamic linker. Why do I need 2 libraries that implement scanf?
(bonus #1) what does a shared library refer to? I've heard it being used as (1) the overall umbrella term, synonymous to library, (2) directly to a dynamic library, (3) using virtual memory to map the same physical memory of a library to multiple address spaces. Can you do this only with dynamic libraries? (4) having different versions of the same dynamic library in memory.
(bonus #2) are the standard libraries (libc, libc++, libstdc++, ...) linked dynamically or statically by default? I never need to dlopen() them.
static vs dynamic linking
how is this any different from static vs dynamic libraries? I don't understand why there isn't just 1 library, and we use either a static or dynamic linker (other than the PIC issue). Instead of talking about static vs dynamic libraries, should we instead be discussing the more general static vs dynamic linking?
is symbol resolution still performed at compile-time for both?
static vs dynamic loading
Static loading means copying the full executable into MM before executing it
Dynamic loading means that only the executable header is copied into MM before executing; additional functionality is loaded into MM when requested. How is this any different from paging?
If the executable is dynamically linked, why would it not be dynamically loaded?
both static loading and dynamic loading may or may not perform relocation
I know there are a lot of things I'm confused about here, and I'm not necessarily looking for someone to address each issue. I'm hoping that by listing everything that confuses me, someone who understands this will see where the lapse in my understanding is at a broad level, and be able to paint a larger picture of how these things cooperate.
why 2 types of lib loading
dynamic saves space (you don't have hundreds of copies of the same code in all binaries using foo.lib)
dynamic lets the foo.lib vendor ship a new version of the library, and existing code takes advantage of it
static makes dependency management easier - in theory a binary can be one file
What is 'shared library'
unix name for dynamic library. Windows calls it DLL
Are standard libraries static or dynamic
Depends on the platform. On some you can choose; on others it's chosen for you. For example, on Windows there are compiler switches to say whether you want static or dynamic runtimes. Note: don't confuse dynamic library usage with dlopen (see later).
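To see that a program can use shared libraries without ever calling dlopen, here is a minimal sketch that lists the shared objects the loader has already mapped into the running process. It assumes Linux (it reads /proc/self/maps), and matching on ".so" is only a heuristic; the function name is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Prints every mapping of this process whose path contains ".so" and
 * returns the number of such lines, or -1 if /proc/self/maps cannot be
 * opened. Assumption: Linux, where /proc/self/maps lists the mappings. */
int count_shared_object_mappings(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[4096];
    int count = 0;

    if (!maps)
        return -1;

    while (fgets(line, sizeof line, maps)) {
        /* Heuristic: shared libraries are normally named lib*.so[.N] */
        if (strstr(line, ".so")) {
            fputs(line, stdout);
            count++;
        }
    }

    fclose(maps);
    return count;
}
```

Called from a normally (dynamically) linked program, this would typically print several lines for libc and the dynamic loader itself. In a fully static build, for example one linked with gcc -static, it would be expected to find none.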
'why we talk about 2 different types of library'
Typically a static library is in a different format from a dynamic one. Typically a static library is input to the linker just like any other compile unit. A dynamic library is typically output by the linker. They are used differently even though they both deliver the same chunk of code to your app
Symbol resolution is finalized at load time for a DLL
Full dynamic loading. This is the realm of dlopen. This is where you want to call entry points in a library that might not even have existed when you compiled. Use cases:
plugins that conform to a well-known interface but can have many implementations (PAM and NSS are good examples). The app chooses to load one or more implementations from specified files at run time
an app needs to load a library and call an arbitrary function. Imagine, for example, how a scripting language can load and call an arbitrary method
To use a .so on Unix you don't need to use dlopen. You can have it loaded for you (same on Windows). To really dynamically load a shared lib / DLL, you need dlopen or LoadLibrary.
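A minimal sketch of what "full dynamic loading" looks like with dlopen and dlsym. The helper name call_from_library is hypothetical, and the library name used in the usage note ("libm.so.6") is an assumption that holds on typical glibc-based Linux systems but not everywhere.

```c
#include <dlfcn.h>
#include <stdio.h>

typedef double (*unary_fn)(double);

/* Loads `library` at run time, looks up `symbol` as a double(double)
 * function, calls it with x and stores the value in *result.
 * Returns 0 on success, -1 if the library or symbol cannot be found. */
int call_from_library(const char *library, const char *symbol,
                      double x, double *result)
{
    void *handle = dlopen(library, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    unary_fn fn;
    *(void **)&fn = dlsym(handle, symbol); /* POSIX-suggested cast idiom */
    const char *error = dlerror();
    if (error) {
        fprintf(stderr, "dlsym: %s\n", error);
        dlclose(handle);
        return -1;
    }

    *result = fn(x);  /* call while the library is still loaded */
    dlclose(handle);
    return 0;
}
```

For example, call_from_library("libm.so.6", "cos", 0.0, &r) would look up cos by name at run time; neither the library nor the symbol name has to be known when the program is compiled. On glibc older than 2.34 the program also needs to be linked with -ldl.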
Note that statically linked libraries load faster, since there is less disk searching for all the runtime library files. If the libraries are small, and very unusual, probably better to link statically. If there are serious version dependencies / functional differences like MFC, the DLLs need different names.

Does the code have an increase in performance if it is built using a single static library instead of multiple static libraries?

Is it possible to get better performance from the code when all routines are within the same library?
Or, to rephrase it, does the performance of the code degrade when some part of the code is moved into another library?
Question: will your program run only once, or will it be run frequently?
If it's the former, and if we assume the shared libraries are not in memory, then yes, the static binary would have a slight increase in performance, but only by a matter of milliseconds.
Most likely if you are linking against libc or msvcrt (on Windows) those are already in memory and you don't save much anyways other than just having a huge binary.
Let us consider the latter case ...
I don't think the performance improvements are worth building statically and ending up with a huge binary. If your application uses common shared libraries (or DLLs), then all of those libraries will already have been loaded into memory.
Hope that helps.
See here for additional responses Static vs. Dynamic Library Performance.

Should static libraries be always built with same compiler options as the application?

We have a reusable library which gets delivered across to multiple products. Most of the products are in VxWorks and use gcc compiler. But, each of them will be on different architectures like PPC, MIPS and in PPC itself there are more types like 8531, 8620 etc.
Currently, I build static libs for each of these boards separately and deliver them. Is there any way to build a common library that can be used across all these different architectures?
Also, currently I try to ensure that the compiler options are the same as those of the products. Is it necessary? Is there any information available on the internet that classifies which options must be kept the same for static libraries and applications?
No, there is no other way: you must build the libraries (static or not) for each platform.
As you probably already know, a static library is really just a container storing a bunch of object files. Each object file contains binary code specific to the platform that the library was built for (read: a different set of assembly instructions).
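The "container" here is an ar archive, and every regular ar archive starts with the 8-byte magic string "!<arch>\n". A minimal sketch that checks a file for that magic; the function name is made up for illustration, and GNU "thin" archives, which use a different magic, are deliberately not recognised.

```c
#include <stdio.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8

/* Returns 1 if `path` starts with the ar archive magic, 0 if it does
 * not, and -1 if the file cannot be opened. */
int is_ar_archive(const char *path)
{
    unsigned char header[AR_MAGIC_LEN];
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    size_t n = fread(header, 1, sizeof header, f);
    fclose(f);

    return n == sizeof header &&
           memcmp(header, AR_MAGIC, AR_MAGIC_LEN) == 0;
}
```

Running this against a lib*.a file should return 1. The object files stored inside are ordinary per-architecture objects, which is why the archive as a whole is tied to one platform.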
Yes, using the same compiler options for the library and for the binary (program) that uses it is a very good practice, because it avoids potentially very nasty problems. Some options are binary incompatible: for example, a library function may be compiled with an option that makes it return (or expect) data in a register, while your main program expects that data at an address on the stack. That means big trouble.
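One way to see this kind of binary incompatibility for yourself is to print a struct's layout under different flags. The sketch below uses a made-up struct; with GCC, options such as -fshort-enums and -fpack-struct are examples of flags that can change the numbers it prints, so a library and an application built with different settings would disagree about where the fields live.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical types shared between a library and its users. */
enum color { RED, GREEN, BLUE };

struct record {
    char       tag;
    enum color color;
    double     value;
};

/* Prints the layout as seen by this translation unit. Compile the file
 * with and without e.g. -fshort-enums or -fpack-struct and compare. */
void print_record_layout(void)
{
    printf("sizeof(enum color)      = %zu\n", sizeof(enum color));
    printf("offsetof(record, color) = %zu\n", offsetof(struct record, color));
    printf("offsetof(record, value) = %zu\n", offsetof(struct record, value));
    printf("sizeof(struct record)   = %zu\n", sizeof(struct record));
}

/* Size of the struct under the flags this file was compiled with. */
size_t record_size(void)
{
    return sizeof(struct record);
}
```

If the library writes value at one offset and the application reads it from another, nothing fails at link time; the program simply reads garbage at run time, which is what makes these mismatches so nasty.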
It depends on the option: platform and architecture options must be the same, obviously.
Others, such as optimization, debug, and profiling options, can differ.
Consider that a library may be provided by an external developer, in which case you don't really know how they compiled it; you only know its platform and architecture requirements.
Also, currently I try to ensure that compiler options are same as that of the products. Is it necessary?
2. Necessary? No. In fact, most libraries can be considered standalone and not tied to any particular product (i.e. they are usable from many products). As such, product-specific flags just don't belong in the library, and vice versa (library-implementation-specific flags are not supposed to appear when compiling the products' objects).

Compile shared library into a program?

I wrote a program, that uses a shared library installed on my system. This library is seldom installed on other systems. How do I compile my program so that the library doesn't need to be installed on other systems? I have the source code for the library available. What's the best way?
The other systems of course have the same architecture and OS.
Compile it as a static library and link that into the executable.
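A minimal sketch of what that could look like with GCC and ar. The file names, the function, and the program name are all hypothetical stand-ins for the real library; the build commands are shown in the leading comment.

```c
/* mylib.c - hypothetical stand-in for the library's source code.
 *
 * Build it as a static archive and link it into the program:
 *
 *   gcc -c mylib.c -o mylib.o          # compile to an object file
 *   ar rcs libmylib.a mylib.o          # pack the object into an archive
 *   gcc main.c libmylib.a -o program   # needed objects are copied in
 *
 * Afterwards `ldd program` should no longer list the library, so it
 * does not have to be installed on the other systems.
 */

/* Declared in a (hypothetical) mylib.h that main.c would include. */
int mylib_add(int a, int b)
{
    return a + b;
}
```

The system libc is still linked dynamically in this setup, which is usually fine when the other systems run the same OS and architecture.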
Though the OP solved his problem by answering a different question, there are (at least) two ways to wedge a shared library into your binary. They are useful when:
there is no source code available
there is no compiler (or build chain) available
static linking does not work, or it's not obvious how to do it
you need to preserve the memory layout: static linking changes it and may "wake up" hidden bugs
you want to "permanently link" an LD_PRELOAD library into the executable
The first is statifier (open source but limited to x86 and x86_64 and only object code)
The second that I know of is magic ermine (by the same developer). It is closed source, but the developer is friendly to opensource projects and ermine has the advantage of supporting more platforms as well as the ability to include all necessary data files within its virtual file system.
http://statifier.sourceforge.net/ and http://www.magicermine.com/

Resources