Shared objects overhead


We have a very modular application with a lot of shared objects (.so). Some people argue that on low-end platforms with limited memory/flash, it is better to statically link everything into one big executable as shared objects have overhead.
What is your opinion on this?
Best Regards,
Paul

The costs of shared libraries are roughly (per library):
At least 4k of private dirty memory.
At least 12k of virtual address space.
Several filesystem access syscalls, mmap and mprotect syscalls, and at least one page fault at load time.
Time to resolve symbol references in the library code.
Plus position-independent code costs:
Loss of one general-purpose register (this can be huge on x86 (32bit) but mostly irrelevant on other archs).
Extra level of indirection accessing global/static variables (and constants).
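As a rough back-of-the-envelope check, the per-library minimums listed above can be multiplied out for a given number of libraries. The sketch below simply assumes the 4k and 12k lower bounds quoted in this answer; the struct and function names are made up for illustration, and real figures depend on the platform and page size.

```c
#include <stddef.h>

/* Lower bounds per shared library, taken from the figures quoted above. */
#define DIRTY_BYTES_PER_LIB (4u * 1024u)   /* private dirty memory */
#define VADDR_BYTES_PER_LIB (12u * 1024u)  /* virtual address space */

struct so_overhead {
    size_t dirty_bytes;
    size_t vaddr_bytes;
};

/* Minimum fixed overhead of loading n_libs shared objects into one process. */
static struct so_overhead estimate_so_overhead(size_t n_libs)
{
    struct so_overhead o;
    o.dirty_bytes = n_libs * DIRTY_BYTES_PER_LIB;
    o.vaddr_bytes = n_libs * VADDR_BYTES_PER_LIB;
    return o;
}
```

For example, estimate_so_overhead(50) gives 200k of private dirty memory and 600k of address space per process, before any of the libraries' own data is counted.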
If you have a large long-running application, the costs may not matter to you at all, unless you're on a tiny embedded system. On the other hand, if you're writing something that may be invoked many times for short tasks (like a language interpreter) these costs can be huge. Putting all of the standard modules in their own .so files rather than static linking them by default is a huge part of why Perl, Python, etc. are so slow to start.
Personally, I would go with the strategy of using dynamically loaded modules as an extensibility tool, not as a development model.
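A minimal sketch of what "extensibility tool" could look like in practice: the core program is linked normally, and only optional modules are pulled in with dlopen(3) when they are actually needed. The helper name call_plugin and the assumption that a plugin exports a `double f(double)` function are hypothetical; on older glibc versions you may also need to link with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Load `path`, look up a `double f(double)` named `symbol`, and call it
 * with x. Returns 0 on success and stores the result in *out; returns -1
 * if the library or the symbol cannot be found. */
static int call_plugin(const char *path, const char *symbol,
                       double x, double *out)
{
    double (*fn)(double);
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    /* POSIX idiom for converting dlsym's void* to a function pointer. */
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(x);
    dlclose(handle);  /* the module can be unmapped again after use */
    return 0;
}
```

The per-library costs are then only paid for modules that a given run actually uses, rather than for every module at every startup.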

Unless memory is extremely tight, the size of one copy of these files is not the primary determining factor. Given that this is an embedded system, you probably have a good idea of what applications will be using your libraries and when. If your application opens and closes the multiple libraries it references dutifully, and you never have all the libraries open simultaneously, then the shared library will be a significant savings in RAM.
The other factor you need to consider is the performance penalty. Opening a shared library takes a small (usually trivial) amount of time; if you have a very slow processor or hard-to-hit real-time requirements, the static library will not incur the loading penalty of the shared library. Profile to find whether this is significant or not.
To sum up, shared libraries can be significantly better than static libraries in some special cases. In most cases, they do little to no harm. In simple situations, you get no benefit from shared libraries.
Of course, the shared library will be a significant savings in Flash if you have multiple applications (or versions of your application) which use the same library. If you use a static library, one copy (which is about the same size as the shared library[1]) will be compiled into each. This is useful when you're on a PC workstation. But you knew that. You're working with a library which is only used by one application.
[1] The memory difference of the individual library files is small. Shared libraries add an index and symbol table so that dlopen(3) can load the library. Whether or not this is significant will depend on your use case; compile for each and then compare the sizes to determine which is smaller in Flash. You'll have to run and profile to determine which consumes more RAM; they should be similar except for the initial loading of the shared library.

Having lots of libraries of course means more meta-data must be stored, and also some of that meta-data (library section headers etc.) will need to be stored in RAM when loaded. But the difference should be pretty negligible, even on (moderately modern) embedded systems.
I suggest you try both alternatives, and measure the used space in both FLASH and RAM, and then decide which is best.
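For the RAM side of that measurement on Linux, one option is to read the VmRSS field that proc(5) documents in /proc/<pid>/status. The helper below is only a sketch: the function name is made up, and it assumes the usual "VmRSS: <number> kB" line format.

```c
#include <stdio.h>
#include <string.h>

/* Return the VmRSS value (in kB) from a /proc/<pid>/status-style file,
 * or -1 if the file cannot be read or has no parsable VmRSS line. */
static long read_vm_rss_kb(const char *status_path)
{
    char line[256];
    long kb = -1;
    FILE *f = fopen(status_path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            if (sscanf(line + 6, "%ld", &kb) != 1)
                kb = -1;
            break;
        }
    }
    fclose(f);
    return kb;
}
```

Calling read_vm_rss_kb("/proc/self/status") from both the statically and the dynamically linked build, at the same point in the program, gives two numbers that can be compared directly. Flash usage can be compared by looking at the total size of the files each build installs.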

Related

Which uses more RAM at run time, dynamic linking or static linking?

I know that dynamically linked executables are smaller on disk, but do they use more RAM at run time? If so, why?

The answer is "it depends how you measure it", and also "it depends which platform you're running on".
Static linking uses less runtime RAM, because for dynamic linking the entire shared object needs to be loaded into memory (I'll be qualifying this statement in a second), whilst with static linking only those functions you actually need are loaded.
The above statement isn't 100% accurate. Only the shared object pages that actually contain code you use are loaded. This is still much less efficient than statically linking, which compresses those functions together.
On the other hand, dynamic linking uses much less runtime RAM, as all programs using the same shared object use the same in RAM copy of the code (I'll be qualifying this statement in a second).
The above is a true statement on Unix like systems. On Windows, it is not 100% accurate. On Windows (at least on 32bit Intel, I'm not sure about other platforms), DLLs are not compiled with position independent code. As such, each DLL carries the (virtual memory) load address it needs to be loaded at. If one executable links two DLLs that overlap, the loader will relocate one of the DLLs. This requires patching the actual code of the DLL, which means that this DLL now carries code that is specific to this program's use of it, and cannot be shared. Such collisions, however, should be rare, and are usually avoidable.
To illustrate with an example, statically linking glibc will probably cause you to consume more RAM at run time, as this library is, in all likelihood, already loaded in RAM before your program even starts. Statically linking some unique library only your program uses will save run time RAM. The in-between cases are in-between.
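The page-granularity point made earlier in this answer can be illustrated with a small calculation. The sketch below counts how many distinct pages a set of functions occupies, assuming 4096-byte pages; the names and the offsets used in the example are invented for illustration.

```c
#include <stddef.h>

#define ASSUMED_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

struct func_span {
    size_t offset;  /* start of the function within the mapped file */
    size_t size;    /* length of the function in bytes */
};

/* Count the distinct pages covered by the given functions.
 * The spans must be sorted by offset and must not overlap. */
static size_t pages_touched(const struct func_span *f, size_t n)
{
    size_t count = 0;
    size_t next_uncounted = 0;  /* first page index not yet counted */
    for (size_t i = 0; i < n; i++) {
        if (f[i].size == 0)
            continue;
        size_t first = f[i].offset / ASSUMED_PAGE_SIZE;
        size_t last = (f[i].offset + f[i].size - 1) / ASSUMED_PAGE_SIZE;
        if (first < next_uncounted)
            first = next_uncounted;  /* skip pages already counted */
        if (last >= first) {
            count += last - first + 1;
            next_uncounted = last + 1;
        }
    }
    return count;
}
```

Three 200-byte functions scattered through a shared object at offsets 0x0100, 0x5000 and 0x9000 touch three pages, while the same three functions packed back to back by a static link fit in one.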

Different processes using the same DLL/.so file can share its read-only memory pages; this includes the code (text) pages.
However, each DLL loaded in a given program has to have its own page for writable global or static data. These pages may be 4/16/64k or bigger depending on the OS. With static linking, the static data of all the libraries can be packed together into the same pages.
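To put numbers on the writable-data point, the following sketch compares the two layouts. It assumes each shared library rounds its writable data up to whole pages on its own, while a static link lays all of it out contiguously (alignment padding is ignored); the function names are illustrative only.

```c
#include <stddef.h>

/* Pages needed for `bytes` of data, rounded up to whole pages. */
static size_t pages_for(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Shared libraries: every library's writable data gets its own page(s). */
static size_t data_pages_shared(const size_t *data_bytes, size_t n,
                                size_t page_size)
{
    size_t pages = 0;
    for (size_t i = 0; i < n; i++)
        pages += pages_for(data_bytes[i], page_size);
    return pages;
}

/* Static linking: all writable data is packed together first. */
static size_t data_pages_static(const size_t *data_bytes, size_t n,
                                size_t page_size)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += data_bytes[i];
    return pages_for(total, page_size);
}
```

With ten libraries that each have 300 bytes of writable data and 4k pages, the shared layout needs ten pages (40k) and the static layout needs one.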

Programs, when running on common operating systems like Linux, Windows, MacOSX, Android, ...., are running as processes having some virtual address space. This uses virtual memory (implemented by the kernel driving the MMU).
Read a good book like Operating Systems: Three Easy Pieces to understand more.
So programs don't directly consume RAM. RAM is a resource managed by the kernel. When RAM becomes scarce, your system experiences thrashing. Read also about the page cache and about memory overcommitment (a feature that I dislike and that I often disable).
The advantage of using a shared library, when the same library is used by several processes, is that its code segment appears (technically, is paged in) only once in RAM.
However, dynamic linking has a small overhead (even in memory), e.g. to resolve relocations. So if a library is used by only one process, it might consume slightly more RAM than if it were statically linked. In practice you should not bother most of the time, and I recommend using dynamic linking systematically.
And in practice, for huge processes (such as your browser), the data and the heap consume much more RAM than the code.
On Linux, Drepper's paper How To Write Shared Libraries explains a lot of things in detail.
On Linux, you might use proc(5) and pmap(1) to explore virtual address spaces. For example, try cat /proc/$$/maps and cat /proc/self/maps and pmap $$ in a terminal. Use ldd(1) to find out the dynamic library dependencies of a program, e.g. ldd /bin/cat. Use strace(1) to find out what syscalls(2) are used by a process. Those relevant to the virtual address space include mmap(2) and munmap(2), mprotect(2), mlock(2), the old, obsolete sbrk(2), and execve(2).
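The same information can be read from inside a program. The sketch below counts how many lines of a /proc/<pid>/maps-style file mention a given substring, for example ".so" to find shared-object mappings. The function name is made up, and the code assumes every line fits in the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Count the lines of a /proc/<pid>/maps-style file that contain `needle`.
 * Returns -1 if the file cannot be opened.
 * Assumption: no line is longer than the buffer below. */
static int count_mappings(const char *maps_path, const char *needle)
{
    char line[8192];
    int count = 0;
    FILE *f = fopen(maps_path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strstr(line, needle) != NULL)
            count++;
    }
    fclose(f);
    return count;
}
```

Calling count_mappings("/proc/self/maps", ".so") from a dynamically linked program typically reports several mappings per library (code, read-only data, writable data), which makes the per-library cost visible.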

Embedded systems: static or dynamic linking

For embedded systems where the program runs independently on a microcontroller:
Are the programs always statically linked, or may they be dynamically linked on certain occasions?

From Wikipedia:
a dynamic linker is the part of an operating system that loads and
links the shared libraries needed by an executable when it is executed
(at "run time"), by copying the content of libraries from persistent
storage to RAM, and filling jump tables and relocating pointers.
So it implies that dynamic linking is possible only if:
1) You have some kind of OS
2) You have some kind of persistent storage / file system.
On bare-metal micros this is usually not the case.
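The "filling jump tables" part of that quote can be modelled in a few lines. This is a toy, not how a real dynamic linker is implemented: the export table, the import slots and all names are invented, and the "library" lives in the same source file.

```c
#include <stddef.h>
#include <string.h>

typedef int (*binop_fn)(int, int);

static int lib_add(int a, int b) { return a + b; }
static int lib_mul(int a, int b) { return a * b; }

/* What the toy "library" exports: symbol name -> address. */
struct export_entry { const char *name; binop_fn addr; };
static const struct export_entry lib_exports[] = {
    { "add", lib_add },
    { "mul", lib_mul },
};

/* What the toy "executable" imports: a slot per symbol, filled at load time. */
struct import_slot { const char *name; binop_fn target; };

/* Fill every slot from the export table.
 * Returns the number of symbols that could not be resolved. */
static size_t toy_link(struct import_slot *slots, size_t n_slots)
{
    size_t n_exports = sizeof lib_exports / sizeof lib_exports[0];
    size_t unresolved = 0;
    for (size_t i = 0; i < n_slots; i++) {
        slots[i].target = NULL;
        for (size_t j = 0; j < n_exports; j++) {
            if (strcmp(slots[i].name, lib_exports[j].name) == 0) {
                slots[i].target = lib_exports[j].addr;
                break;
            }
        }
        if (slots[i].target == NULL)
            unresolved++;
    }
    return unresolved;
}
```

After toy_link() has run, the program calls through slots[i].target instead of a fixed address. The name lookup is the run-time work that static linking does once, at build time.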

Simply put: if there is a full-fledged operating system like Linux running on the microcontroller, then dynamic linking is possible (and common).
Without such an OS, you very, very likely use static linking. In this case the linker will (basically) not only link the modules and libraries, but also take over the functions that are otherwise performed by the OS program loader.
Let's stay with these (smaller) embedded systems for now.
Apart from static or dynamic linking, the linker also does relocation. Simply put, this changes the internal (relative) addresses of the program to the absolute addresses on the target device.
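As a toy model of that relocation step: the image below is an array of 32-bit words, and a relocation table lists which words hold addresses that are relative to the start of the image. The function name, the table format and the base address in the example are all assumptions made for illustration; real object formats are considerably more involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Add `load_base` to every word listed in the relocation table.
 * `reloc_idx` holds word indices into `image`, not byte offsets. */
static void toy_relocate(uint32_t *image, const size_t *reloc_idx,
                         size_t n_relocs, uint32_t load_base)
{
    for (size_t i = 0; i < n_relocs; i++)
        image[reloc_idx[i]] += load_base;
}
```

With a load base of 0x08000000, a word holding the relative address 0x10 becomes 0x08000010, while words that are not listed in the table (plain data or instructions) are left untouched.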

It is not common on simple embedded systems primarily because it is neither necessary nor supported by the operating system (if any). Dynamic linking implies a certain amount of runtime operating system support.
The embedded RTOS VxWorks supports dynamic linking in the sense that it can load and link partially linked object code from a network or file system at runtime. Similarly, larger embedded RTOSs such as QNX support dynamic linking, as does embedded Linux.
So yes, large embedded systems may support dynamic linking. In many cases it is used primarily as a means to link LGPL licensed code to a closed source application. It can also be used as a means of simplifying and minimising the impact of deploying changes and updates to large systems.

Does the code have an increase in performance if it is built using a single static library instead of multiple static libraries?

Is it possible to get better performance from the code with all routines within the same library?
Or, to rephrase it, does the performance of the code degrade when some part of the code is moved into another library?

Question: will your program be run only once, or will it be run frequently?
If it's the former, and if we assume the shared libraries are not already in memory, then yes, the static binary would be slightly faster, but only by a matter of milliseconds.
Most likely, if you are linking against libc (or msvcrt on Windows), those are already in memory, so you don't gain much other than a huge binary.
Let us consider the latter case ...
I don't think the performance improvement is worth statically building a huge binary. If your application uses common shared libraries (or DLLs), then all of those libraries would have been loaded in memory already.
Hope that helps.
See here for additional responses: Static vs. Dynamic Library Performance.

Program location in the memory and static/shared libraries

When I run a program (in Linux), does it all get loaded into physical memory? If so, does using shared libraries instead of static libraries help in terms of caching? In general, when should I use shared libraries and when should I use static libraries? My code is written in either C or C++, if that matters.

This article covers some decent ground on what you want. This article goes much deeper into the advantages of shared libraries.
SO has also covered this topic in depth:
Difference between static and shared libraries?
When to use dynamic vs. static libraries
Almost all of the above-mentioned articles are biased towards shared libraries. Wikipedia tries to rescue static libraries :)
From Wikipedia:
There are several advantages to statically linking libraries with an
executable instead of dynamically linking them. The most significant
is that the application can be certain that all its libraries are
present and that they are the correct version. This avoids dependency
problems. Usually, static linking will result in a significant
performance improvement.
Static linking can also allow the application
to be contained in a single executable file, simplifying distribution
and installation.
With static linking, it is enough to include those
parts of the library that are directly and indirectly referenced by
the target executable (or target library).
With dynamic libraries, the
entire library is loaded, as it is not known in advance which
functions will be invoked by applications. Whether this advantage is
significant in practice depends on the structure of the library.

Shared libraries are used mostly when you have functionality that could be used and "shared" across different programs. In that case, you will have a single point where all the programs will get their methods. However, this creates a dependency problem since now your compiled programs are dependent on that specific version of the library.
Static libraries are used mostly when you don't want to have dependency issues and don't want your program to care which X or Y libraries are installed on your target system.
So, which one should you use? To decide, answer the following questions:
Will your program be used on different platforms or Linux distributions? (e.g. Red Hat, Debian, SLES11-SP1)
Do you have replicated code that is being used by different binaries?
Do you envision that in the future other programs could benefit from your work?
I think this is a case-by-case decision; there is no one-size-fits-all answer.

Are there operating systems that aren't based off of or don't use a file/directory system?

It seems like there isn't anything inherent in an operating system that would necessarily require that sort of abstraction/metaphor.
If so, what are they? Are they still used anywhere? I'd be especially interested in knowing about examples that can be run/experimented with on a standard desktop computer.

Examples are Persistent Haskell, Squeak Smalltalk, and KeyKOS and its descendants.
It seems like there isn't anything inherent in an operating system
that would necessarily require that sort of abstraction/metaphor.
There isn't any necessity, it's completely bogus. In fact, forcing everything to be accessible via a human readable name is fundamentally flawed, and precludes security due to Zooko's triangle.
Examples of hierarchies similar to this appear as well in DNS, URLs, programming language module systems (Python and Java are two good examples), torrents, and X.509 PKI.
One system that fixes some of the problems caused by DNS/URLs/X.509 PKI is Waterken's YURL.
All these systems exhibit ridiculous problems because the system is designed around some fancy hierarchy instead of for something that actually matters.
I've been planning on writing some blog posts explaining why these types of systems are bad. I'll update with links to them when I get around to it.

I found this http://pages.stern.nyu.edu/~marriaga/papers/beyond-the-hfs.pdf but it's from 2003. Is something like that what you are looking for?

About 1995, I started to design an object oriented operating system
(SOOOS) that has no file system.
Almost everything is an object that exists in virtual memory
which is mapped/paged directly to the disk
(either local or networked, i.e. rudimentary cloud computing).
There is a lot of overhead in programs to read and write data in specific formats.
Imagine never reading and writing files.
In SOOOS there are no such things as files and directories.
Autonomous objects, which would essentially replace files, can be organized
suiting your needs, not simply a restrictive hierarchical file system.
There are no low-level drive format structures (i.e. clusters)
that have additional level of abstraction and translation overhead.
SOOOS Data storage overhead is simply limited to page tables
that can be quickly indexed as with basic virtual memory paging.
Autonomous objects each have their own dynamic
virtual memory space which serves as the persistent data store.
When active they are given a task context and added to the active process task list
and then exist as processes.
A lot of complexity is eliminated in my design: simply instantiate objects
in a program and let the memory manager and virtual memory system handle
everything consistently with minimal overhead.
Booting the operating system is simply a matter of loading the basic kernel,
setting up the virtual memory page tables to the key OS objects and
(re)starting the OS object tasks. When the computer is turned off,
shutdown is essentially analogous to hibernation,
so the OS is nearly in instant-on status.
The parts (pages) of data and code are loaded only as needed.
For example to edit a document, instead of starting a program by loading the entire
executable in memory, simply load the task control structure of the
autonomous object and set the instruction pointer to the function to be performed.
The code is paged in only as the instruction pointer traverses its virtual memory.
Data is always immediately ready to be used and simply paged in only as accessed
with no need to parse files and manage data structures which often
have a distinct representation in memory from secondary storage.
Simply use the program's native memory allocation mechanism and
abstract data types without disparate and/or redundant data structures.
Object Linking and Embedding type of program interaction,
memory mapped IO, and interprocess communication you
get practically for free as one would implement
memory sharing using the facilities of the processor's Memory Management Unit.
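On a conventional POSIX system, the closest everyday analogue to "objects that live in virtual memory and are paged to disk" is a file-backed shared mapping. The sketch below is not SOOOS code; it only illustrates the idea using mmap(2), with a hypothetical counter_object struct and helper name. The object is modified in place through a pointer and persists across calls without any explicit read or write of a file format.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct counter_object {
    uint32_t value;
};

/* Map the object stored at `path`, increment it in place, and return the
 * new value, or -1 on error. A new backing file starts out zero-filled. */
static long bump_persistent_counter(const char *path)
{
    long result = -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Make sure the backing store is large enough for one object. */
    if (ftruncate(fd, (off_t)sizeof(struct counter_object)) == 0) {
        struct counter_object *obj =
            mmap(NULL, sizeof *obj, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (obj != MAP_FAILED) {
            obj->value += 1;  /* plain memory access, paged in on demand */
            result = (long)obj->value;
            munmap(obj, sizeof *obj);
        }
    }
    close(fd);
    return result;
}
```

Calling the function twice on a fresh path returns 1 and then 2: the second call sees the state left behind by the first purely through the mapping.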