Combining static libraries

Combining static libraries - c

Suppose I have three C static libraries say libColor.a which depends on *libRGB.*a which in turn depends on libPixel.a . The library libColor.a is said to depend on library libRGB.a since there are some references in libColor.a to some of symbols defined in libRGB.a. How do I combine all the above libraries to a new libNewColor.a which is independent?
Independent means the new library should have all symbols defined. So while linking I just need to give -lNewColor. The size of the new library should be minimal i.e it should not contain any symbols in libRGB.a which is not used by libColor.a etc.
I tried my luck using various options in ar command (used to create and update static libraries/archives).

A little used feature of the GNU archiver is the archive script, it is a simple but powerful interface, and it can do exactly what you want, for example if the following script is called script.ar:
CREATE libNewColor.a
ADDLIB libColor.a
ADDLIB libRGB.a
ADDLIB libPixel.a
SAVE
END
Then you could invoke ar as follows:
ar -M < script.ar
and you would get libNewColor.a that contains all of the .o files from libColor.a libRGB.a and libPixel.a.
Additionally you can also add regular .o files as well with the ADDMOD command:
CREATE libNewColor.a
ADDLIB libColor.a
ADDLIB libRGB.a
ADDLIB libPixel.a
ADDMOD someRandomCompiledFile.o
SAVE
END
Furthermore it is super easy to generate these scripts in Makefiles, so I typically create a somewhat generic makefile rule for creating archives which actually generates the script and invokes ar on the script. Something like this:
$(OUTARC): $(OBJECTS)
$(SILENT)echo "CREATE $#" > $(ODIR)/$(ARSCRIPT)
$(SILENT)for a in $(ARCHIVES); do (echo "ADDLIB $$a" >> $(ODIR)/$(ARSCRIPT)); done
$(SILENT)echo "ADDMOD $(OBJECTS)" >> $(ODIR)/$(ARSCRIPT)
$(SILENT)echo "SAVE" >> $(ODIR)/$(ARSCRIPT)
$(SILENT)echo "END" >> $(ODIR)/$(ARSCRIPT)
$(SILENT)$(AR) -M < $(ODIR)/$(ARSCRIPT)
Though now that I look at it I guess it doesn't work if $(OBJECTS) is empty (i.e. if you just want to combine archives without adding extra object files) but I will leave it as an exercise for the reader to fix that issue if needed... :D
Here are the docs for this feature:
https://sourceware.org/binutils/docs/binutils/ar-scripts.html#ar-scripts

1/ Extract ALL of the object files from each library (using ar) and try to compile your code without the libraries or any of the object files. You'll probably get an absolute bucket-load of undefined symbols. If you get no undefined symbols, go to step 5.
2/ Grab the first one and find out which object file satisfies that symbol (using nm).
3/ Write down that object file then compile your code, including the new object file. You'll get a new list of undefined symbols or, if there's none, go to step 5.
4/ Go to step 2.
5/ Combine all the object files in your list (if any) into a single library (again with ar).
Bang! There you have it. Try to link your code without any of the objects but with the new library.
This whole thing could be relatively easily automated with a shell script.

A static library is not much more than an archive of some object files (.o). What you can do is extract all the objects in the two libraries (using "ar x") and then use "ar" to link them together in a new library.

Related

What does it mean to 'add an index to an archive file'?

My C textbook creates an archive using the -s command line option which adds an index to the following archive file.
ar -rcs libfile.a file1.o file2.o
However, I don't understand what an index is or it's purpose?

Converting comments into an answer.
There's a long, complicated history tied up in part with the ranlib program. Basically, for the linker to be able to scan a library efficiently, some program — either ar itself or ranlib — adds a special 'index' file which identifies the symbols defined in the library and the object file within the archive that defines each of those symbols. Many versions of ar add that automatically if any of the saved files is an object file. Some, it seems, require prodding with the -s option. Others don't add the index file at all and rely on ranlib to do the job.
The ar on macOS documents:
-s — Write an object-file index into the archive, or update an existing one, even if no other change is made to the archive. You may use this modifier flag either with any operation, or alone. Running ar s on an archive is equivalent to running ranlib on it.
I've not needed to use this option explicitly on macOS for a long time (nor have I run ranlib) — I think things changed sometime in the middle of the first decade of the millennium.
Don't object files already come with symbol tables? Wouldn't it seem redundant to add another symbol table to the start of the archive file?
Each object file contains information about what's in that one object file (and information about referenced objects as well as defined ones); the archive index contains much simpler information about which of the many object files in the archive defines each symbol, so that the linker doesn't have to scan each object file in the archive separately.
So, would it be correct to say that the index at the start of the archive is just a giant symbol table which replaces the individual symbol tables in each object file so the linker has an easier job?
Not replaces — augments. It allows the linker to identify which object file(s) to pull into the linked executable. The linker then processes the object files just as it does any other object file, checking for definitions and references, marking newly defined references as satisfied and identifying previously unused references that are not yet defined. But the linker doesn't have to read every file in the archive to work out which symbols are defined by the file — it knows from the index file which ones are defined.
To clarify the index allows the linker to find the specific object file which defines a symbol rather than having to scan every object file to resolve a symbol?
Yes, that’s right.

CMake: multiple targets use the same source file

add_library(target1 funtion.c target1.c )
add_library(target2 funtion.c target2.c )
add_executable(main.out main.c)
target_link_libraries(main.out target1 target2 ${LDFLAGS})
Here is my CMakeLists.txt above.
Both targets need to use the source file function.c. It is able to run though. My concern is that maybe it is not a good behavior for writing CMakeList.txt?

It's totally fine to use the same source file whatever number of times. Sometimes it's even necessary, if you want to compile the same source with different pre-processor/compiler flags.
But if you are concerned with compilation time, you could:
move funtion.c to separate static library and link target1 and target2 libraries against it.
Use object library for function.c and archive output object file to target1 and target2.

Either you have not given enough information in your question or adding function.c to target1 and target2 will not work when you will link them together with main.out because you will have duplicated symbols.
If you are sure they will be no duplicated symbols (for instance because function.c is built with different compilation flags) then your example is correct.

UNIX: Static library linked to a static library [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
How to pack multiple library archives (.a) into one archive file?
I have a situation where I must provide only a single static library (.a file) to an executable file to build it.
However, I split this lib in 2 parts because one part is common to other executable files and the other is needed only by one.
So now I have lib1 (for exe1) and lib2 (for all exes)
The problem is that I can't provide two libs, so I must merge for exe1, lib2 into lib1
I tried my compiling the lib1.o with -llib2 but even if it works, it looks like if nothing happened
Are there any other way? I'm can only think about using raw object files but I don't like this idea

There's no need for two static libraries; when a static library is used, only the functions (or variables) that are needed are copied to the executable - unlike a shared library where everything in the library is accessible to the executable.
Mechanically, the other question referenced describes what you need to do:
Extract all the object files from one library
Add them to the other library
Or:
files=$(ar t lib1.a)
ar x lib1.a
ar r lib2.a $files
rm -f $files lib1.a

You can even compile each source file, produce all .o and create two different libs by using ar.
The whole library will be produced using all .o (the ones you put in lib1.a and lib2.a together), the smaller one will use just a reduced set of .o files.
Than... a single Makefile, .o files produced once, two libraryes coming out from this job: the complete one (libaplus2.a) and the reduced one (lib1.a).

Is it possible to get CMake to build both a static and shared library at the same time?

Same source, all that, just want a static and shared version both. Easy to do?

Yes, it's moderately easy. Just use two "add_library" commands:
add_library(MyLib SHARED source1.c source2.c)
add_library(MyLibStatic STATIC source1.c source2.c)
Even if you have many source files, you can place the list of sources in a Cmake variable, so it's still easy to do.
On Windows you should probably give each library a different name, since there is a ".lib" file for both shared and static. But on Linux and Mac you can even give both libraries the same name (e.g. libMyLib.a and libMyLib.so):
set_target_properties(MyLibStatic PROPERTIES OUTPUT_NAME MyLib)
But I don't recommend giving both the static and dynamic versions of the library the same name. I prefer to use different names because that makes it easier to choose static vs. dynamic linkage on the compile line for tools that link to the library. Usually I choose names like libMyLib.so (shared) and libMyLib_static.a (static). (Those would be the names on linux.)

Since CMake version 2.8.8, you can use "object libraries" to avoid the duplicated compilation of the object files. Using Christopher Bruns' example of a library with two source files:
# list of source files
set(libsrc source1.c source2.c)
# this is the "object library" target: compiles the sources only once
add_library(objlib OBJECT ${libsrc})
# shared libraries need PIC
set_property(TARGET objlib PROPERTY POSITION_INDEPENDENT_CODE 1)
# shared and static libraries built from the same object files
add_library(MyLib_shared SHARED $<TARGET_OBJECTS:objlib>)
add_library(MyLib_static STATIC $<TARGET_OBJECTS:objlib>)
From the CMake docs:
An object library compiles source files but does not archive or link
their object files into a library. Instead other targets created by
add_library() or add_executable() may reference the objects using an
expression of the form $<TARGET_OBJECTS:objlib> as a source, where
objlib is the object library name.
Simply put, the add_library(objlib OBJECT ${libsrc}) command instructs CMake to compile the source files to *.o object files. This collection of *.o files is then referred to as $<TARGET_OBJECT:objlib> in the two add_library(...) commands that invoke the appropriate library creation commands that build the shared and static libraries from the same set of object files. If you have lots of source files, then compiling the *.o files can take quite long; with object libraries you compile them only once.
The price you pay is that the object files must be built as position-independent code because shared libraries need this (static libs don't care). Note that position-independent code may be less efficient, so if you aim for maximal performance then you'd go for static libraries. Furthermore, it is easier to distribute statically linked executables.

There is generally no need to duplicate ADD_LIBRARY calls for your purpose. Just make use of
$> man cmake | grep -A6 '^ *BUILD_SHARED_LIBS$'
BUILD_SHARED_LIBS
Global flag to cause add_library to create shared libraries if on.
If present and true, this will cause all libraries to be built shared unless the library was
explicitly added as a static library. This variable is often added to projects as an OPTION
so that each user of a project can decide if they want to build the project using shared or
static libraries.
while building, first (in one out-of-source directory) with -DBUILD_SHARED_LIBS:BOOL=ON, and with OFF in the other.

Please be aware that previous answers won't work with MSVC:
add_library(test SHARED ${SOURCES})
add_library(testStatic STATIC ${SOURCES})
set_target_properties(testStatic PROPERTIES OUTPUT_NAME test)
CMake will create test.dll together with test.lib and test.exp for shared target. Than it will create test.lib in the same directory for static target and replace previous one. If you will try to link some executable with shared target it will fail with error like:
error LNK2001: unresolved external symbol __impl_*.`.
Please use ARCHIVE_OUTPUT_DIRECTORY and use some unique output directory for static target:
add_library(test SHARED ${SOURCES})
add_library(testStatic STATIC ${SOURCES})
set_target_properties(
testStatic PROPERTIES
OUTPUT_NAME test
ARCHIVE_OUTPUT_DIRECTORY testStatic
)
test.lib will be created in testStatic directory and won't override test.lib from test target. It works perfect with MSVC.

It's possible to pack eveything in the same compilation breath, as suggested in the previous answers, but I would advise against it, because in the end it's a hack that works only for simple projects. For example, you may need at some point different flags for different versions of the library (esp. on Windows where flags are typically used to switch between exporting symbols or not). Or as mentionned above, you may want to put .lib files into different directories depending on whether they correspond to static or shared libraries. Each of those hurdles will require a new hack.
It may be obvious, but one alternative that has not been mentionned previously is to make the type of the library a parameter:
set( ${PROJECT_NAME}_LIBTYPE CACHE STRING "library type" )
set_property( CACHE ${PROJECT_NAME}_LIBTYPE PROPERTY STRINGS "SHARED;STATIC" )
add_library( ${PROJECT_NAME} ${PROJECT_NAME}_LIBTYPE ${SOURCE_FILES} )
Having shared and static versions of the library in two different binary trees makes it easier to handle different compilation options. I don't see any serious drawback in keeping compilation trees distinct, especially if your compilations are automated.
Note that even if you intend to mutualize compilations using an intermediate OBJECT library (with the caveats mentionned above, so you need a compelling reason to do so), you could still have end libraries put in two different projects.

What exactly does "ar" utility do?

I don't really understand what ar utility does on Unix systems.
I know it can be somehow used for creating c libraries, but all that man page tells me is that it is used to make archives from files, which sounds similar to, for example, tar....

The primary purpose is to take individual object files (*.o) and bundle them together into a static library file (*.a). The .a file contains an index that allows the linker to quickly locate symbols in the library.
Tar doesn't create files that linkers understand.

ar is a general purpose archiver, just like tar. It just "happens" to be used mostly for creating static library archives, one of its traditional uses, but you can still use it for general purpose archiving, though tar would probably be a better choice. ar is also used for Debian .deb packages.

Exactly, ar is an archiver. It simply takes a set of object files (*.o) and put them in an archive that you call a static library.

It takes code in the form of object files (.obj, .o, etc) and makes a static library (archive). The library can then be included when linking with ld to include the object code into your executable.
Take a look at the example usage in the Wikipedia article.

You might want to run man ar to get the full picture. Here's a copy of that on the web.
To quote:
The GNU ar program creates, modifies, and extracts from archives. An
archive is a single file holding a collection of other files in a
structure that makes it possible to retrieve the original individual
files (called members of the archive).
ar is considered a binary utility because archives of this sort are
most often used as libraries holding commonly needed subroutines.

ar is specifically for archives (or libraries) of object code; tar is for archives of arbitrary files. Anybody's guess why GNU refers to these as 'archives', in other environments this utility is called the 'librarian', and the resulting files just libraries.