discover static library dependencies - c

I know that I can discover the header file dependencies required when building an object file using a few tools (such as gcc -MD ...)
Is there a similar way to determine the static libraries that will be used when a component is linked?
In particular I am looking at some multi-level make files with lots of indirection and I would like to just be able to get a list of the depedencies for that build so I can streamline my build system's rebuild requests.
ex:
make foo.mak
foo.mak
OBJS = bar.o \
bar2.o
DEPS = core\
msg\
utils\
EXTRA_FLAGS += -Wall -Werror
include ../common/common.mak
within common.mak
the members of DEPS will be expanded in various ways depending opn what type of build this is. they may be static, shared or even kernel libraires and they may get pre- or post- fixes.
I would want to get
ABC_core_DEF.a
GEH_msg_IJK.a
(assuming that core and msg were the only dependencies to have expanded to actual static includes and that the pre and post fixes were as shown.)

If your build system supports a mode where the compilation commands are shown (e.g., some setting like VERBOSE=1), you could try and grep this output for items looking like -l (or whatever other kind of linker options your target toolchain uses).

Related

Tracking compiler flags within a Shake Action

I need to track compiler flags used as part of a rule, e.g when supplied as arguments to a function. Does Shake have a way to track such flags as inputs in the same vein as need? As a trivial example, I would like for Shake to rebuild all .o files when the rule changes to pass -O2 to the C compiler instead of -O0.
There are two approaches to tracking things like whether to use optimisation or not.
1) Use Oracles
Oracles very closely match what you are asking for. To track something like
-O0 vs -O2 you'd need an oracle that tracks optimisation level:
newtype OptLevel = OptLevel ()
deriving (Show,Typeable,Eq,Hashable,Binary,NFData)
type instance RuleResult OptLevel = String
rules = do
addOracle $ \(OptLevel _) -> return $
if <whatever you use to decide> then "-O0" else "-O2"
"foo.o" %> \_ -> do
level <- askOracle $ OptLevel ()
cmd "gcc" level ...
Now the optimisation level is a tracked dependency which will update if anything changes. This example is based on the docs for addOracle.
2) Use different output files
For compiler flags another approach is to use different build directories, i.e. so build/opt (and build/opt/obj etc) has binaries and .o files built with -O, build/debug without, build/profile with profiling flags, and build/test with test flags. Some others, like build/doc and build/hsc, have generated files that aren't dependent on compiler flags.
The advantage of this approach is that you can keep all the files cached at once, and refreshing the debug or test build doesn't destroy the opt one.
The disadvantage is that it's just for a set of hardcoded
flags. But it's also not hard to add a new mode, you just need a new
(directory, flags) pair for it.

Statically linking libclang in C code

I'm trying to write a simple syntax checker for C code using the frontend available in libclang. Due to deployment concerns, I need to be able to statically link all the libraries in libclang, and not pass around the .so file that has all the libraries.
I'm building clang/llvm from source, and in llvm/Release+Asserts/lib I have a bunch of .a files that I think I should be able to use, but it never seems to work (the linker spews out thousands of errors about missing symbols). However, when I compile it using the libclang.so also present in that directory as follows:
clang main.c -o bin/dlc -I../llvm/tools/clang/include -L../llvm/Release+Asserts/lib/ -lclang
Everything seems to work well.
What is the minimum set of .a files I need to include to make this work? I've tried including absolutely all of the .a files in the build output directory, with them provided to clang/gcc in different orders, without any success. I only need the functions mentioned in libclang's Index.h, but there don't seem to be any resources or documentation on what the various libclang*.a files are for. It would be very helpful to know which files libclang.so pulls in.
The following is supposed to work, as long the whole project has all static libraries (I counted 116 in my Release/lib directory).
clang main.c -o bin/dlc -I../llvm/tools/clang/include ../llvm/Release/lib/*.a
[edit: clang main.c -o bin/dlc -I../llvm/tools/clang/include ../llvm/Release/lib/libclang.a ../llvm/Release/lib/*.a]
Note that the output binary is not static, so you don't need any -static flag for gcc or ld, if you're using this syntax.
If that doesn't work you might need to list the libraries in order: if some library requires a function available in another library, then it may be necessary to list it first in the command line. See comments about link order at:
http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Link-Options.html#Link-Options

Is using --start-group and --end-group when linking faster than creating a static library?

If one builds static libraries in one's build scripts and one wants to use those static libraries in linking the final executable, the order one mentions the .a files is important:
g++ main.o hw.a gui.a -o executable
If gui.a uses something defined in hw.a the link will fail, because at the time hw.a is processed, the linker doesn't yet know that the definition is needed later, and doesn't include it in the being.generated executable. Manually fiddling around with the linker line is not practical, so a solution is to use --start-group and --end-group which makes the linker run twice through the libraries until no undefined symbols are found anymore.
g++ main.o -Wl,--start-group hw.a gui.a -Wl,--end-group -o executable
However the GNU ld manual says
Using this option has a significant performance cost. It is best to use it only when there are unavoidable circular references between two or more archives.
So I thought that it may be better to take all .a files and put them together into one .a file with an index (-s option of GNU ar) which says in what order the files need to be linked. Then one gives only that one .a file to g++.
But I wonder whether that's faster or slower than using the group commands. And are there any problems with that approach? I also wonder, is there better way to solve these interdependency problems?
EDIT: I've written a program that takes a list of .a files and generates a merged .a file. Works with the GNU common ar format. Packing together all static libs of LLVM works like this
$ ./arcat -o combined.a ~/usr/llvm/lib/libLLVM*.a
I compared the speed against unpacking all .a files manually and then putting them into a new .a file using ar, recomputing the index. Using my arcat tool, I get consistent runtimes around 500ms. Using the manual way, time varies greatly, and takes around 2s. So I think it's worth it.
Code is here. I put it into the public domain :)
You can determine the order using the lorder and tsort utilities, for example
libs='/usr/lib/libncurses.a /usr/lib/libedit.a'
libs_ordered=$(lorder $libs | tsort)
resulting in /usr/lib/libedit.a /usr/lib/libncurses.a because libedit depends on libncurses.
This is probably only a benefit above --start-group if you do not run lorder and tsort again for each link command. Also, it does not allow mutual/cyclic dependencies like --start-group does.
Is there a third option where you just build a single library to begin with? I had a similar problem and I eventually decided to go with the third option.
In my experience, group is slower than just unifying the .a files. You can extract all files from the archive, then create a new .a file from from the smaller files
However, you have to be careful about a circumstance where both files happen to contain the same definition (you can explicitly check for this by using nm to see what definitions are contained in each library)

How to force use of static library over shared?

In my SConscript I have the following line:
Program("xtest", Split("main.cpp"), LIBS="mylib fltk Xft Xinerama Xext X11 m")
How do I get scons to use mylib.a instead of mylib.so, while linking dynamically with the other libraries?
EDIT: Looking to use as few platform specific hacks as possible.
Passing the full filepath wrapped in a File node will force static linking. For example:
lib = File('/usr/lib/libfoo.a')
Program('bar', 'main.c', LIBS = [lib])
Will produce the following linker command line
g++ -o bar main.o /usr/lib/libfoo.a
Notice how the "-l" flag is not passed to the linker for this LIBS entry. This effectively forces static linking. The alternative is to modify LINKFLAGS to get what you want with the caveat that you are bypassing the library dependency scanner -- the status of the library will not be checked for rebuilds.
To make this platform independent you append the env['SHLIBSUFFIX'] onto the library you want to use. env['SHLIBSUFFIX'] gives you this environments suffix for shared libraries.
You also have the ['SHLIBPREFIX'], ['LIBPREFIX'], ['LIBSUFFIX'] and ['PROGSUFFIX'], all useful for situations like this.
Edit:
I obviously haven't made myself understood, so I will clarify.
The return value of these lookups are strings to the pre/suffixes that platform uses. In that way you can refer to the file you need on each platform. Note that you cannot use it as a pure string, it has to be embedded as a file node as BennyG suggests. Working with nodes are anyway the best solution as file nodes are much more versatile than a string.
Hope this helps.

Is it possible to get CMake to build both a static and shared library at the same time?

Same source, all that, just want a static and shared version both. Easy to do?
Yes, it's moderately easy. Just use two "add_library" commands:
add_library(MyLib SHARED source1.c source2.c)
add_library(MyLibStatic STATIC source1.c source2.c)
Even if you have many source files, you can place the list of sources in a Cmake variable, so it's still easy to do.
On Windows you should probably give each library a different name, since there is a ".lib" file for both shared and static. But on Linux and Mac you can even give both libraries the same name (e.g. libMyLib.a and libMyLib.so):
set_target_properties(MyLibStatic PROPERTIES OUTPUT_NAME MyLib)
But I don't recommend giving both the static and dynamic versions of the library the same name. I prefer to use different names because that makes it easier to choose static vs. dynamic linkage on the compile line for tools that link to the library. Usually I choose names like libMyLib.so (shared) and libMyLib_static.a (static). (Those would be the names on linux.)
Since CMake version 2.8.8, you can use "object libraries" to avoid the duplicated compilation of the object files. Using Christopher Bruns' example of a library with two source files:
# list of source files
set(libsrc source1.c source2.c)
# this is the "object library" target: compiles the sources only once
add_library(objlib OBJECT ${libsrc})
# shared libraries need PIC
set_property(TARGET objlib PROPERTY POSITION_INDEPENDENT_CODE 1)
# shared and static libraries built from the same object files
add_library(MyLib_shared SHARED $<TARGET_OBJECTS:objlib>)
add_library(MyLib_static STATIC $<TARGET_OBJECTS:objlib>)
From the CMake docs:
An object library compiles source files but does not archive or link
their object files into a library. Instead other targets created by
add_library() or add_executable() may reference the objects using an
expression of the form $<TARGET_OBJECTS:objlib> as a source, where
objlib is the object library name.
Simply put, the add_library(objlib OBJECT ${libsrc}) command instructs CMake to compile the source files to *.o object files. This collection of *.o files is then referred to as $<TARGET_OBJECT:objlib> in the two add_library(...) commands that invoke the appropriate library creation commands that build the shared and static libraries from the same set of object files. If you have lots of source files, then compiling the *.o files can take quite long; with object libraries you compile them only once.
The price you pay is that the object files must be built as position-independent code because shared libraries need this (static libs don't care). Note that position-independent code may be less efficient, so if you aim for maximal performance then you'd go for static libraries. Furthermore, it is easier to distribute statically linked executables.
There is generally no need to duplicate ADD_LIBRARY calls for your purpose. Just make use of
$> man cmake | grep -A6 '^ *BUILD_SHARED_LIBS$'
BUILD_SHARED_LIBS
Global flag to cause add_library to create shared libraries if on.
If present and true, this will cause all libraries to be built shared unless the library was
explicitly added as a static library. This variable is often added to projects as an OPTION
so that each user of a project can decide if they want to build the project using shared or
static libraries.
while building, first (in one out-of-source directory) with -DBUILD_SHARED_LIBS:BOOL=ON, and with OFF in the other.
Please be aware that previous answers won't work with MSVC:
add_library(test SHARED ${SOURCES})
add_library(testStatic STATIC ${SOURCES})
set_target_properties(testStatic PROPERTIES OUTPUT_NAME test)
CMake will create test.dll together with test.lib and test.exp for shared target. Than it will create test.lib in the same directory for static target and replace previous one. If you will try to link some executable with shared target it will fail with error like:
error LNK2001: unresolved external symbol __impl_*.`.
Please use ARCHIVE_OUTPUT_DIRECTORY and use some unique output directory for static target:
add_library(test SHARED ${SOURCES})
add_library(testStatic STATIC ${SOURCES})
set_target_properties(
testStatic PROPERTIES
OUTPUT_NAME test
ARCHIVE_OUTPUT_DIRECTORY testStatic
)
test.lib will be created in testStatic directory and won't override test.lib from test target. It works perfect with MSVC.
It's possible to pack eveything in the same compilation breath, as suggested in the previous answers, but I would advise against it, because in the end it's a hack that works only for simple projects. For example, you may need at some point different flags for different versions of the library (esp. on Windows where flags are typically used to switch between exporting symbols or not). Or as mentionned above, you may want to put .lib files into different directories depending on whether they correspond to static or shared libraries. Each of those hurdles will require a new hack.
It may be obvious, but one alternative that has not been mentionned previously is to make the type of the library a parameter:
set( ${PROJECT_NAME}_LIBTYPE CACHE STRING "library type" )
set_property( CACHE ${PROJECT_NAME}_LIBTYPE PROPERTY STRINGS "SHARED;STATIC" )
add_library( ${PROJECT_NAME} ${PROJECT_NAME}_LIBTYPE ${SOURCE_FILES} )
Having shared and static versions of the library in two different binary trees makes it easier to handle different compilation options. I don't see any serious drawback in keeping compilation trees distinct, especially if your compilations are automated.
Note that even if you intend to mutualize compilations using an intermediate OBJECT library (with the caveats mentionned above, so you need a compelling reason to do so), you could still have end libraries put in two different projects.

Resources