Do Dynamic Libraries (*.so) always result in a smaller binary? - c

I've been experimenting with static and dynamic libraries using GCC on GNU/Linux. My understanding is that static libraries are directly inserted into the executable, while dynamic libraries remain separate and are loaded later. One benefit of this is that the resulting binaries end up smaller.
My problem is I'm not seeing much of a reduction in size when compiling with a dynamic library:
When compiling with the dynamic library, I expected the final binary to be closer to the size of its corresponding object file (useStringLib.o), yet it looks almost the same size as the binary compiled with the static library.
Note: Both libstringfun.a and libstringfun.so use the same code (a bunch of custom string functions I wrote). In case it isn't clear, useString is the binary that calls into the libraries.
To build the files for the dynamic library I used these commands:
gcc -fPIC -c lib_stringfun.c
gcc lib_stringfun.o -shared -o libstringfun.so
To link the dynamic library with the binary I used this command:
gcc -o useString useStringLib.o -L. -lstringfun

Related

Clang: compile IR, C files and apply opt in one line

I'm building an IR-level pass for LLVM which instruments functions with calls to my runtime library.
So far I have used the following lines to compile any C file with my pass, link it with the runtime library, and guarantee that the runtime library function calls are inlined.
Compiling source to IR...
clang -S -emit-llvm example.c -o example-codeIR.ll -I ../runtime
Running Pass with opt...
opt -load=../build/PSS/libPSSPass.so -PSSPass -overwrite -always-inline -S -o example-codeOpt.ll example-codeIR.ll
Linking IR with runtime library...
llvm-link -o example-linked.bc example-codeOpt.ll ../runtime/obj/PSSutils.ll
Compiling bitcode to binary...
clang -ldl -O3 -o example example-linked.bc ../initializer/so/shim.so
Now I would like to test my pass with the LLVM test suite, and the only thing I can do there is pass flags to it. I can't control the individual compilation steps or generate that many intermediate files for each test case.
Is there a way to do the same as above without having to save intermediate files and yet keep the order of the steps?
I have tried the following:
clang -ldl -Xclang -load -Xclang ../build/PSS/libPSSPass.so ../initializer/so/shim.so ../runtime/obj/PSSutils.ll $<
But I ran into the problem that I can't compile both IR and .c files.
If I compile the runtime library to be an object file the functions in it will not get inlined anymore which is the main goal of the above steps.
So, to answer my own question:
First of all, calls into shared objects are never inlined. Hence, the shared objects mentioned above should be compiled to object files instead. The -flto=thin flag should be used when compiling those objects so that a summary of their functions is built and the linker can perform link-time optimization.
In the final step, the target also needs to be compiled with the -flto=thin flag, and the compiler will do the rest for you.

How to write common functions for reusing in C

I was trying to write a common function that other files could reuse. As an example, I have three files:
The first file: cat test1.h
void say();
The second file: cat test1.c
void say(){
printf("This is c example!");
}
The third file: cat test2.c
#include "test1.h"
void main(){
say();
}
but when I ran: gcc -g -o test2 test2.c
it threw error as:
undefined reference to `say'
Additionally, I know this would work: gcc -g -o test2 test1.c test2.c
but I don't want to do this, because another team will use the server and I want them to use my compiled binary code directly, not the source code. I'd like it to work the way printf() does, where we only need to include a header.
You can build yourself a library from the object files containing your useful functions, and store the header(s) that describe them in a convenient location. You and your colleagues then compile with the headers and link that library with any executables that use any of those functions. That's very much the same general mechanism that the C compiler uses to include the standard headers and automatically link with the standard C library.
The mechanics vary a bit depending on platform (Windows vs Unix being the primary distinction, though there are differences between Unix platforms too), and also on the type of library (static archive vs dynamic linked / loaded libraries — also known as shared objects or shared libraries).
In broad outline, for a Unix system with a static library, you'd:
Compile library object files libfile1.o, libfile2.o, … using (for example) gcc -c libfile1.c libfile2.c.
Create an archive from the object files — using for example ar r libname.a libfile1.o libfile2.o.
Copy the headers to a standard location such as /usr/local/include.
Copy the library to a standard location such as /usr/local/lib.
You'd compile any code that uses the library functions with -I/usr/local/include (if that is not already a standard compilation option).
You'd link the programs with -L/usr/local/lib -lname (you might not need to specify -L… but you would need to specify -lname).
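As a rough sketch of what the header and one library source file from the steps above might contain, here is a version based on the say() function from the question. The file names say.h and say.c, the include guard SAY_H, and the helper say_message() are hypothetical names chosen for illustration. The two files are shown in a single block for brevity.

```c
/* ---- say.h (hypothetical header; would be copied to e.g. /usr/local/include) ---- */
#ifndef SAY_H
#define SAY_H

/* Declarations only: they describe the functions to the compiler,
 * but the code itself lives in the library. */
const char *say_message(void);
void say(void);

#endif /* SAY_H */

/* ---- say.c (hypothetical source; compiled with gcc -c, then archived with ar) ---- */
#include <stdio.h>

/* Definitions: this is what ends up inside the library archive. */
const char *say_message(void)
{
    return "This is c example!";
}

void say(void)
{
    puts(say_message());
}
```

A program using the library would then only need `#include "say.h"` and a call such as say(), and would be linked with the -L and -l options from the last step.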
Including a header file does not make a function available. It simply informs the compiler that the function will be provided at a later time.
You should compile the file with the function into a shareable object file (or a library if there is more than one function that you want to share). Mind the switch -c which tells gcc not to build an executable file:
gcc -o test1.o test1.c -c
Similarly, compile the main function into its own object file. Now you or anyone else can link the object file with their main program:
gcc -o test2 test2.o test1.o
The process can be automated using make.
Other programmers can use compiled object files (*.o) in their programs. They need only to have a header file with function prototypes, extern data declarations and type definitions.
You can also wrap many object files into the library.
On many systems you can also create the dynamic linked libraries which do not have to be linked into the executable.
You also need to compile test1.c:
gcc -g -o test2 test1.c test2.c

Create non-PIC shared libraries with ld

I have a bunch of object files that have been compiled without the -fPIC option. So the calls to the functions do not use #PLT. (source code is C and is compiled with clang).
I want to link these object files into a shared library that I can load at runtime using dlopen. I need to do this because I have to do a lot of setup before the actual .so is loaded.
But every time I try to link with the -shared option, I get the error -
relocation R_X86_64_PC32 against symbol splay_tree_lookup can not be used when making a shared object; recompile with -fPIC
I have no issues recompiling from source. But I don't want to use -fPIC. This is part of a research project where we are working on a custom compiler. PIC wouldn't work for the type of guarantees we are trying to provide in the compiler.
Is there some flag I can use with ld so that it generates libraries that are relocated at load time? In fact, I am okay with no relocations at all: I can provide a base address for the library, and dlopen can fail if the virtual address is not available.
The commands I am using to compile my C files are equivalent to:
clang -m64 -c foo.c
and for linking I am using
clang -m64 -shared *.o -o foo.so
I say equivalent because it is a custom compiler (forked off clang) and has some extra steps. But it is equivalent.
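For context, the runtime-loading step described above might look roughly like the sketch below. The helper name load_symbol() and its out-parameter are assumptions made for illustration; dlopen, dlsym, dlerror and dlclose are the standard <dlfcn.h> calls. On older glibc versions the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open the shared library at `path` and look up `symbol`.
 * On success, returns the symbol's address and stores the library handle in
 * *handle_out (to be passed to dlclose() later). On failure, returns NULL and
 * sets *handle_out to NULL. */
void *load_symbol(const char *path, const char *symbol, void **handle_out)
{
    *handle_out = NULL;

    /* RTLD_NOW resolves all undefined symbols up front, so problems show up here. */
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *addr = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return addr;
}
```

Any setup work would run before the call, for example before `load_symbol("./foo.so", "splay_tree_lookup", &handle)`, and the returned address would be cast to the appropriate function pointer type.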
It is not possible to dynamically load your existing non PIC objects with the expectation of it working without problems.
If you cannot recompile the original code to create a proper shared library that supports PIC, then I suggest you create a service executable that links to a static library composed of those objects. The service executable can then provide IPC/RPC/REST API/shared memory/whatever to allow your object code to be used by your program.
Then, you can author a shared library which is compiled with PIC that provides wrapper APIs that launches and communicates with the service executable to perform the actual work.
On further thought, this wrapper API library may as well be static. The dynamic aspect of it is performed by launching the service executable.
Recompiling the library's object files with the -fpic -shared options would be the best option, if this is possible!
man ld says:
-i Perform an incremental link (same as option -r).
-r
--relocatable
Generate relocatable output---i.e., generate an output file that can in turn serve as input to ld. This is often called partial linking. As a side effect, in environments that support standard Unix magic numbers, this option also sets the output file’s magic number to "OMAGIC". If this option is not specified, an absolute file is produced. When linking C++ programs, this option will not resolve references to constructors; to do that, use -Ur.
When an input file does not have the same format as the output file, partial linking is only supported if that input file does not contain any relocations. Different output formats can have further restrictions; for example some "a.out"-based formats do not support partial linking with input files in other formats at all.
I believe you can partially link your library object files into a relocatable (PIC) library, then link that library with your source code object file to make a shared library.
ld -r -o libfoo.so *.o
cp libfoo.so /foodir/libfoo.so
cd foodir
clang -m32 -fpic -c foo.c
clang -m32 -fpic -shared *.o -o foo.so
Regarding library base address:
(Again from man ld)
--section-start=sectionname=org
Locate a section in the output file at the absolute address given by org. You may use this option as many times as necessary to locate multiple sections in the command line. org must be a single hexadecimal integer; for compatibility with other linkers, you may omit the leading 0x usually associated with hexadecimal values. Note: there should be no white space between sectionname, the equals sign ("="), and org.
You could perhaps move your library's .text section?
--image-base value
Use value as the base address of your program or dll. This is the lowest memory location that will be used when your program or dll is loaded. To reduce the need to relocate and improve performance of your dlls, each should have a unique base address and not overlap any other dlls. The default is 0x400000 for executables, and 0x10000000 for dlls. [This option is specific to the i386 PE targeted port of the linker]

C: trouble statically linking binary that uses openssl AES-256 encryption

I'm trying to produce a static binary that can be run on a generic Linux machine (same machine architecture, same bitness, same endianness and a compatible kernel system call interface). When I pass -static to gcc I get the following warning:
# gcc -static testme.c -lssl -lcrypto -ldl -lltdl -static-libgcc
/usr/lib/gcc/i486-slackware-linux/4.8.2/../../../libcrypto.a(dso_dlfcn.o): In function `dlfcn_globallookup':
dso_dlfcn.c:(.text+0x21): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
#
Although the binary runs on some of the systems I've tried it on, I suppose the warning means my static binary will not run on a system with a different glibc version.
I included -ldl and -lltdl along with the other libs to be linked, in an attempt to eliminate the warning, but I still get the same result. What am I doing wrong, and how can I get rid of the warning?
The actual code can be seen on the obash testing branch on GitHub.
Once you produce obash, it should then be used to produce a reusable static binary by calling it with the "-r" flag.
Since it's OpenSSL that pulls dlopen into the equation, is there an alternative way of doing acceptably safe symmetric encryption that does not use dlopen and is safe to use even when statically linked?

Statically linking against LAPACK

I'm attempting to do a release of some software and am currently working through a script for the build process. I'm stuck on something I never thought I would be: statically linking LAPACK on x86_64 Linux. During configuration AC_SEARCH_LIB([main],[lapack]) works, but compilation of the LAPACK units does not, for example undefined reference to 'dsyev_' (no LAPACK/BLAS routine goes unnoticed).
I've confirmed I have the libraries installed and even compiled them myself with the appropriate options to make them static with the same results.
Here is an example I had used in my first experience with LAPACK a few years ago that works dynamically, but not statically: http://pastebin.com/cMm3wcwF
The two methods I'm using to compile are the following,
gcc -llapack -o eigen eigen.c
gcc -static -llapack -o eigen eigen.c
Your linking order is wrong. Link libraries after the code that requires them, not before. Like this:
gcc -o eigen eigen.c -llapack
gcc -static -o eigen eigen.c -llapack
That should resolve the linkage problems.
To answer the subsequent question of why this works, the GNU ld documentation says this:
It makes a difference where in the command you write this option; the
linker searches and processes libraries and object files in the order
they are specified. Thus, `foo.o -lz bar.o' searches library `z' after
file foo.o but before bar.o. If bar.o refers to functions in `z',
those functions may not be loaded.
........
Normally the files found this way are library files—archive files
whose members are object files. The linker handles an archive file by
scanning through it for members which define symbols that have so far
been referenced but not defined. But if the file that is found is an
ordinary object file, it is linked in the usual fashion.
That is, the linker makes one pass through each file looking for unresolved symbols, and it processes files in the order you provide them (i.e. "left to right"). If nothing read so far needs a symbol when the library that defines it is scanned, the linker will not go back to that library later to satisfy the dependency. Every object in the link list is parsed only once.
Note also that GNU ld can do reordering in cases where circular dependencies are detected when linking shared libraries or object files. But static libraries are only parsed for unknown symbols once.
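The left-to-right behaviour can be illustrated with a toy model in C. This is only a sketch under simplifying assumptions, not how GNU ld is implemented: each input defines at most one symbol and references at most one, and an archive member is pulled in only if it defines a symbol that is undefined at that point. The names struct input and unresolved_after_link() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Toy input file: defines at most one symbol and references at most one. */
struct input {
    bool is_archive;      /* true: static library member, linked only on demand */
    const char *defines;  /* symbol this input defines, or NULL */
    const char *needs;    /* symbol this input references, or NULL */
};

static bool contains(const char *const *set, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], sym) == 0)
            return true;
    return false;
}

/* Simulates one left-to-right pass over the inputs and returns how many
 * symbols are still undefined at the end. */
size_t unresolved_after_link(const struct input *in, size_t n)
{
    const char *defined[MAX_SYMS];
    const char *undefined[MAX_SYMS];
    size_t ndef = 0, nundef = 0;

    for (size_t i = 0; i < n; i++) {
        /* An archive member is skipped unless something seen earlier needs it. */
        if (in[i].is_archive &&
            !(in[i].defines && contains(undefined, nundef, in[i].defines)))
            continue;

        if (in[i].defines && ndef < MAX_SYMS) {
            defined[ndef++] = in[i].defines;
            /* The symbol is now resolved: drop it from the undefined set. */
            for (size_t j = 0; j < nundef; j++) {
                if (strcmp(undefined[j], in[i].defines) == 0) {
                    undefined[j] = undefined[--nundef];
                    break;
                }
            }
        }

        if (in[i].needs &&
            !contains(defined, ndef, in[i].needs) &&
            !contains(undefined, nundef, in[i].needs) &&
            nundef < MAX_SYMS)
            undefined[nundef++] = in[i].needs;
    }
    return nundef;
}
```

In this model, placing the archive that defines dsyev_ before the object that references it leaves one unresolved symbol, while the reverse order leaves none. This mirrors the difference between `gcc -llapack -o eigen eigen.c` and `gcc -o eigen eigen.c -llapack`.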

Resources