What is selective linking in GCC? - c

In this article, I found this line: The GNU linker uses selective linking, which keeps other unreferenced functions out of the linker’s output image.
I am not sure what this means exactly. But what I think is, if I include stdio.h in my source code, and use only printf from that, then the resulting exe contains only code for printf extracted from stdio.c and other functions defined in that file are discarded.
Is what I say correct? If not, what does selective linking mean? Also, in the above case, does the compiler include the entire file, or only the used functions?

The GNU linker uses selective linking, which keeps other unreferenced functions out of the linker’s output image.
That only applies when linking .a files. A .a file is a collection of .o files. That selective linking means it links in only a function rather than the entire .o where that function resides.
if I include stdio.h in my source code, and use only printf from that, then the resulting exe contains only code for printf extracted from stdio.c and other functions defined in that file are discarded.
C standard library functions normally reside in libc.so, unless you explicitly link statically. So, either you link libc.a and that copies printf function into your executable (selective linking), or you link libc.so and no copy of printf is made.
stdio.c is only ever used to build libc.a and libc.so.
As @JohannesSchaub-litb mentions in the comment:
-ffunction-sections is needed when compiling .o files to take advantage of selective linking linker feature:
Together with a linker garbage collection (linker --gc-sections option) these options may lead to smaller statically-linked executables (after stripping).
--gc-sections linker option enables selective linking:
--gc-sections decides which input sections are used by examining symbols and relocations. The section containing the entry symbol and all sections containing symbols undefined on the command-line will be kept, as will sections containing symbols referenced by dynamic objects. Note that when building shared libraries, the linker must assume that any visible symbol is referenced.
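As a minimal sketch of what these two options act on, consider a source file with one function the program calls and one that nothing references. The file name, the function names and the build commands in the comment are assumptions chosen for illustration; only -ffunction-sections and --gc-sections come from the text above.

```c
/* shapes.c: hypothetical file, names chosen for illustration only.
 *
 * Assumed build steps (GCC with GNU ld, main.o provided elsewhere):
 *   gcc -ffunction-sections -c shapes.c
 *   gcc -Wl,--gc-sections -o app main.o shapes.o
 *
 * With -ffunction-sections each function below is placed in its own
 * input section (.text.square_area, .text.cube_volume). If nothing
 * reachable from the entry symbol references cube_volume, --gc-sections
 * is then free to drop its section. Without -ffunction-sections both
 * functions share one .text section, which is kept or dropped as a whole.
 */

/* Assumed to be called from main.o. */
int square_area(int side)
{
    return side * side;
}

/* Assumed to be referenced by nobody. */
int cube_volume(int side)
{
    return side * side * side;
}
```

Running nm on the linked output is one way to check whether cube_volume survived the link.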

Related

Undefined reference to function of another lib

Yeah, I know many people asked that question before, but I still can't understand the problem in my case
I have 2 libs, let's say liba & libb. libb uses liba but is compiled in .a so it should link at compile time.
I have the following GCC command:
gcc -o my_program obj/mymain.o obj/myutils.o liba/liba.a libb/libb.a -Iinclude -Iliba -Ilibb
But GCC is returning me a lot of "Undefined reference to ..." from libb functions to liba functions.
What is happening? What should I do?
Thank you
The order of the arguments on a link command line is very important.
When the linker sees .o files, they are added to the target binary unconditionally, so all .o files are present. That leaves a list of undefined symbols which need to be found.
The next stage is to look through the libraries. Each library is searched in turn, and every .o member that satisfies a currently undefined reference is added to the target binary. That resolves some references, but the newly added member may have undefined references of its own, so adding part of a library can add to the list of symbols that still need to be satisfied.
When a library requires another library, it needs to be specified after whatever requires it, and before the libraries which satisfy its requirements. In your case libb uses liba, so libb/libb.a has to come before liba/liba.a on the command line.
If the .o files happen to require the same parts of a library, a wrong order can go unnoticed. The issue then crops up when code is deleted from a .o, because that removes the reference which was pulling in the library part.
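The search order described above can be sketched as a toy model. This is not how GNU ld is implemented; it only mimics the single left-to-right pass, with symbols reduced to bits. All type and function names (struct object, struct input, toy_link, and so on) are hypothetical, and the sample data assumes that mymain.o calls one function in libb, which in turn calls one function in liba.

```c
#include <stddef.h>

/* Symbols are modelled as bits; the two names mirror the question. */
enum { SYM_A_FUNC = 1, SYM_B_FUNC = 2 };

/* One object file: which symbols it defines and which it references. */
struct object { unsigned defs; unsigned refs; };

/* One command-line input: a plain .o (is_archive == 0) or a .a archive
 * whose members are only pulled in on demand. At most 32 members. */
struct input {
    int is_archive;
    const struct object *members;
    size_t count;
};

/* Single left-to-right pass. Returns the symbols that are still
 * undefined at the end; 0 means the toy link succeeds. */
unsigned toy_link(const struct input *inputs, size_t n)
{
    unsigned defined = 0, referenced = 0;

    for (size_t i = 0; i < n; i++) {
        const struct input *in = &inputs[i];
        unsigned taken = 0;      /* members of this input already linked */
        int progress = 1;

        while (progress) {       /* rescan until nothing new is pulled in */
            progress = 0;
            for (size_t m = 0; m < in->count; m++) {
                const struct object *o = &in->members[m];
                unsigned undefined = referenced & ~defined;

                if (taken & (1u << m))
                    continue;
                if (in->is_archive && !(o->defs & undefined))
                    continue;    /* archive member nobody asked for yet */
                defined |= o->defs;
                referenced |= o->refs;
                taken |= 1u << m;
                progress = 1;
            }
        }
    }
    return referenced & ~defined;
}

/* mymain.o calls into libb; libb's only member calls into liba. */
static const struct object main_o = { 0, SYM_B_FUNC };
static const struct object liba_members[] = { { SYM_A_FUNC, 0 } };
static const struct object libb_members[] = { { SYM_B_FUNC, SYM_A_FUNC } };

/* Order from the question: mymain.o liba.a libb.a */
unsigned link_liba_first(void)
{
    const struct input order[] = {
        { 0, &main_o, 1 }, { 1, liba_members, 1 }, { 1, libb_members, 1 },
    };
    return toy_link(order, 3);
}

/* Swapped order: mymain.o libb.a liba.a */
unsigned link_libb_first(void)
{
    const struct input order[] = {
        { 0, &main_o, 1 }, { 1, libb_members, 1 }, { 1, liba_members, 1 },
    };
    return toy_link(order, 3);
}
```

In this model link_liba_first() returns SYM_A_FUNC, because liba.a is scanned before anything has asked for its symbol, while link_libb_first() returns 0.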

Static vs Dynamic Linking

I'm trying to understand what the ELF looks like for a statically vs. a dynamically linked program.
I understand that this is how static linking works:
In my case, I have two files, foo.c and bar.c.
I also have their object files; foo.o and bar.o.
With the objdump command, I can see the relocations in each file.
How do I statically link the foo.o and bar.o?
How do I dynamically link the foo.o and bar.o?
How can I see the difference in the output files?
Linking dynamically is the default mode of most linkers these days. If you want to link statically you have to use the -static flag when linking. To clarify, when I say "linking dynamically" versus "linking statically" I mean the linking with external libraries, and not generating a library that in turn can be linked (dynamically or statically).
The difference can't be seen in the object files you pass to the linker, as it has nothing to do with the compiler or object-file generation. The result can only be seen in the executable program after linking, and the biggest difference is that the executable will most likely be larger.
The fully linked executable will be larger because all the libraries (for which there are static libraries) are quite literally linked into the executable program. It's basically including the libraries' object files together with your own object files. Actually, on POSIX platforms static libraries are just archives of object files.

How to link a function which is never referenced in C

The linker's default behavior is to exclude all functions which are never referenced. However, I want to include such a function for debugging purposes: when the program runs abnormally, I can manually set the PC to the address of this function and have it output some information.
Is there a way to do so?
The linker's default behavior is to exclude all functions which are never referenced.
That statement is false for all linkers I am familiar with: if you explicitly list the object file in which foo() is defined on the link line, then foo() is always included in the executable or shared library being linked (well, except when you compile with -ffunction-sections and specify -Wl,--gc-sections on the link line).
It is true that if foo() is defined in an object file, and that object file is in an archive library, and that object file does not satisfy any references to any symbols it defines coming from other files already being used in the link, then that object file will not be pulled into the link.
The solution then is to either
list foo.o explicitly on the link line, or
use -Wl,--whole-archive -lfoo -Wl,--no-whole-archive (or equivalent flags for your linker, if it has them), or
add -u foo to force foo.o to be pulled into the link.
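A sketch of what such a debug-only function might look like follows. The file name, the function name debug_dump_state, the archive name libdebug.a and the link command in the comment are all assumptions for illustration; -u is the option named in the list above.

```c
/* debug_hook.c: hypothetical file; every name here is an assumption.
 *
 * Nothing in the program calls debug_dump_state(). If debug_hook.o is
 * listed on the link line, it is linked in anyway. If it lives inside
 * an archive, the member is only pulled in when something asks for the
 * symbol, for example (assumed names libdebug.a and app):
 *   gcc -o app main.o -u debug_dump_state -L. -ldebug
 */
#include <stdio.h>

/* Prints some state to stderr and returns how many times it has run. */
int debug_dump_state(void)
{
    static int dumps;   /* number of dumps requested so far */

    dumps++;
    fprintf(stderr, "debug dump #%d\n", dumps);
    return dumps;
}
```

The address to jump to from the debugger can then be looked up in the linked binary, for instance with nm.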
First create an object file of that c file which contains the function.
gcc -c <Yourfile_Name>.c -o functions.o
And use ld to create the executable. I don't know how to do this with Microsoft Visual Studio Compiler.

Statically linking against LAPACK

I'm attempting to do a release of some software and am currently working through a script for the build process. I'm stuck on something I never thought I would be: statically linking LAPACK on x86_64 Linux. During configuration AC_SEARCH_LIB([main],[lapack]) works, but compilation of the LAPACK units does not work, for example undefined reference to 'dsyev_'. No LAPACK/BLAS routine goes unnoticed.
I've confirmed I have the libraries installed and even compiled them myself with the appropriate options to make them static with the same results.
Here is an example I had used in my first experience with LAPACK a few years ago that works dynamically, but not statically: http://pastebin.com/cMm3wcwF
The two methods I'm using to compile are the following,
gcc -llapack -o eigen eigen.c
gcc -static -llapack -o eigen eigen.c
Your linking order is wrong. Link libraries after the code that requires them, not before. Like this:
gcc -o eigen eigen.c -llapack
gcc -static -o eigen eigen.c -llapack
That should resolve the linkage problems.
To answer the subsequent question why this works, the GNU ld documentation says this:
It makes a difference where in the command you write this option; the
linker searches and processes libraries and object files in the order
they are specified. Thus, `foo.o -lz bar.o' searches library `z' after
file foo.o but before bar.o. If bar.o refers to functions in `z',
those functions may not be loaded.
........
Normally the files found this way are library files—archive files
whose members are object files. The linker handles an archive file by
scanning through it for members which define symbols that have so far
been referenced but not defined. But if the file that is found is an
ordinary object file, it is linked in the usual fashion.
That is, the linker is going to make one pass through a file looking for unresolved symbols, and it follows files in the order you provide them (i.e. "left to right"). If you have not yet specified a dependency when a file is read, the linker will not be able to satisfy the dependency. Every object in the link list is parsed only once.
Note also that GNU ld can do reordering in cases where circular dependencies are detected when linking shared libraries or object files. But static libraries are only parsed for unknown symbols once.

Restricting symbols in a Linux static library

I'm looking for ways to restrict the number of C symbols exported to a Linux static library (archive). I'd like to limit these to only those symbols that are part of the official API for the library. I already use 'static' to declare most functions as static, but this restricts them to file scope. I'm looking for a way to restrict the scope to the library.
I can do this for shared libraries using the techniques in Ulrich Drepper's How to Write Shared Libraries, but I can't apply these techniques to static archives. In his earlier Good Practices in Library Design paper, he writes:
The only possibility is to combine all object files which need
certain internal resources into one using 'ld -r' and then restrict the symbols
which are exported by this combined object file. The GNU linker has options to
do just this.
Could anyone help me discover what these options might be? I've had some success with 'strip -w -K prefix_*', but this feels brutish. Ideally, I'd like a solution that will work with both GCC 3 and 4.
Thanks!
I don't believe GNU ld has any such options; Ulrich must have meant objcopy, which has many such options: --localize-hidden, --localize-symbol=symbolname, --localize-symbols=filename.
The --localize-hidden option in particular allows very fine control over which symbols are exposed. Consider:
int foo() { return 42; }
int __attribute__((visibility("hidden"))) bar() { return 24; }
gcc -c foo.c
nm foo.o
000000000000000b T bar
0000000000000000 T foo
objcopy --localize-hidden foo.o bar.o
nm bar.o
000000000000000b t bar
0000000000000000 T foo
So bar() is no longer exported from the object (even though it is still present and usable for debugging). You could also remove bar() altogether with objcopy --strip-unneeded.
Static libraries cannot do what you want for code compiled with either GCC 3.x or 4.x.
If you can use shared objects (libraries), the GNU linker does what you need with a feature called a version script. This is usually used to provide version-specific entry points, but the degenerate case just distinguishes between public and private symbols without any versioning. A version script is specified with the --version-script= command line option to ld.
The contents of a version script that makes the entry points foo and bar public and hides all other interfaces:
{ global: foo; bar; local: *; };
See the ld doc at: http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION
I'm a big advocate of shared libraries, and this ability to limit the visibility of globals is one of their great virtues.
A document that provides more of the advantages of shared objects, but written for Solaris (by Greg Nakhimovsky of happy memory), is at http://developers.sun.com/solaris/articles/linker_mapfiles.html
I hope this helps.
The merits of this answer will depend on why you're using static libraries. If it's to allow the linker to drop unused objects later then I have little to add. If it's for the purpose of organisation - minimising the number of objects that have to be passed around to link applications - this extension of Employed Russian's answer may be of use.
At compile time, the visibility of all symbols within a compilation unit can be set using:
-fvisibility=hidden
-fvisibility=default
This implies one can compile a single file "interface.c" with default visibility and a larger number of implementation files with hidden visibility, without annotating the source. A relocatable link will then produce a single object file where the non-api functions are "hidden":
ld -r interface.o implementation0.o implementation1.o -o relocatable.o
The combined object file can now be subjected to objcopy:
objcopy --localize-hidden relocatable.o mylibrary.o
Thus we have a single object file "library" or "module" which exposes only the intended API.
The above strategy interacts moderately well with link time optimisation. Compile with -flto and perform the relocatable link by passing -r to the linker via the compiler:
gcc -fuse-linker-plugin -flto -nostdlib -Wl,-r {objects} -o relocatable.o
Use objcopy to localise the hidden symbols as before, then call the linker a final time to strip the local symbols and whatever other dead code it can find in the post-lto object. Sadly, relocatable.o is unlikely to have retained any lto related information:
gcc -nostdlib -Wl,-r,--discard-all relocatable.o mylibrary.o
Current implementations of lto appear to be active during the relocatable link stage. With lto on, the hidden=>local symbols were stripped by the final relocatable link. Without lto, the hidden=>local symbols survived the final relocatable link.
Future implementations of lto seem likely to preserve the required metadata through the relocatable link stage, but at present the outcome of the relocatable link appears to be a plain old object file.
This is a refinement of the answers from EmployedRussian and JonChesterfield, which may be helpful if you're generating both dynamic and static libraries.
Start with the standard mechanism for hiding symbols in DSOs (the dynamic version of your lib). Compile all files with -fvisibility=hidden. In the header file which defines your API, change the declarations of the classes and functions you want to make public:
#define DLL_PUBLIC __attribute__ ((visibility ("default")))
extern DLL_PUBLIC int my_api_func(int);
See here for details. This works for both C and C++. This is sufficient for DSOs, but you'll need to add these build steps for static libraries:
ld -r obj1.o obj2.o ... objn.o -o static1.o
objcopy --localize-hidden static1.o static2.o
ar -rcs mylib.a static2.o
The ar step is optional - you can just link against static2.o.
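A minimal sketch of an implementation file that fits this scheme is shown here. It reuses DLL_PUBLIC and my_api_func from above; the file name mylib.c and the helper scale_helper are hypothetical, and the comments assume the file is compiled with -fvisibility=hidden.

```c
/* mylib.c: hypothetical implementation file, assumed to be compiled
 * with -fvisibility=hidden as described above. */

#define DLL_PUBLIC __attribute__ ((visibility ("default")))

/* Public API: explicitly marked, so it keeps default visibility. */
extern DLL_PUBLIC int my_api_func(int);

/* Helper meant to be shared between the library's object files only.
 * It carries no annotation, so under -fvisibility=hidden it is hidden,
 * and objcopy --localize-hidden turns it into a local symbol. */
int scale_helper(int x)
{
    return 2 * x;
}

int my_api_func(int x)
{
    return scale_helper(x) + 1;
}
```

After the ld -r and objcopy steps, nm on the resulting object should list my_api_func as a global symbol and scale_helper as a local one.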
My way of doing it is to mark everything that is not to be exported with INTERNAL, include guard all .h files, compile dev builds with -DINTERNAL=, and compile release builds with a single .c file that includes all other library .c files with -DINTERNAL=static.
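As a rough sketch of that scheme, one library source file might look like this. The file and function names are hypothetical, and the #ifndef fallback is an addition that is not part of the description above; it only lets the file compile when no -DINTERNAL=... is given.

```c
/* mylib_part.c: hypothetical library source file.
 *
 * Dev build (assumed):     gcc -DINTERNAL= -c mylib_part.c
 * Release build (assumed): gcc -DINTERNAL=static -c mylib_all.c
 *   where mylib_all.c does nothing but #include every library .c file,
 *   so INTERNAL functions become static within that single unit.
 */
#ifndef INTERNAL
#define INTERNAL   /* fallback added for this sketch: external linkage */
#endif

/* Not part of the API: marked INTERNAL. */
INTERNAL int clamp_to_byte(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Part of the API: left unmarked, so always exported. */
int mylib_brightness(int base, int delta)
{
    return clamp_to_byte(base + delta);
}
```

The include guards mentioned above matter in the release build, since the single .c file ends up including the same headers many times.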
