Undefined reference when linking with shared object - C

I'm investigating the topic of shared libraries. The way I understood it, when linking a source file with a shared library to form an executable, unresolved symbols remain unresolved until their first call, at which point lazy binding resolves them. Based on that, I assumed that using a function that isn't defined anywhere would not cause a linker error, since the resolving job is left to the dynamic linker. But when I typed the following commands in the terminal:
gcc -c foo.c -fPIC
gcc -shared foo.o -o libfoos.so
gcc main.c -Wl,-rpath=. libfoos.so
I got an "undefined reference to 'foo2' " error.
This was all done with the following files in the same directory:
foo.h:
#ifndef __FOO_H__
#define __FOO_H__
int foo(int num);
#endif /* __FOO_H__ */
main.c:
#include <stdio.h>
#include "foo.h"
int main(void)
{
    int a = 5;

    printf("%d * %d = %d\n", a, a, foo(a));
    printf("%d + %d = %d\n", a, a, foo2(a));
    return (0);
}
and foo.c:
#include "foo.h"
int foo(int num)
{
    return (num * num);
}
So my questions are:
Is it true that symbols remain unresolved until they are called for the first time? If so, then how come I'm getting an error at linking time?
I'm guessing that maybe some check needs to be made for the very existence of the symbols (foo and foo2 in my example) in the shared library, already at link time. If so, then why not resolve them at the same time, since we're accessing some information in the library anyway?
Thanks!

Is it true that symbols remain unresolved until they are called for the first time?
I think you may be confusing the requirements and semantics of the source language (C) with the execution semantics of dynamic shared object formats and implementations, such as ELF.
The C language does not specify when symbols are resolved, only that there must be a definition for each identifier that is used to access an object or call a function.
Different DSO formats have different properties in and around this. With ELF, for example, resolution of dynamic symbols can be deferred until the symbol is first referenced, or it can be performed immediately upon loading the DSO. This is configurable both at runtime and at compile time. The semantics of other DSO formats may be different in this and other regards.
Bottom line: no, it is not necessarily true that dynamic symbols are resolved only when they are first referenced, but that might be the default for your particular implementation and environment.
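For example, with the GNU toolchain you can request one behavior or the other, either at link time or at run time (a sketch using the question's files; defaults vary, and many modern distributions already link with -z now):

gcc main.c -Wl,-rpath=. libfoos.so -Wl,-z,now     # bind all dynamic symbols at load time
gcc main.c -Wl,-rpath=. libfoos.so -Wl,-z,lazy    # request lazy binding (the traditional default)
LD_BIND_NOW=1 ./a.out                             # override at run time: bind everything at load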
If so, then how come I'm getting an error at linking time?
The linker is checking the C language requirements at build time. It is perfectly reasonable, and in fact desirable, for it to do so even when building shared objects: if an unresolvable symbol is used, one would like to know about the problem and fix it before people try to use the program. This is unrelated to whether dynamic symbol resolution is deferred at runtime.
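You can even ask GNU ld to skip that check, which moves the failure to run time (a sketch; with lazy binding on a glibc system, the first printf succeeds and the program aborts when foo2 is first called):

gcc main.c -Wl,-rpath=. libfoos.so -Wl,--unresolved-symbols=ignore-all
./a.out    # prints "5 * 5 = 25", then: symbol lookup error: ./a.out: undefined symbol: foo2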
I'm guessing that maybe some check needs to be made for the very existence of the symbols (foo and foo2 in my example) in the shared library, already at link time.
Yes, that's basically it.
If so, then why not resolve them at the same time, since we're accessing some information in the library anyway?
How do you know that doesn't happen?
In a DSO system that does not feature symbol relocation, that can be done and is done. The dynamism in such a DSO system is primarily in whether a given library is loaded at all. DSOs in such a system have fixed load addresses and the symbols exported from them also have fixed addresses. This allows executables to be (much) smaller and for system memory to be used (much) more efficiently, relative to statically-linked executables.
But there are big practical problems with such an approach. For example, you have to contend with address-space collisions between different DSOs, updating DSOs is difficult and risky, and having well-known addresses is a security risk. Therefore, most modern DSO systems feature symbol relocation. In such a system, DSOs' load addresses are determined dynamically, at runtime, and typically even the relative offsets represented by their exported symbols are not fixed. This is the kind of DSO system that supports deferred symbol resolution, and with such a system, symbols from other DSOs cannot be resolved at build time because their addresses are not known until run time; they might even vary from run to run.

Related

Make unresolved linking dependencies reported at runtime instead of at compilation/program load time for the purposes of unit testing

I have a home-grown unit testing framework for C programs on Linux using GCC. For each file in the project, say foobar.c, a matching file foobar-test.c may exist. If it does, both files are compiled and statically linked together into a small executable foobar-test, which is then run. foobar-test.c is expected to contain main(), which calls all the unit test cases defined in foobar-test.c.
Let's say I want to add a new test file barbaz-test.c to exercise sort() inside an existing production file barbaz.c:
// barbaz.c
#include "barbaz.h"
#include "log.h" // declares log() as a linking dependency coming from elsewhere

int func1() { ... res = log(); ... }
int func2() { ... res = log(); ... }
int sort() { ... }
Besides sort() there are several other functions in the same file which call into log() defined elsewhere in the project.
The functionality of sort() does not depend on log(), so testing it will never reach log(). Neither func1() nor func2() requires testing, and neither will be reachable from the new test case I am about to prepare.
However, the barbaz-test executable cannot be successfully linked until I provide stub implementations of all dependencies coming from barbaz.c. A usual stub looks like this:
// in barbaz-test.c
#include <assert.h>   // for assert()
#include <stdbool.h>  // for false
#include "barbaz.h"
#include "log.h"

int log() {
    assert(false && "stub must not be reached");
    return 0;
}
// Actual test case for sort() starts here
...
If barbaz.c is large (which is often the case for legacy code written with no regard for testability), it will contain many linking dependencies. I cannot start writing a test case for sort() until I provide stubs for all of them. Additionally, it creates the burden of maintaining these stubs, i.e. updating their prototypes whenever the production counterpart is updated, not forgetting to delete stubs which are no longer required, etc.
What I am looking for is an option to have late runtime binding performed for missing symbols, similarly to how it is done in dynamic languages, but for C. If an unresolved symbol is reached during the test execution, that should lead to a failure. Having a proper diagnostic about the reason would be ideal, but a simple NULL pointer dereference would be good enough.
My current solution is to automate the initial generation of the stubs' source code. It is done by analyzing linker error messages and then looking up declarations for the missing symbols in the headers. This happens in an ad-hoc manner, e.g. it involves "parsing" C code with regular expressions.
Needless to say, it is very fragile: it depends on the specific format of linker error messages and on uniformly formatted function declarations for the regexps to recognize. It does not solve the future maintenance burden such stubs create, either.
Another approach is to collect stubs for the most "popular" linking dependencies into a common object file which is then always linked into the test executables. This leaves a shorter list of "unique" dependencies requiring attention for each new file. This approach breaks down when a slightly specialized version of a common stub function has to be prepared. In such cases linking would fail with "the same symbol defined twice".
I may have stumbled on a solution myself, inspired by this discussion: Why can't ld ignore an unused unresolved symbol?
The linker can certainly determine that certain link-time dependencies are unreachable. But by default it cannot remove them, because the compiler puts all function symbols into the same ELF section. The linker is not allowed to modify sections, but it is allowed to drop whole sections.
A solution would be to add -fdata-sections and -ffunction-sections to compiler flags, and --gc-sections to linker flags.
The former options create one section per function (and per data item) during compilation. The latter allows the linker to remove unreachable sections.
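Applied to the example above, the build might look like this (a sketch, assuming the file names used earlier): since sort() is the only production function the test reaches, the sections holding func1() and func2(), and with them the references to log(), are discarded.

gcc -c -ffunction-sections -fdata-sections barbaz.c
gcc -c barbaz-test.c
gcc -Wl,--gc-sections -o barbaz-test barbaz.o barbaz-test.o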
I do not think these flags can be safely used in a project without doing some benchmarking of the effects first. They affect size/speed of the production code.
man gcc says:
Only use these options when there are significant benefits from doing so. When you specify these options, the assembler and linker create larger object and executable files and are also slower. These options affect code generation. They prevent optimizations by the compiler and assembler using relative locations inside a translation unit since the locations are unknown until link time.
And it goes without saying that the solution only applies to the GCC/GNU Binutils toolchain.

Why do we need the shared library during compile time

Why do we need the shared library to be present when building my executable? My reasoning is that since the shared library is not included in my executable and is loaded at runtime, it should not be needed at compile time. Or am I missing something?
#include <stdio.h>

int addNumbers(int, int); // prototype should be enough, no?

int main(int argc, char* argv[])
{
    int sum = addNumbers(1, 2);
    printf("sum is %d\n", sum);
    return 0;
}
I had libfoo.so in my current directory, but when I renamed it to libfar.so I found that the shared library really is needed at build time, or the program doesn't build:
gcc -o main main.c -L. -lfoo gives main.c:(.text+0x28): undefined reference to 'addNumbers'
I think it should be enough to only have the name of the shared library. The shared library itself is not needed since it is found in the LD_LIBRARY_PATH and loaded dynamically at runtime. Is there something else needed other than the name of the shared lib?
Nothing is needed at compile time proper, because C has a notion of separate compilation of translation units. But once all the different sources have been compiled, it is time to link everything together. The notion of a shared library is not present in the standard, but it is now a common thing, so here is how a common linker proceeds:
it looks in all compiled modules for identifiers with external linkage either defined or only declared
it looks in libraries (both static and dynamic) for identifiers already used and not defined. It then links in the modules from static libraries and stores references to dynamic libraries. But, at least on Unix-likes, it needs to access the shared library for the required (declared and not defined) identifiers, in order to make sure they are defined there or can be found in other linked libraries, be they static or dynamic
This produces the executable file. Then, at load time, the dynamic loader knows all the dynamic modules that are required; it loads them into memory (if they are not already there) along with the actual executable, and builds a (virtual) memory map.
gcc -o main main.c -L. -lfoo
This command does (at least) two steps: compile main.c into an object file and link all resources into an executable main. The error you see is from the last step, the linker.
The linker is responsible for generating the final executable machine code. It requires the shared object library because it needs to generate the machine code which loads it and executes any functions used in it.
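You can observe the split yourself by performing the two steps separately (using the question's files):

gcc -c main.c                   # compilation: the prototype alone is enough here
gcc -o main main.o -L. -lfoo    # linking: this step needs libfoo.so and is where the error is reported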

Undefined reference to symbol on a static library

I'm trying to compile a binary linking it with a static library libfoo.a:
gcc -L. -o myapp myapp.o -lfoo
But I'm getting the following error from the linker:
libfoo.c:101: undefined reference to `_TRACE'
The problem is I don't have the source code for libfoo.a library.
I tried to get the reference for _TRACE symbol in the library and I got this:
nm libfoo.a | grep TRACE
U _TRACE
Assuming that _TRACE will not affect the inner workings in libfoo.a, is it possible to get the linker to define some placeholder value for this symbol so that I can compile my code?
Assuming that _TRACE will not affect the inner workings in libfoo.a
That seems an unreasonably hopeful assumption.
is it possible to get the linker to define some placeholder value for this symbol so that I can compile my code?
The first thing to do is to check libfoo's documentation. It is unusual for a static library to depend on a symbol that the user is expected to define; in fact, such an arrangement does not work cleanly with traditional linkers. I see several plausible explanations:
1. You need to link some other (specific) library after libfoo to provide a definition for that symbol.
2. Code that uses libfoo is expected to #include an associated header, and that header provides a tentative definition of _TRACE.
3. Programs that use libfoo are required to be built with a specific toolchain and maybe specific options.
4. It's just broken.
Only in case (4) is it appropriate to try to manually provide a definition for the symbol in question, and in that case your best bet is to proceed more or less as in case (1), by building an object that provides the definition and linking it after the library. Of course, that leaves you trying to guess what the definition should be.
If _TRACE is a global variable then defining it as an intmax_t with initial value 0 might work even if the library expects a different-size integer or one with different signedness. If it is supposed to be a function, however, then you're probably toast. There are too many signatures it could have, too many possible expectations for behavior. There is no reason to think that you could provide a suitable place-holder.
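For completeness, that data-symbol gamble would look something like this (hypothetical; only worth trying if _TRACE turns out to be an object rather than a function):

#include <stdint.h>

/* placeholder definition, linked after libfoo.a; size and signedness are guesses */
intmax_t _TRACE = 0;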
As I suspected, the _TRACE function is a sort of debugging function. And I was right to assume it would not affect the inner workings of libfoo.a.
I solved the problem defining the _TRACE function as:
int _TRACE(char *fmt, ...) { (void)fmt; return 0; }
Of course, this solution is only temporary and cannot be used in production, but it fits my purposes of compiling the code.
If you're using GCC 5.1 or later and/or C++11, be aware that there was an ABI change.
You can spot this issue using nm -C: if the symbol is defined (not U) but has [abi:cxx11] appended to it, then it was compiled with the new ABI. From the link:
If you get linker errors about undefined references to symbols that involve types in the std::__cxx11 namespace or the tag [abi:cxx11] then it probably indicates that you are trying to link together object files that were compiled with different values for the _GLIBCXX_USE_CXX11_ABI macro.
If you have access to the source code (not your case specifically), you can use -Wabi-tag -D_GLIBCXX_USE_CXX11_ABI=0, the latter forcing the compiler not to use the new ABI.
All your code, including libraries, should be consistent.
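For instance, a function returning std::string demangles with the tag when built with the new ABI (hypothetical symbol name and object file):

$ nm -C widget.o | grep get_name
0000000000000013 T get_name[abi:cxx11]()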

How does the compiler know where my main function is?

I am working on a project that contains multiple modules (source files, header files, libraries). One of the files in all that soup contains my main function.
My questions are:
How does the compiler know which modules to compile and which not?
How does the compiler recognize the module with main() inside?
The compiler itself doesn't care which file contains which functions; main() is not special to it. In the linking stage, however, all these symbols from the different files (compilation units) are matched up. The linker links in a small piece of startup code that the OS invokes when you run the program; that code in turn calls your main(), so the linker looks for a main among all the files. If there is none, you get an unresolved-symbol error, exactly as if you had called a function that you forgot to implement.
The same rules as for any other function apply to main(): you can have only one definition. If two files that get linked together both define main(), you get a linker error, because the linker can't decide which one to use.
How does the compiler know which modules to compile and which not?
It does not. You tell it which ones you want to compile, typically through the compile command(s) present in a makefile.
How does the compiler recognize the module with main() inside?
Altogether it's a big process, already answered in this related question.
To summarize: when compiling a program against the standard C library, the entry point of your program is set to _start, which internally contains a reference to main(). So at compilation time there is no need to check for the presence of main(). At link time, however, the linker must be able to locate exactly one instance of main() to link against. That way, main() serves as the entry point to your program.
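On a glibc-based system you can see that reference directly in the startup object (the path and the object's name vary by distribution):

$ nm /usr/lib/x86_64-linux-gnu/crt1.o | grep -w main
                 U main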
So, to answer
How does the compiler know where my main function is?
It does not, and need not. That is the job of the linker, specifically.
The assembly code (often referred to as startup code by embedded people) that starts up the program specifically calls main().
The prototype for main() is included in compiler documentation.
When you compile a program, an object file is produced. The object file from your source code is then linked with a startup runtime component (usually called crt0.o[bj]) and the C library components, etc.
If main() is changed to a signature the toolchain doesn't recognize, the link step will complain about an unresolved external reference to _main or __main.
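You can provoke this kind of error deliberately by leaving out main() entirely (the exact message varies by toolchain; on a glibc/GNU ld system it names plain main):

$ echo 'int mian(void) { return 0; }' > oops.c    # typo: no main()
$ gcc oops.c
/usr/bin/ld: crt1.o: in function `_start': undefined reference to `main'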

How do linkers decide what parts of libraries to include?

Assume library A has a() and b(). If I link my program B with A and call a(), does b() get included in the binary? Does the compiler see if any function in the program call b() (perhaps a() calls b() or another lib calls b())? If so, how does the compiler get this information? If not, isn't this a big waste of final compile size if I'm linking to a big library but only using a minor feature?
Take a look at link-time optimization. This is necessarily vendor-dependent. It will also depend on how you build your binaries. MS compilers (2005 onwards, at least) provide something called function-level linking, which is another way of stripping symbols you don't need. This post explains how the same can be achieved with GCC (it is old, and GCC must have moved on, but the content is relevant to your question).
Also take a look at the LLVM implementation (and the examples section).
I suggest you also take a look at Linkers and Loaders by John Levine -- an excellent read.
It depends.
If the library is a shared object or DLL, then everything in the library is loaded, but at run time. The cost in extra memory is (hopefully) offset by sharing the library (really, the code pages) between all the processes in memory that use that library. This is a big win for something like libc.so, less so for myreallyobscurelibrary.so. But you probably aren't asking about shared objects, really.
Static libraries are simply a collection of individual object files, each the result of a separate compilation (or assembly), and possibly not even written in the same source language. Each object file has a number of exported symbols, and almost always a number of imported symbols.
The linker's job is to create a finished executable that has no remaining undefined imported symbols. (I'm lying, of course, if dynamic linking is allowed, but bear with me.) To do that, it starts with the modules named explicitly on the link command line (and possibly implicitly in its configuration) and assumes that any module named explicitly must be part of the finished executable. It then attempts to find definitions for all of the undefined symbols.
Usually, the named object modules expect to get symbols from some library such as libc.a.
In your example, you have a single module that calls the function a(), which will result in the linker looking for a module that exports a().
You say that the library named A (on unix, probably libA.a) offers a() and b(), but you don't specify how. You implied that a() and b() do not call each other, which I will assume.
If libA.a was built from a.o and b.o where each defines the corresponding single function, then the linker will include a.o and ignore b.o.
However, if libA.a included ab.o that defined both a() and b() then it will include ab.o in the link, satisfying the need for a(), and including the unused function b().
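To see the per-object granularity concretely (a sketch; the file layout is hypothetical):

# a.c defines a(), b.c defines b()
gcc -c a.c b.c
ar rcs libA.a a.o b.o
gcc -o prog main.c -L. -lA    # pulls in a.o to satisfy a(); b.o is never included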
As others have mentioned, there are linkers that are capable of splitting individual functions out of modules, and including only those that are actually used. In many cases, that is a safe thing to do. But it is usually safest to assume that your linker does not do that unless you have specific documentation.
Something else to be aware of is that most linkers make as few passes as they can through the files and libraries that are named on the command line, and build up their symbol table as they go. As a practical matter, this means that it is good practice to always specify libraries after all of the object modules on the link command line.
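For example, with such a single-pass linker the two command lines below are not equivalent (continuing the hypothetical libA.a from above):

gcc -o prog main.o -L. -lA    # works: a() is still undefined when libA.a is scanned
gcc -o prog -L. -lA main.o    # may fail: libA.a is scanned before a() is needed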
It depends on the linker.
e.g. Microsoft Visual C++ has an option "Enable function level linking", so you can enable it manually.
(I assume they have a reason for not just enabling it all the time...maybe linking is slower or something)
Usually, (static) libraries are composed of objects created from source files. What linkers usually do is include an object if a function provided by that object is referenced. If your source file contains only one function, then only that function will be brought in by the linker. There are more sophisticated linkers out there, but most C-based linkers still work as outlined. There are tools available that split C source containing multiple functions into artificially smaller source files, to make static linking more fine-grained.
If you are using shared libraries, then you don't impact your compiled size by using more or fewer of them. However, your runtime size will include them.
This lecture at Academic Earth gives a pretty good overview; linking is discussed in the later half of the talk, IIRC.
Without any optimization, yes, it'll be included. The linker, however, might be able to optimize it out by statically analyzing the code and removing unreachable code.
It depends on the linker, but in general only functions that are actually called get included in the final executable. The linker works by looking up the function name in the library and then using the code associated with the name.
There are very few books on linkers, which is strange when you think how important they are. The text for a good one can be found here.
It depends on the options passed to the linker, but typically the linker will leave out the object files in a library that are not referenced anywhere.
$ cat foo.c
int main(){}
$ gcc -static foo.c
$ size
   text    data     bss     dec     hex filename
 452659    1928    6880  461467   70a9b a.out
# force linking of libz.a even though it isn't used
$ gcc -static foo.c -Wl,--whole-archive -lz -Wl,--no-whole-archive
$ size
   text    data     bss     dec     hex filename
 517951    2180    6844  526975   80a7f a.out
It depends on the linker and how the library was built. Usually libraries are a combination of object files (import libraries are a major exception to this). Older linkers would pull things into the output file image at a granularity of the object files that were put into the library. So if function a() and function b() were both in the same object file, they would both be in the output file - even if only one of the 2 functions were actually referenced.
This is a reason why you'll often see library-oriented projects with a policy of a single C function per source file. That way each function is packaged in its own object file and linkers have no problem pulling in only what is referenced.
Note however that newer linkers (certainly newer Microsoft linkers) have the ability to pull in only parts of object files that are referenced, so there's less of a need today to enforce a one-function-per-source-file policy - though there are reasonable arguments that that should be done anyway for maintainability.
