How to deal with symbol collisions between statically linked libraries? - c

One of the most important rules and best practices when writing a library is to put all symbols of the
library into a library-specific namespace. C++ makes this easy with the namespace keyword. In
C the usual approach is to prefix the identifiers with some library-specific prefix.
Rules of the C standard put some constraints on those (for safe compilation): A C compiler may look at only the first
8 characters of an identifier, so foobar2k_eggs and foobar2k_spam may validly be interpreted as the same
identifier. However, every modern compiler allows for arbitrarily long identifiers, so in our times
(the 21st century) we should not have to bother about this.
But what if you're facing some libraries whose symbol names / identifiers you cannot change? Maybe you got
only a static binary and the headers, or you don't want to, or are not allowed to, adjust and recompile it yourself.

At least in the case of static libraries you can work around it quite conveniently.
Consider the headers of two libraries, foo and bar. For the sake of this tutorial I'll also give you the source files.
example/ex01/foo.h
int spam(void);
double eggs(void);
example/ex01/foo.c (this may be opaque/not available)
int the_spams;
double the_eggs;
int spam()
{
    return the_spams++;
}
double eggs()
{
    return the_eggs--;
}
example/ex01/bar.h
int spam(int new_spams);
double eggs(double new_eggs);
example/ex01/bar.c (this may be opaque/not available)
int the_spams;
double the_eggs;
int spam(int new_spams)
{
    int old_spams = the_spams;
    the_spams = new_spams;
    return old_spams;
}
double eggs(double new_eggs)
{
    double old_eggs = the_eggs;
    the_eggs = new_eggs;
    return old_eggs;
}
We want to use those in a program foobar
example/ex01/foobar.c
#include <stdio.h>
#include "foo.h"
#include "bar.h"
int main()
{
    const int new_bar_spam = 3;
    const double new_bar_eggs = 5.0f;
    printf("foo: spam = %d, eggs = %f\n", spam(), eggs() );
    printf("bar: old spam = %d, new spam = %d ; old eggs = %f, new eggs = %f\n",
        spam(new_bar_spam), new_bar_spam,
        eggs(new_bar_eggs), new_bar_eggs );
    return 0;
}
One problem becomes apparent immediately: C doesn't know overloading. So we have two pairs of functions with
identical names but different signatures, and we need some way to distinguish them. Anyway, let's see what a
compiler has to say about this:
example/ex01/ $ make
cc -c -o foobar.o foobar.c
In file included from foobar.c:4:
bar.h:1: error: conflicting types for ‘spam’
foo.h:1: note: previous declaration of ‘spam’ was here
bar.h:2: error: conflicting types for ‘eggs’
foo.h:2: note: previous declaration of ‘eggs’ was here
foobar.c: In function ‘main’:
foobar.c:11: error: too few arguments to function ‘spam’
foobar.c:11: error: too few arguments to function ‘eggs’
make: *** [foobar.o] Error 1
Okay, this was no surprise; it just told us what we already knew, or at least suspected.
So can we somehow resolve that identifier collision without modifying the original libraries'
source code or headers? In fact we can.
First let's resolve the compile-time issues. For this we surround the header includes with a
bunch of preprocessor #define directives that prefix all the symbols exported by the library.
Later we do this with some nice cozy wrapper header, but just for the sake of demonstrating
what's going on we're doing it verbatim in the foobar.c source file:
example/ex02/foobar.c
#include <stdio.h>
#define spam foo_spam
#define eggs foo_eggs
# include "foo.h"
#undef spam
#undef eggs
#define spam bar_spam
#define eggs bar_eggs
# include "bar.h"
#undef spam
#undef eggs
int main()
{
    const int new_bar_spam = 3;
    const double new_bar_eggs = 5.0f;
    printf("foo: spam = %d, eggs = %f\n", foo_spam(), foo_eggs() );
    printf("bar: old spam = %d, new spam = %d ; old eggs = %f, new eggs = %f\n",
        bar_spam(new_bar_spam), new_bar_spam,
        bar_eggs(new_bar_eggs), new_bar_eggs );
    return 0;
}
Now if we compile this...
example/ex02/ $ make
cc -c -o foobar.o foobar.c
cc foobar.o foo.o bar.o -o foobar
bar.o: In function `spam':
bar.c:(.text+0x0): multiple definition of `spam'
foo.o:foo.c:(.text+0x0): first defined here
bar.o: In function `eggs':
bar.c:(.text+0x1e): multiple definition of `eggs'
foo.o:foo.c:(.text+0x19): first defined here
foobar.o: In function `main':
foobar.c:(.text+0x1e): undefined reference to `foo_eggs'
foobar.c:(.text+0x28): undefined reference to `foo_spam'
foobar.c:(.text+0x4d): undefined reference to `bar_eggs'
foobar.c:(.text+0x5c): undefined reference to `bar_spam'
collect2: ld returned 1 exit status
make: *** [foobar] Error 1
... at first it looks like things got worse. But look closely: the compilation stage actually
went just fine. It's just the linker that is now complaining about colliding symbols,
and it tells us where this happens (object file, source file and offset into the .text section).
And as we can see, those symbols are unprefixed.
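To see in isolation what the #define / #undef bracket does at compile time, here is a single-file sketch in which two conflicting declarations stand in for foo.h and bar.h. The definitions of foo_spam and bar_spam, and the helper check_renamed_spam, are written out by hand purely as an assumption so that the file links on its own; in the real scenario the definitions have to come from the library object files.

```c
#include <assert.h>

/* Stand-in for '#include "foo.h"', bracketed by the renaming macro. */
#define spam foo_spam
int spam(void);            /* the compiler sees: int foo_spam(void); */
#undef spam

/* Stand-in for '#include "bar.h"'. */
#define spam bar_spam
int spam(int new_spams);   /* the compiler sees: int bar_spam(int); */
#undef spam

/* Hand-written definitions (assumption: normally supplied by the libraries). */
static int foo_the_spams;
static int bar_the_spams;

int foo_spam(void)
{
    return foo_the_spams++;
}

int bar_spam(int new_spams)
{
    int old_spams = bar_the_spams;
    bar_the_spams = new_spams;
    return old_spams;
}

/* Exercises both functions side by side; returns 0 if they behave as expected. */
int check_renamed_spam(void)
{
    assert(foo_spam() == 0);   /* first call returns the initial counter */
    assert(bar_spam(3) == 0);  /* returns the old value, stores 3 */
    return bar_spam(5) == 3 ? 0 : 1;
}
```

Both declarations now coexist in one translation unit because the preprocessor rewrote the names before the compiler proper ever saw them. The object files of the libraries are untouched by this, which is why the linker still has to be dealt with separately.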
Let's take a look at the symbol tables with the nm utility:
example/ex02/ $ nm foo.o
0000000000000019 T eggs
0000000000000000 T spam
0000000000000008 C the_eggs
0000000000000004 C the_spams
example/ex02/ $ nm bar.o
000000000000001e T eggs
0000000000000000 T spam
0000000000000008 C the_eggs
0000000000000004 C the_spams
So now we're challenged with the exercise to prefix those symbols in some opaque binary. Yes, I know
in the course of this example we have the sources and could change this there. But for now, just assume
you have only those .o files, or a .a (which actually is just a bunch of .o).
objcopy to the rescue
There is one tool particularly interesting for us: objcopy
objcopy works on temporary files, so we can use it as if it were operating in-place. There is one
option/operation called --prefix-symbols and you have 3 guesses what it does.
So let's throw this fella onto our stubborn libraries:
example/ex03/ $ objcopy --prefix-symbols=foo_ foo.o
example/ex03/ $ objcopy --prefix-symbols=bar_ bar.o
nm shows us that this seemed to work:
example/ex03/ $ nm foo.o
0000000000000019 T foo_eggs
0000000000000000 T foo_spam
0000000000000008 C foo_the_eggs
0000000000000004 C foo_the_spams
example/ex03/ $ nm bar.o
000000000000001e T bar_eggs
0000000000000000 T bar_spam
0000000000000008 C bar_the_eggs
0000000000000004 C bar_the_spams
Let's try linking this whole thing:
example/ex03/ $ make
cc foobar.o foo.o bar.o -o foobar
And indeed, it worked:
example/ex03/ $ ./foobar
foo: spam = 0, eggs = 0.000000
bar: old spam = 0, new spam = 3 ; old eggs = 0.000000, new eggs = 5.000000
Now I leave it as an exercise to the reader to implement a tool/script that automatically extracts the
symbols of a library using nm, writes a wrapper header file of the structure
/* wrapper header wrapper_foo.h for foo.h */
#define spam foo_spam
#define eggs foo_eggs
/* ... */
#include <foo.h>
#undef spam
#undef eggs
/* ... */
and applies the symbol prefix to the static library's object files using objcopy.
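As a starting point for the header half of that exercise, here is a minimal sketch in C. It assumes the symbol names have already been extracted (for example from nm output) and are passed in as an array; the function name write_wrapper_header and its parameters are hypothetical, and the objcopy step is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes a wrapper header to 'out' that renames every symbol in 'symbols'
 * to '<prefix><symbol>' while 'header' is being included.
 * Returns 0 on success, -1 if writing to the stream failed.
 */
int write_wrapper_header(FILE *out, const char *header, const char *prefix,
                         const char *const *symbols, size_t count)
{
    size_t i;

    assert(out != NULL && header != NULL && prefix != NULL);
    assert(symbols != NULL || count == 0);
    assert(strlen(prefix) > 0);   /* an empty prefix would rename nothing */

    fprintf(out, "/* wrapper header for %s */\n", header);
    for (i = 0; i < count; i++)
        fprintf(out, "#define %s %s%s\n", symbols[i], prefix, symbols[i]);
    fprintf(out, "#include <%s>\n", header);
    for (i = 0; i < count; i++)
        fprintf(out, "#undef %s\n", symbols[i]);

    return ferror(out) ? -1 : 0;
}
```

Called with the header "foo.h", the prefix "foo_" and the symbols spam and eggs, this emits the same #define / #include / #undef structure as the wrapper shown above. Note that blindly prefixing every name reported by nm also renames data symbols such as the_spams, which matches what objcopy --prefix-symbols does to the object file.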
What about shared libraries?
In principle the same could be done with shared libraries. However, shared libraries, as the name says,
are shared among multiple programs, so messing with a shared library in this way is not such a good idea.
You will not get around writing a trampoline wrapper. Even worse, you cannot link against the shared library
on the object file level, but are forced to do dynamic loading. But this deserves its very own article.
Stay tuned, and happy coding.

Rules of the C standard put some constraints on those (for safe compilation): A C compiler may look at only the first 8 characters of an identifier, so foobar2k_eggs and foobar2k_spam may validly be interpreted as the same identifier. However, every modern compiler allows for arbitrarily long identifiers, so in our times (the 21st century) we should not have to bother about this.
This is not just an extension of modern compilers; the current C standard also requires the compiler to support reasonably long external names. Since C99, at least the first 31 characters of an external identifier are significant (63 for internal identifiers); C89 only guaranteed 6 for external names.
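To make the idea of "significant characters" concrete, the following sketch mimics a hypothetical translator that only compares the first few characters of two identifiers. The function identifiers_collide is invented for illustration and is not part of any toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if two identifiers are indistinguishable to a hypothetical
 * translator that only looks at the first 'significant' characters,
 * and 0 otherwise.
 */
int identifiers_collide(const char *a, const char *b, size_t significant)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, significant) == 0;
}
```

With a limit of 8, foobar2k_eggs and foobar2k_spam collide because they share the prefix foobar2k; with a limit of 31 they are distinct.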
But what if you're facing some libraries whose symbol names / identifiers you cannot change? Maybe you got only a static binary and the headers, or you don't want to, or are not allowed to, adjust and recompile it yourself.
Then you're stuck. Complain to the author of the library. I once encountered such a bug where users of my application were unable to build it on Debian due to Debian's libSDL linking libsoundfile, which (at least at the time) polluted the global namespace horribly with variables like dsp (I kid you not!). I complained to Debian, and they fixed their packages and sent the fix upstream, where I assume it was applied, since I never heard of the problem again.
I really think this is the best approach, because it solves the problem for everyone. Any local hack you do will leave the problem in the library for the next unfortunate user to encounter and fight with again.
If you really do need a quick fix, and you have source, you could add a bunch of -Dfoo=crappylib_foo -Dbar=crappylib_bar etc. to the makefile to fix it. If not, use the objcopy solution you found.

If you're using GCC, the --allow-multiple-definition linker switch (passed through the compiler driver as -Wl,--allow-multiple-definition) is a handy debugging tool. This hogties the linker into using the first definition (and not whining about it).
This has helped me during development when I have the source to a vendor-supplied library available and need to trace into a library function for some reason or other. The switch allows you to compile and link in a local copy of a source file and still link to the unmodified static vendor library. Don't forget to yank the switch back out of the make symbols once the voyage of discovery is complete. Shipping release code with intentional name space collisions is prone to pitfalls including unintentional name space collisions.

Related

How to create 4KB Linux binaries that render a 3D scene?

I just learned about the 4k demo scene contest. It consists of creating a 4KB executable which renders a nice 3D scene. The cited demo was built for Windows, so I was wondering how one could create 4KB OpenGL scenes on Linux.
A bare "hello world" already consumes 8KB:
$ cat ex.c
#include <stdio.h>
int main()
{
printf("Hello world\n");
}
$ gcc -Os ex.c -o ex
$ ls -l ex
-rwxrwxr-x 1 cklein cklein 8374 2012-05-11 13:56 ex
The main reason why you can't make a small tool with the standard settings is that a lot of symbols and references to standard libraries are pulled into your binary. You must be explicit to remove even that basic stuff.
Here's how I did it:
http://phresnel.org/gpl/4k/ntropy2k7/
Relevant Options:
Mostly self-explaining:
gcc main.c -o fourk0001 -Os -mfpmath=387 \
-mfancy-math-387 -fmerge-all-constants -fsingle-precision-constant \
-fno-math-errno -Wall -ldl -ffast-math -nostartfiles -nostdlib \
-fno-unroll-loops -fshort-double
Massage:
strip helps you get rid of unneeded symbols embedded in your binary:
strip -R .note -R .comment -R .eh_frame -R .eh_frame_hdr -s fourk0001
Code:
You may have to tweak and trial and error a lot. Sometimes, a loop gives smaller code, sometimes a call, sometimes a force inlined function. In my code, e.g., instead of having a clean linked list that contains all flame transforms in fancy polymorphic style, I have a fixed array where each element is a big entity containing all parameters, used or unused, as a union of all flames as per Scott Draves flame paper.
Your tricks won't be portable, other versions of g++ might give suboptimal results.
Note that with above parameters, you do not write a main() function, but rather a _start() function.
Also note that using libraries is a bit different. Instead of linking SDL and standard library functions the classy, convenient way, you must do it manually. E.g.
void *libSDL = dlopen( "libSDL.so", RTLD_LAZY );
void *libC = dlopen( "libc.so", RTLD_LAZY );
#if 1
SDL_SetVideoMode_t sym_SDL_SetVideoMode = dlsym(libSDL, "SDL_SetVideoMode");
g_sdlbuff = sym_SDL_SetVideoMode(WIDTH,HEIGHT,32,SDL_HWSURFACE|SDL_DOUBLEBUF);
#else
((SDL_SetVideoMode_t)dlsym(libSDL, "SDL_SetVideoMode"))(WIDTH,HEIGHT,32,SDL_HWSURFACE|SDL_DOUBLEBUF);
#endif
//> need malloc, probably kinda craft (we only use it once :| )
//> load some sdl cruft (cruft!)
malloc_t sym_malloc = dlsym( libC, "malloc" );
sym_rand = dlsym( libC, "rand" );
sym_srand = dlsym( libC, "srand" );
sym_SDL_Flip = dlsym(libSDL, "SDL_Flip");
sym_SDL_LockSurface = dlsym(libSDL, "SDL_LockSurface");
sym_SDL_UnlockSurface = dlsym(libSDL, "SDL_UnlockSurface");
sym_SDL_MapRGB = dlsym(libSDL, "SDL_MapRGB");
And even though no assembler has to be harmed, your code might yield UB.
edit:
Oops, I lied about assembly.
void _start() {
...
asm( "int $0x80" :: "a"(1), "b"(42) );
}
This will make your program exit with status 42 (on 32-bit x86 Linux, system call number 1 invoked via int $0x80 is exit).
A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux is an interesting article that goes through a step-by-step process to create an ELF executable as small as possible.
I don't want to spoil the ending, but the author gets it down to a lot smaller than 4K ;)
Take a look at this article in KSplice blog from a while back. It talks about linking without the standard libraries.
https://blogs.oracle.com/ksplice/entry/hello_from_a_libc_free

c compiler error with linking

Here is the error I get from the gcc call:
gcc -o rr4 shells2.c graph1.c rng.c;
Undefined symbols:
"_getdisc", referenced from:
_main in cckR7zjP.o
ld: symbol(s) not found
The "cckR7zjP.o" keeps changing every time I call the compiler. The code for the method is in the file graph1.c; its header file is called graph2.h, and I am importing it to the file with the main method called shells2.c using:
#include "graph2.h"
The method or function definition is:
int getdisc(int i){ return disc[i];}
which attempts to return the ith member of the array disc created by
static int *disc;
that I already initialized in some other method! I think the problematic call is:
for (iter = 0; iter < n; iter++) {
if (getdisc(iter) == cln)
avgbtwn += get_betweenness(iter);
}
This seems like a linker problem. I checked some other questions, and I think I am linking my method properly (and am using the same method elsewhere in the code), but I still can't figure this out.
Edit: So I switched the order of the command in linux to
gcc -o rr4 graph1.c rng.c shells2.c
as per Soren's suggestion and it compiled as normal. Does anyone know why?
Furthermore, it seems that putting a trailing line break at the end of the file graph1.c alleviates the problem.
There used to be an issue in the old GCC 2.x compilers/linkers where the linker couldn't resolve symbols that were not grouped together. Think of it this way: the linker would only look for symbols that were still needed, and it would drop symbols that were unused.
To most people the problem would manifest itself as a problem with the ordering of libraries (specified with -l or as .a).
I see from the comments that you use a Mac, so it might just be that the Mac version of the compiler/linker still has that problem. Anyway, since reordering the source files solved the problem, you certainly have some variation of this bug.
So, possible solutions:
Group all your source files into larger files (a bad solution, but the linker is less likely to fail with this symptom), or
Try to compile all the files to .o first and then link the .o files (using a makefile would usually do this, but may or may not resolve the problem) and possibly combine the .o files into a single .a (man ar), or
Change the order of the source files to have shells2.c last (which worked for you), or
See if upgrading your compiler helps.
Sorry for the long laundry list, but this is clearly just a compiler bug which just needs a simple workaround.
That's definitely an error with getdisc not being visible to the linker but, if what you say is correct, that shouldn't happen.
The gcc command line you have includes graph1.c, which you assure us contains the function.
Don't worry about the object file name; that's just a temporary name created by the compiler to pass to the linker.
Can you confirm (exact cut and paste) the gcc command line you're using, and show us the function definition with some context around it?
In addition, make sure that graph1.c is being compiled as expected by inserting immediately before the getdisc function, the following line:
xyzzy plugh twisty;
If your function is being seen by the compiler, that should cause an error first. It may be something like ifdef statements causing your code not to be compiled.
By way of testing, the following transcript shows that what you are trying to do works just fine:
pax> cat shells2.c
#include "graph2.h"
int main (void) {
int x = getdisc ();
return x;
}
pax> cat graph2.h
int getdisc (void);
pax> cat graph1.c
int getdisc (void) {
return 42;
}
pax> gcc -o rr4 shells2.c graph1.c
pax> ./rr4
pax> echo $?
42
We have to therefore assume that what you're actually doing is something different, and that's unusually tactful for me :-)
What you're experiencing is what would happen with something like:
pax> gcc -o rr4 shells2.c
/tmp/ccb4ZOpG.o: In function `main':
shells2.c:(.text+0xa): undefined reference to `getdisc'
collect2: ld returned 1 exit status
or if getdisc was not defined correctly in graph1.c.
That last case could be for many reasons including, but not limited to:
mis-spelling of getdisc.
#ifdef type statements meaning the definition is never seen (though you seem to have discounted that in a comment).
some wag using #define to change getdisc to something else (unlikely, but possible).

Using GotoBLAS2 with C

I'm sort of a newbie to C coding but I've written a Matlab program for simulating neural networks and I wish to translate it to C code because our supercomputer cluster won't allow running more than a few Matlab simulations at once. To that end, I've found GotoBLAS to take care of the matrix math.
Unfortunately I'm not sure how to use it as I don't have a lot of experience in C and using external libraries. I'm assuming that 'dgemm' is a function in GotoBLAS from reading the BLAS guide pdf. I've been able to successfully compile GotoBLAS, but when I do:
gcc -o outputprog main.c -Wall -L -lgoto2.a
I get the messages:
undefined reference to 'dgemm'
As I understand it, I should be including some .h file (or maybe not) from GotoBLAS but I'm not sure which one (or if this is right at all).
Any help with this would be appreciated. Let me know if more information is needed.
One problem could be that the -L option expects a 'directory' name after it, and therefore gcc (or the linker invoked by gcc) is treating -lgoto2.a as a directory. The compiler does not complain about non-existent directories; it simply ignores them. Which directory did you expect to find the library in? (For the purposes of this answer, I'll assume it is in /usr/local/lib.)
Another problem could be that the library is not called libgoto2.a.a or libgoto2.a.so or something similar. You would not normally specify the .a suffix. (For the purposes of this answer, I'll assume that the library is either libgoto2.a or libgoto2.so.)
It appears that you don't need to specify where the headers are found; that means they're in a sufficiently conventional location that the compiler looks there anyway. If that's correct, the library may be in a sufficiently conventional location too, and the -L option may be unnecessary.
So, you might be able to use:
gcc -Wall -o outputprog main.c -lgoto2
Or you might need to use:
gcc -Wall -o outputprog main.c -L/usr/local/lib -lgoto2
After some extensive discussion in the comments, and the information that the library is in the current directory and named libgoto2.a and that the symbol dgemm is still missing, I downloaded GotoBLAS2 version 1.13 and tried to compile it on a semi-supported platform (MacOS X, probably pretending to be Linux, with x86_64 architecture). The build was not completely successful; there were problems in some assembler code. However, poking around at the headers, there is one that looks like it holds the solution to your problems:
cblas.h
In this, amongst many other function declarations, we find:
void cblas_dgemm(enum CBLAS_ORDER Order, enum CBLAS_TRANSPOSE TransA,
enum CBLAS_TRANSPOSE TransB, blasint M, blasint N, blasint K,
double alpha, double *A, blasint lda, double *B, blasint ldb,
double beta, double *C, blasint ldc);
All the function symbols in the header are prefixed with cblas_. Your code should be using:
#include "cblas.h"
You should be calling the functions using the Fortran name (in lower case) prefixed with cblas_:
cblas_dgemm(...);
And the correct link line to use is the first option listed above:
gcc -Wall -o outputprog main.c -lgoto2
At a pinch, you could define macros to map the regular (unprefixed) names to the correct C function names, but I'm not convinced it is worth it:
#define DGEMM cblas_dgemm
or (safer, because it checks the length of the argument list, but more verbose):
#define DGEMM(a,b,c,d,e,f,g,h,i,j,k,l,m,n) cblas_dgemm(a,b,c,d,e,f,g,h,i,j,k,l,m,n)
You can then write:
DGEMM(a, ..., n);
and the correct function would be called.
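The difference between the two macro styles can be tried out without the library. The sketch below uses a hypothetical three-argument function demo_scale_add as a stand-in for cblas_dgemm, so that it compiles on its own; the names SCALE_ADD_ALIAS, SCALE_ADD and aliases_agree are likewise made up for this illustration.

```c
#include <assert.h>

/* Hypothetical prefixed library function standing in for cblas_dgemm(). */
double demo_scale_add(double alpha, double x, double y)
{
    return alpha * x + y;
}

/* Object-like alias: the macro itself does not look at the argument list. */
#define SCALE_ADD_ALIAS demo_scale_add

/* Function-like alias: the preprocessor insists on exactly three arguments. */
#define SCALE_ADD(alpha, x, y) demo_scale_add(alpha, x, y)

/* Both spellings end up calling the same function; returns 1 if they agree. */
int aliases_agree(void)
{
    assert(SCALE_ADD(2.0, 3.0, 1.0) == 7.0);
    return SCALE_ADD(2.0, 3.0, 1.0) == SCALE_ADD_ALIAS(2.0, 3.0, 1.0);
}
```

With the function-like form, a call such as SCALE_ADD(2.0, 3.0) is already rejected during preprocessing. With the object-like form the same mistake is only caught if a prototype for the real function is in scope.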
Experimentation with the partially successful build of GotoBLAS2 mentioned above shows that:
cblas.h is not self-contained (contrary to good coding standards).
common.h must be included before it.
common.h includes a lot of other headers:
config.h
common_x86_64.h
param.h
common_param.h
common_interface.h
common_macro.h
common_s.h
common_d.h
common_q.h
common_c.h
common_z.h
common_x.h
common_level1.h
common_level2.h
common_level3.h
common_lapack.h
The following code stands a chance of linking with a complete library:
#include "common.h"
#include "cblas.h"
void check_dgemm(void)
{
    double A[33] = { 0.0 };
    double B[33] = { 0.0 };
    double C[33] = { 0.0 };
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                3, 3, 3, 2.0, A, 3, B, 3, 3.0, C, 3);
}
int main(void)
{
    check_dgemm();
    return 0;
}
(In my admittedly broken build of the library, the complaints went from being 'cblas_dgemm() not found' to a number of other functions missing. This is a vast improvement!)
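Once the library does link, it can help to have a plain C reference for what a call like the one above is expected to compute, namely C = alpha * A * B + beta * C for row-major, untransposed matrices. The following is only a sketch for cross-checking small cases; ref_dgemm_rowmajor is a hypothetical helper, not part of GotoBLAS2, and it makes no attempt at performance.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reference computation of C = alpha * A * B + beta * C for row-major,
 * untransposed matrices: A is m x k, B is k x n, C is m x n.
 * lda, ldb and ldc are the row strides (elements between consecutive rows).
 */
void ref_dgemm_rowmajor(size_t m, size_t n, size_t k,
                        double alpha, const double *A, size_t lda,
                        const double *B, size_t ldb,
                        double beta, double *C, size_t ldc)
{
    size_t i, j, p;

    /* Each row stride must be at least as long as a row of its matrix. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            double sum = 0.0;
            for (p = 0; p < k; p++)
                sum += A[i * lda + p] * B[p * ldb + j];
            C[i * ldc + j] = alpha * sum + beta * C[i * ldc + j];
        }
    }
}
```

Comparing its output with the library's result on a small matrix is a quick way to catch a wrong leading dimension or a swapped row/column order.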
Ok I was able to find the answer on the GotoBLAS mailing list https://lists.tacc.utexas.edu/mailman/listinfo/gotoblas (which is not listed on the website as far as I can see). Here's a quick step by step on using GotoBLAS2 with C and GCC compiler.
Build GotoBLAS2 libraries (.so and .a), there's good documentation on that included with the libraries so I won't post it here. Include BOTH of these files in the libs directory of your choice as set by -L. I was only including one because I thought they were just different versions of the same library which was not correct.
Also link with -lgfortran if you wish to compile with gcc. -lpthread might also be useful, although I'm not sure; I've seen examples with it, but it compiles without. Your gcc command should look something like this:
gcc -Wall -o outprog -L./GotoLIBSDIR -lgoto2 -lgfortran -lpthread(maybe) main.c
Finally, call function_() instead of function(), so for example, dgemm_() when using gfortran to compile the fortran interfaces.
Alternatively to the fortran interface the cblas interface can be used as cblas_dgemm(). You still need to link to -lgfortran for this as otherwise linking to libgoto2.so will fail, and you need to link to that file to be able to use cblas_dgemm() correctly.
There doesn't appear to be any need to include any of the .h files or anything else.
Hopefully someone else will find this useful. Thanks for all the help!

Undefined Variable error

I'm a newbie to programming in C, and I'm having trouble understanding the error that is coming up when I attempt to compile my program. I've got the following program, which is fairly simple and which my professor says is correct:
#include <stdio.h>
#define TRUE 1
#define FALSE 0
#define BOOL int
extern int x;
extern BOOL do_exp;
int exp_or_log();
main()
{
    x = 10;
    do_exp = TRUE;
    printf("2^%d = %d\n", x, exp_or_log()); //should print 1024
    x = 145;
    do_exp = FALSE;
    printf("log(%d) = %d\n", x, exp_or_log()); //should print 7
}
But when I try to compile it, I get:
"_x", referenced from:
_x$non_lazy_ptr in ccWdLlxk.o
"_exp_or_log", referenced from:
_main in ccWdLlxk.o
_main in ccWdLlxk.o
"_do_exp", referenced from:
_do_exp$non_lazy_ptr in ccWdLlxk.o
ld: symbol(s) not found
I don't even have enough of an idea of what that means to know where to begin trying to figure out the problem. If anyone has a helpful explanation, or even just a general idea of what I should look at to begin problem shooting, I'd really appreciate it.
x, do_exp, and exp_or_log() are all defined in another file, I'm guessing supplied by your professor. You need to link together with that file. This is usually done by adding its filename along with yours on your compile line.
You've declared to the compiler that these variables and functions are available, but not necessarily defined in this particular source file:
extern int x;
extern BOOL do_exp;
int exp_or_log();
And they are not defined in that source file. However, the linker needs to be able to resolve those names, and the error message you're getting indicates that the linker can't find those names in any of its input files.
You need to either provide the linker (ld) with a library that has these things, or you need a C file that defines them, and have that C file also compiled and linked in.
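For illustration, a second source file that satisfies those three declarations might look like the sketch below. The behaviour of exp_or_log is only a guess based on the "should print" comments in the question (2^x when do_exp is set, otherwise the integer part of the base-2 logarithm); the real file from the course may differ.

```c
#include <assert.h>

#define BOOL int

/* Definitions matching the extern declarations in the question. */
int x;
BOOL do_exp;

/*
 * Guess at the intended behaviour: returns 2^x when do_exp is true,
 * otherwise the integer part of log2(x). Expects a small non-negative x
 * in the exp case and x >= 1 in the log case.
 */
int exp_or_log(void)
{
    int result = 0;
    int value;

    if (do_exp) {
        assert(x >= 0 && x < 31);
        return 1 << x;
    }

    assert(x >= 1);
    /* Count how many times x can be halved before reaching 1. */
    for (value = x; value > 1; value /= 2)
        result++;
    return result;
}
```

Compiling this file together with the one containing main() gives the linker a definition for every name it complained about.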
It's not the compiler that is complaining:
ld: symbol(s) not found
The linker (ld) cannot find the referenced symbols. You haven't provided their definitions.
First, note how you used the extern keyword on two variable declarations.
extern int x;
extern BOOL do_exp;
This means:
These variables are created elsewhere (externally). You should be aware that they exist, but they exist somewhere else.
If these variables are intentionally defined in another file, you need to compile that other file as well and link it together with yours.
However, I suspect it is more likely that you just meant to define them.
int x;
BOOL do_exp;
Report back on this, then we'll begin dealing with your function exp_or_log.
I'm guessing you're using a *nix machine from the output, so you would need to:
cc -c Anna_program.c
This should produce Anna_program.o. In your error the gibberish .o file was the same as this one, but was temporary, so it was given a pseudo-random name. The -c flag has the effect of only compiling the source file, and leaves off linking, which produces the executable, for later.
Then you can do:
cc Anna_program.o other_file.o -o Anna_program
And produce the executable Anna_program. If you aren't using a *nix style compiler then your steps will be different and you may need to put an extension on the end of the output file name in the last command.
You could do:
cc Anna_program.c other_file.o -o Anna_program
Which would combine the previous two steps.
What you should remember is that cc (or gcc) aren't actually the compiler, but a simple compilation manager. Under the hood they run other programs which do different steps in building your programs. By default cc will try to take what you give it and produce an executable (a.out), running as many of the steps as needed based on what you have given it. You can pass it flags, such as -c to tell it to only go part way (compiling and assembling, in this case).
The steps for C are Preprocessing (done by the program cpp), compiling (done by cc1), assembling (done by as), and linking (done by ld).
The cc or gcc command decides what needs to be done and then runs these other programs to do it.
You're having linker problems.
See the top of your code:
extern int x;
extern BOOL do_exp;
int exp_or_log();
Those three lines are like promises to the compiler. You're saying, trust me, when the time comes you'll be able to find an integer x, a BOOL do_exp, and a function exp_or_log();
The extern makes this promise for variables and the fact that the function doesn't have a body: {...} makes it for the function.
The linker is complaining because you're not following through on your promise. You need an implementation of exp_or_log(), and to have defined x and do_exp.
Is there more code? If you make another file, call it x.h, with the following content:
int x;
int do_exp;
int exp_or_log() {
    return 6;
}
and then include this in your .c file:
#include "x.h"
You'll get some output. In this case it's nonsensical but it will compile while you fix the logic problems.
$ ./a.out
2^10 = 6
log(145) = 6

how do I always include symbols from a static library?

Suppose I have a static library libx.a. How do I make some symbols (not all) from this library always be present in any binary I link with my library? The reason is that I need these symbols to be available via dlopen+dlsym. I'm aware of the --whole-archive linker switch, but it forces all object files from the library archive to be linked into the resulting binary, and that is not what I want...
Observations so far (CentOS 5.4, 32bit) (upd: this paragraph is wrong; I could not reproduce this behaviour)
ld main.o libx.a
will happily strip all non-referenced symbols, while
ld main.o -L. -lx
will link whole library in. I guess this depends on version of binutils used, however, and newer linkers will be able to cherry-pick individual objects from a static library.
Another question is how can I achieve the same effect under Windows?
Thanks in advance. Any hints will be greatly appreciated.
Imagine you have a project which consists of the following three C files in the same folder;
// ---- jam.h
int jam_badger(int);
// ---- jam.c
#include "jam.h"
int jam_badger(int a)
{
    return a + 1;
}
// ---- main.c
#include "jam.h"
int main()
{
    return jam_badger(2);
}
And you build it with a boost-build bjam file like this;
lib jam : jam.c <link>static ;
lib jam_badger : jam ;
exe demo : jam_badger main.c ;
You will get an error like this.
undefined reference to `jam_badger'
(I have used bjam here because the file is easier to read, but you could use anything you want)
Removing the 'static' produces a working binary, as does adding static to the other library, or just using the one library (rather than the silly wrapping of one inside the other).
The reason this happens is because ld is clever enough to only select the parts of the archive which are actually used, which in this case is none of them.
The solution is to surround the static archives with -Wl,--whole-archive and -Wl,--no-whole-archive, like so;
g++ -o "libjam_candle_badger.so" -Wl,--whole-archive libjam_badger.a -Wl,--no-whole-archive
Not quite sure how to get boost-build to do this for you, but you get the idea.
First things first: ld main.o libx.a does not build a valid executable. In general, you should never use ld to link anything directly; always use proper compiler driver (gcc in this case) instead.
Also, "ld main.o libx.a" and "ld main.o -L. -lx" should be exactly equivalent. I am very doubtful you actually got different results from these two commands.
Now to answer your question: if you want foo, bar and baz to be exported from your a.out, do this:
gcc -Wl,-u,foo,-u,bar,-u,baz main.o -L. -lx -rdynamic
Update:
your statement: "symbols I want to include are used by library internally only" doesn't make much sense: if the symbols are internal to the library, why do you want to export them? And if something else uses them (via dlsym), then they are not internal to the library -- they are part of the library public API.
You should clarify your question and explain what you really are trying to achieve. Providing sample code will not hurt either.
I would start with splitting off those symbols you always need into a separate library, retaining only the optional ones in libx.a.
Take the address of the symbol you need to include.
If gcc's optimiser eliminates it anyway, do something with this address; that should be enough.
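A minimal sketch of that idea follows. It assumes a function libx_probe that would really live in libx.a; here it is defined inline as a stand-in so the example compiles on its own, and the names libx_fn, libx_keep_alive and call_kept_symbol are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a function that would normally be defined in libx.a. */
int libx_probe(int a)
{
    return a + 1;
}

typedef int (*libx_fn)(int);

/*
 * Taking the address creates a reference that the linker has to resolve,
 * which pulls in the archive member defining the symbol. The table has
 * external linkage, so the compiler has to emit it together with that
 * reference.
 */
libx_fn const libx_keep_alive[] = { libx_probe };

/* Calls the function through the table, just to show the entry is live. */
int call_kept_symbol(int a)
{
    libx_fn fn = libx_keep_alive[0];
    assert(fn != NULL);
    return fn(a);
}
```

In the real setup the table would sit in an object file that is always linked, with one entry per symbol to keep. For dlsym to find the symbols in the executable they still have to be exported, for example with -rdynamic.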
