Static function and variable get exported in shared library - c

So far I've assumed that objects with static linkage (i.e. static functions and static variables) in C do not collide with other objects (of static or external linkage) in other compilation units (i.e. .c files) so I've used "short" names for internal helper functions rather than prefixing everything with the library name. Recently a user of my library experienced a crash due to a name collision with an exported function from another shared library. On investigation it turned out that several of my static functions are part of the symbol table of the shared library. Since it happens with several GCC major versions I assume I'm missing something (such a major bug would be noticed and fixed).
I managed to get it down to the following minimum example:
#include <stdbool.h>
#include <stdlib.h>
bool ext_func_a(void *param_a, char const *param_b, void *param_c);
bool ext_func_b(void *param_a);
static bool bool_a, bool_b;
static void parse_bool_var(char *doc, char const *var_name, bool *var)
{
char *var_obj = NULL;
if (!ext_func_a(doc, var_name, &var_obj)) {
return;
}
*var = ext_func_b(var_obj);
}
static void parse_config(void)
{
char *root_obj = getenv("FOO");
parse_bool_var(root_obj, "bool_a", &bool_a);
parse_bool_var(root_obj, "bool_b", &bool_b);
}
void libexample_init(void)
{
parse_config();
}
Both the static variable bool_a and the static function parse_bool_var are visible in the symbol table of the object file and the shared library:
$ gcc -Wall -Wextra -std=c11 -O2 -fPIC -c -o example.o example.c
$ objdump -t example.o|egrep 'parse_bool|bool_a'
0000000000000000 l O .bss 0000000000000001 bool_a
0000000000000000 l F .text 0000000000000050 parse_bool_var
$ gcc -shared -Wl,-soname,libexample.so.1 -o libexample.so.1.1 x.o -fPIC
$ nm libexample.so.1.1 |egrep 'parse_bool|bool_a'
0000000000200b79 b bool_a
0000000000000770 t parse_bool_var
I've dived into C11, Ulrich Drepper's "How to Write Shared Libraries" and a couple of other sources explaining visibility of symbols, but I'm still at a loss. Why are bool_a and parse_bool_var ending up in the dynamic symbol table even though they're declared static?

The lower case letter in the second column of nm output means they're local (if they were upper case, it would be a different story).
Those symbols won't conflict with other symbols of the same name and AFAIK, are basically there only for debugging purposes.
Local symbols won't go into the dynamic symbol table (printable with nm -D but only in shared libraries) either and they are strippable along with exported symbols (upper case letters in the second column of nm output) that aren't dynamic.
(As you will have learned from Drepper's How to Write Shared Libraries, you can control visibility with -fvisibility=(default|hidden) (shouldn't use protected) and with visibility attributes.)

Related

Why does global variable definition in C header file work? [duplicate]

This question already has answers here:
What happens if I define the same variable in each of two .c files without using "extern"?
(3 answers)
Closed 2 years ago.
From what I saw across many many stackoverflow questions among other places, the way to define globals is to define them in exactly one .c file, then declare it as an extern in a header file which then gets included in the required .c files.
However, today I saw in a codebase global variable definition in the header file and I got into arguing, but he insisted it will work. Now, I had no idea why, so I created a small project to test it out real quick:
a.c
#include <stdio.h>
#include "a.h"
int main()
{
p1.x = 5;
p1.x = 4;
com = 6;
change();
printf("p1 = %d, %d\ncom = %d\n", p1.x, p1.y, com);
return 0;
}
b.c
#include "a.h"
void change(void)
{
p1.x = 7;
p1.y = 9;
com = 1;
}
a.h
typedef struct coord{
int x;
int y;
} coord;
coord p1;
int com;
void change(void);
Makefile
all:
gcc -c a.c -o a.o
gcc -c b.c -o b.o
gcc a.o b.o -o run.out
clean:
rm a.o b.o run.out
Output
p1 = 7, 9
com = 1
How is this working? Is this an artifact of the way I've set up the test? Is it that newer gcc has managed to catch this condition? Or is my interpretation of the whole thing completely wrong? Please help...
This relies on so called "common symbols" which are an extension to standard C's notion of tentative definitions (https://port70.net/~nsz/c/c11/n1570.html#6.9.2p2), except most UNIX linkers make it work across translation units too (and many even with shared dynamic libaries)
AFAIK, the feature has existed since pretty much forever and it had something to do with fortran compatibility/similarity.
It works by the compiler placing giving uninitialized (tentative) globals a special "common" category (shown in the nm utility as "C", which stands for "common").
Example of data symbol categories:
#!/bin/sh -eu
(
cat <<EOF
int common_symbol; //C
int zero_init_symbol = 0; //B
int data_init_symbol = 4; //D
const int const_symbol = 4; //R
EOF
) | gcc -xc - -c -o data_symbol_types.o
nm data_symbol_types.o
Output:
0000000000000004 C common_symbol
0000000000000000 R const_symbol
0000000000000000 D data_init_symbol
0000000000000000 B zero_init_symbol
Whenever a linker sees multiple redefinitions for a particular symbol, it usually generates linkers errors.
But when those redefinitions are in the common category, the linker will merge them into one.
Also, if there are N-1 common definitions for a particular symbol and one non-tentative definition (in the R,D, or B category), then all the definitions are merged into the one nontentative definition and also no error is generated.
In other cases you get symbol redefinition errors.
Although common symbols are widely supported, they aren't technically standard C and relying on them is theoretically undefined behavior (even though in practice it often works).
clang and tinycc, as far as I've noticed, do not generate common symbols (there you should get a redefinition error). On gcc, common symbol generation can be disabled with -fno-common.
(Ian Lance Taylor's serios on linkers has more info on common symbols and it also mentions how linkers even allow merging differently sized common symbols, using the largest size for the final object: https://www.airs.com/blog/archives/42 . I believe this weird trick was once used by libc's to some effect)
That program should not compile (well it should compile, but you'll have double definition errors in your linking phase) due to how the variables are defined in your header file.
A header file informs the compiler about external environment it normally cannog guess by itself, as external variables defined in other modules.
As your question deals with this, I'll try to explain the correct way to define a global variable in one module, and how to inform the compiler about it in other modules.
Let's say you have a module A.c with some variable defined in it:
A.c
int I_am_a_global_variable; /* you can even initialize it */
well, normally to make the compiler know when compiling other modules that you have that variable defined elsewhere, you need to say something like (the trick is in the extern keyword used to say that it is not defined here):
B.c
extern int I_am_a_global_variable; /* you cannot initialize it, as it is defined elsewhere */
As this is a property of the module A.c, we can write a A.h file, stating that somewhere else in the program, there's a variable named I_am_a_global_variable of type int, in order to be able to access it.
A.h
extern int I_am_a_global_variable; /* as above, you cannot initialize the variable here */
and, instead of declaring it in B.c, we can include the file A.h in B.c to ensure that the variable is declared as the author of B.c wanted to.
So now B.c is:
B.c
#include "A.h"
void some_function() {
/* ... */
I_am_a_global_variable = /* some complicated expression */;
}
this ensures that if the author of B.c decides to change the type or the declaration of the variable, he can do changing the file A.h and all the files that #include it should be recompiled (you can do this in the Makefile for example)
A.c
#include "A.h" /* extern int I_am_a_global_variable; */
int I_am_a_global_variable = 27;
In order to prevent errors, it is good that A.c also #includes the file A.h, so the declaration
extern int I_am_a_global_variable; /* as above, you cannot initialize the variable here */
and the final definition (that is included in A.c):
int I_am_a_global_variable = 23; /* I have initialized it to a non-default value to show how to do it */
are consistent between them (consider the author changes the type of I_am_a_global_variable to double and forgets to change the declaration in A.h, the compiler will complaint about non-matching declaration and definition, when compiling A.c (which now includes A.h).
Why I say that you will have double definition errors when linking?
Well, if you compile several modules with the statement (result of #includeing the file A.h in several modules) with the statement:
#include "A.h" /* this has an extern int I_am_a_global_variable; that informs the
* compiler that the variable is defined elsewhere, but see below */
int I_am_a_global_variable; /* here is _elsewhere_ :) */
then all those modules will have a global variable I_m_a_global_variable, initialized to 0, because the compiler defined it in every module (you don't say that the variable is defined elsewhere, you are stating it to declare and define it in this compilation unit) and when you link all the modules together you'll end with several definitions of a variable with the same name at several places, and the references from other modules using this variable will don't know which one is to be used.
The compiler doesn't know anything of other compilations for an application when it is compiling module A, so you need some means to tell it what is happening around. The same as you use function prototypes to indicate it that there's a function somewhere that takes some number of arguments of types A, B, C, etc. and returns a value of type Z, you need to tell it that there's a variable defined elsewhere that has type X, so all the accesses you do to it in this module will be compiled correctly.

What's the difference between declaring a function static and not including it in a header? [duplicate]

This question already has answers here:
What is a "static" function in C?
(11 answers)
Closed 2 years ago.
What's the difference between the following scenarios?
// some_file.c
#include "some_file.h" // doesn't declare some_func
int some_func(int i) {
return i * 5;
}
// ...
and
// some_file.c
#include "some_file.h" // doesn't declare some_func
static int some_func(int i) {
return i * 5;
}
// ...
If all static does to functions is restrict their accessibility to their file, then don't both scenarios mean some_func(int i) can only be accessed from some_file.c since in neither scenario is some_func(int i) put in a header file?
A static function is "local" to the .c file where it is declared in. So you can have another function (static or nor) in another .c file without having a name collision.
If you have a non static function in a .c file that is not declared in any header file, you cannot call this function from another .c file, but you also cannot have another function with the same name in another .c file because this would cause a name collision.
Conclusion: all purely local functions (functions that are only used inside a .c function such as local helper functions) should be declared static in order to prevent the pollution of the name space.
Example of correct usage:
file1.c
static void LocalHelper()
{
}
...
file2.c
static void LocalHelper()
{
}
...
Example of semi correct usage
file1.c
static LocalHelper() // function is local to file1.c
{
}
...
file2.c
void LocalHelper() // global functio
{
}
...
file3.c
void Foo()
{
LocalHelper(); // will call LocalHelper from file2.c
}
...
In this case the program will link correctly, even if LocalHelpershould have been static in file2.c
Example of incorrect usage
file1.c
LocalHelper() // global function
{
}
...
file2.c
void LocalHelper() // global function
{
}
...
file3.c
void Foo()
{
LocalHelper(); // which LocalHelper should be called?
}
...
In this last case we hava a nema collition and the program wil not even link.
The difference is that with a non-static function it can still be declared in some other translation unit (header files are irrelevant to this point) and called. A static function is simply not visible from any other translation unit.
It is even legal to declare a function inside another function:
foo.c:
void foo()
{
void bar();
bar();
}
bar.c:
void bar()
{ ... }
What's the difference between declaring a function static and not including it in a header?
Read more about the C programming language (e.g. Modern C), the C11 standard n1570. Read also about linkers and loaders.
On POSIX systems, notably Linux, read about dlopen(3), dlsym(3), etc...
Practically speaking:
If you declare a function static it stays invisible to other translation units (concretely your *.c source files, with the way you compile them: you could -at least in principle, even if it is confusing- compile foo.c twice with GCC on Linux: once as gcc -c -O -DOPTION=1 foo.c -o foo1.o and another time with gcc -c -O -DOPTION=2 -DWITHOUT_MAIN foo.c -o foo2.o and in some cases be able to link both foo1.o and foo2.o object files into a single executable foo using gcc foo1.o foo2.o -o foo)
If you don't declare it static, you could (even if it is poor taste) code in some other translation unit something like:
if (x > 2) {
extern int some_func(int); // extern is here for readability
return some_func(x);
}
In practice, I have the habit of naming all my C functions (including static ones) with a unique name, and I generally have a naming convention related to them (e.g. naming them with a common prefix, like GTK does). This makes debugging the program (with GDB) easier (since GDB has autocompletion function names).
At last, a good optimizing compiler could, and practically would often, inline calls to static functions whose body is known. With a recent GCC, compile your code with gcc -O2 -Wall. If you want to check how inlining happened, look into the produced assembler (using gcc -O2 -S -fverbose-asm).
On Linux, you can obtain using dlsym the address of a function which is not declared static.

Is it possible to remove dead code from a static library?

I would like to remove dead code from a static library by specifying an entry point.
For instance:
lib1.c
int foo() { return 0; }
int bar() { return 0; }
lib2.c
#include "lib1.h"
int entry() {
return foo();
}
new.a (lib1.a + lib2.a)
libtool -static -o new.a lib1.a lib2.a
I would like new.a to not contain int bar() because it is unused in the entry point of lib1.a, and I don't plan on using lib2.a directly.
Is this possible?
If you compile with -ffunction-sections (and possibly -fdata-sections) and link with -Wl,--gc-sections, the unreferenced functions will be removed. This is subtly different from them not being present to begin with (for example, if bar contained references to other functions or data, it could cause the files containing them to be pulled in for consideration, possibly resulting in new global ctors or overriding weak definitions) but close enough for most purposes.
The right way, on the other hand, is not to define functions that can be used independently in the same translation unit (source file). Split them into separate files and this just works automatically with no special options.

Linking against older symbol version in a .so file

Using gcc and ld on x86_64 linux I need to link against a newer version of a library (glibc 2.14) but the executable needs to run on a system with an older version (2.5). Since the only incompatible symbol is memcpy (needing memcpy#GLIBC_2.2.5 but the library providing memcpy#GLIBC_2.14), I would like to tell the linker that instead of taking the default version for memcpy, it should take an old version I specify.
I found a quite arkward way to do it: simply specify a copy of the old .so file at the linker command line. This works fine, but I don't like the idea of having multiple .so files (I could only make it work by specifying all old libraries I link to that also have references to memcpy) checked into the svn and needed by my build system.
So I am searching for a way to tell the linker to take the old versioned symbol.
Alternatives that don't work (well) for me are:
Using asm .symver (as seen on Web Archive of Trevor Pounds' Blog) since this would require me to make sure the symver is before all the code that is using memcpy, which would be very hard (complex codebase with 3rd party code)
Maintaining a build environment with the old libraries; simply because I want to develop on my desktop system and it would be a pita to sync stuff around in our network.
When thinking about all the jobs a linker does, it doesn't seem like a hard thing to imlpement, after all it has some code to figure out the default version of a symbol too.
Any other ideas that are on the same complexity level as a simple linker command line (like creating a simple linker script etc.) are welcome too, as long as they are not weird hacks like editing the resulting binary...
edit:
To conserve this for the future readers, additionally to the below ideas I found the option --wrap to the linker, which might be useful sometimes too.
I found the following working solution. First create file memcpy.c:
#include <string.h>
/* some systems do not have newest memcpy##GLIBC_2.14 - stay with old good one */
asm (".symver memcpy, memcpy#GLIBC_2.2.5");
void *__wrap_memcpy(void *dest, const void *src, size_t n)
{
return memcpy(dest, src, n);
}
No additional CFLAGS needed to compile this file. Then link your program with -Wl,--wrap=memcpy.
Just link memcpy statically - pull memcpy.o out of libc.a ar x /path/to/libc.a memcpy.o (whatever version - memcpy is pretty much a standalone function) and include it in your final link. Note that static linking may complicate licensing issues if your project is distributed to the public and not open-source.
Alternatively, you could simply implement memcpy yourself, though the hand-tuned assembly version in glibc is likely to be more efficient
Note that memcpy#GLIBC_2.2.5 is mapped to memmove (old versions of memcpy consistently copied in a predictable direction, which led to it sometimes being misused when memmove should have been used), and this is the only reason for the version bump - you could simply replace memcpy with memmove in your code for this specific case.
Or you could go to static linking, or you could ensure that all systems on your network have the same or better version than your build machine.
I had a similar issue. A third party library we use needs the old memcpy#GLIBC_2.2.5. My solution is an extended approach #anight posted.
I also warp the memcpy command, but i had to use a slightly different approach, since the solution #anight posted did not work for me.
memcpy_wrap.c:
#include <stddef.h>
#include <string.h>
asm (".symver wrap_memcpy, memcpy#GLIBC_2.2.5");
void *wrap_memcpy(void *dest, const void *src, size_t n) {
return memcpy(dest, src, n);
}
memcpy_wrap.map:
GLIBC_2.2.5 {
memcpy;
};
Build the wrapper:
gcc -c memcpy_wrap.c -o memcpy_wrap.o
Now finally when linking the program add
-Wl,--version-script memcpy_wrap.map
memcpy_wrap.o
so that you will end up with something like:
g++ <some flags> -Wl,--version-script memcpy_wrap.map <some .o files> memcpy_wrap.o <some libs>
I had a similar problem. Trying to install some oracle components on RHEL 7.1, I got this:
$ gcc -o /some/oracle/bin/foo .... -L/some/oracle/lib ...
/some/oracle/lib/libfoo.so: undefined reference to `memcpy#GLIBC_2.14'
It seems that (my) RHEL's glibc only defines memcpy#GLIBC_2.2.5:
$ readelf -Ws /usr/lib/x86_64-redhat-linux6E/lib64/libc_real.so | fgrep memcpy#
367: 000000000001bfe0 16 FUNC GLOBAL DEFAULT 8 memcpy##GLIBC_2.2.5
1166: 0000000000019250 16 FUNC WEAK DEFAULT 8 wmemcpy##GLIBC_2.2.5
So, I managed to get around this, by first creating a memcpy.c file without wrapping, as follows:
#include <string.h>
asm (".symver old_memcpy, memcpy#GLIBC_2.2.5"); // hook old_memcpy as memcpy#2.2.5
void *old_memcpy(void *, const void *, size_t );
void *memcpy(void *dest, const void *src, size_t n) // then export memcpy
{
return old_memcpy(dest, src, n);
}
and a memcpy.map file that exports our memcpy as memcpy#GLIBC_2.14:
GLIBC_2.14 {
memcpy;
};
I then compiled my own memcpy.c into a shared lib like this:
$ gcc -shared -fPIC -c memcpy.c
$ gcc -shared -fPIC -Wl,--version-script memcpy.map -o libmemcpy-2.14.so memcpy.o -lc
, moved libmemcpy-2.14.so into /some/oracle/lib (pointed to by -L arguments in my linking), and linked again by
$ gcc -o /some/oracle/bin/foo .... -L/some/oracle/lib ... /some/oracle/lib/libmemcpy-2.14.so -lfoo ...
(which compiled without errors) and verified it by:
$ ldd /some/oracle/bin/foo
linux-vdso.so.1 => (0x00007fff9f3fe000)
/some/oracle/lib/libmemcpy-2.14.so (0x00007f963a63e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f963a428000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f963a20c000)
librt.so.1 => /lib64/librt.so.1 (0x00007f963a003000)
libc.so.6 => /lib64/libc.so.6 (0x00007f9639c42000)
/lib64/ld-linux-x86-64.so.2 (0x00007f963aa5b000)
This worked for me. I hope it does it for you, too.
I'm clearly a little late responding to this but I recently upgraded (more reasons to never upgrade) my Linux OS to XUbuntu 14.04 which came with the new libc. I compile a shared library on my machine which is used by clients who, for whatever legitimate reasons, have not upgraded their environment from 10.04. The shared library I compiled no longer loaded in their environment because gcc put a dependency on memcpy glibc v. 2.14 (or higher). Let's leave aside the insanity of this. The workaround across my whole project was three fold:
added to my gcc cflags: -include glibc_version_nightmare.h
created the glibc_version_nightmare.h
created a perl script to verify the symbols in the .so
glibc_version_nightmare.h:
#if defined(__GNUC__) && defined(__LP64__) /* only under 64 bit gcc */
#include <features.h> /* for glibc version */
#if defined(__GLIBC__) && (__GLIBC__ == 2) && (__GLIBC_MINOR__ >= 14)
/* force mempcy to be from earlier compatible system */
__asm__(".symver memcpy,memcpy#GLIBC_2.2.5");
#endif
#undef _FEATURES_H /* so gets reloaded if necessary */
#endif
perl script fragment:
...
open SYMS, "nm $flags $libname |";
my $status = 0;
sub complain {
my ($symbol, $verstr) = #_;
print STDERR "ERROR: $libname $symbol requires $verstr\n";
$status = 1;
}
while (<SYMS>) {
next unless /\#\#GLIBC/;
chomp;
my ($symbol, $verstr) = (m/^\s+.\s(.*)\#\#GLIBC_(.*)/);
die "unable to parse version from $libname in $_\n"
unless $verstr;
my #ver = split(/\./, $verstr);
complain $symbol, $verstr
if ($ver[0] > 2 || $ver[1] > 10);
}
close SYMS;
exit $status;
Minimal runnable self contained example
GitHub upstream.
main.c
#include <assert.h>
#include <stdlib.h>
#include "a.h"
#if defined(V1)
__asm__(".symver a,a#LIBA_1");
#elif defined(V2)
__asm__(".symver a,a#LIBA_2");
#endif
int main(void) {
#if defined(V1)
assert(a() == 1);
#else
assert(a() == 2);
#endif
return EXIT_SUCCESS;
}
a.c
#include "a.h"
__asm__(".symver a1,a#LIBA_1");
int a1(void) {
return 1;
}
/* ## means "default version". */
__asm__(".symver a2,a##LIBA_2");
int a2(void) {
return 2;
}
a.h
#ifndef A_H
#define A_H
int a(void);
#endif
a.map
LIBA_1{
global:
a;
local:
*;
};
LIBA_2{
global:
a;
local:
*;
};
Makefile
CC := gcc -pedantic-errors -std=c89 -Wall -Wextra
.PHONY: all clean run
all: main.out main1.out main2.out
run: all
LD_LIBRARY_PATH=. ./main.out
LD_LIBRARY_PATH=. ./main1.out
LD_LIBRARY_PATH=. ./main2.out
main.out: main.c libcirosantilli_a.so
$(CC) -L'.' main.c -o '$#' -lcirosantilli_a
main1.out: main.c libcirosantilli_a.so
$(CC) -DV1 -L'.' main.c -o '$#' -lcirosantilli_a
main2.out: main.c libcirosantilli_a.so
$(CC) -DV2 -L'.' main.c -o '$#' -lcirosantilli_a
a.o: a.c
$(CC) -fPIC -c '$<' -o '$#'
libcirosantilli_a.so: a.o
$(CC) -Wl,--version-script,a.map -L'.' -shared a.o -o '$#'
libcirosantilli_a.o: a.c
$(CC) -fPIC -c '$<' -o '$#'
clean:
rm -rf *.o *.a *.so *.out
Tested on Ubuntu 16.04.
This workaround seem not compatible with -flto compile option.
My solution is calling memmove. memove does exactly the same jobs than memcpy.
The only difference is when src and dest zone overlap, memmove is safe and memcpy is unpredictable. So we can safely always call memmove instead memcpy
#include <string.h>
#ifdef __cplusplus
extern "C" {
#endif
void *__wrap_memcpy(void *dest, const void *src, size_t n)
{
return memmove(dest, src, n);
}
#ifdef __cplusplus
}
#endif
For nim-lang, I elaborated on a solution I found using the C compiler --include= flag as follows:
Create a file symver.h with:
__asm__(".symver fcntl,fcntl#GLIBC_2.4");
Build your program with nim c ---passC:--include=symver.h
As for me I'm cross compiling too. I compile with nim c --cpu:arm --os:linux --passC:--include=symver.h ... and I can get symbol versions using arm-linux-gnueabihf-objdump -T ../arm-libc.so.6 | grep fcntl
I had to remove ~/.cache/nim at some point. And it seems to work.
I think you can get away with making a simple C file containing the symver statement and perhaps a dummy function calling memcpy. Then you just have to ensure that the resulting object file is the first file given to linker.
I suggest you either link memcpy() statically; or find the source of memcpy( ) and compile it as your own library.
It may caused by old ld (gnu link) version.
For following simple problem:
#include <string.h>
#include <stdio.h>
int main(int argc,char **argv)
{
char buf[5];
memset(buf,0,sizeof(buf));
printf("ok\n");
return 0;
}
When I use ld 2.19.1, memset is relocated to: memset##GLIBC_2.0, and cause crash.
After upgraded to 2.25, it is: memset#plt, and crash solved.
We had a similar issue, but instead of one older GLIBC symbol, we have to provide in our .so libs a mix of newer ones with necessary functionality and older ones our libs may be referencing but are not available. This situation occurs because we are shipping to customers high performance codec libs with vectorized math functions and we cannot impose requirements on what version of OS distro, gcc, or glibc they use. As long as their machine has appropriate SSE and AVX extensions, the libs should work. Here is what we did:
Include glibc 2.35 libmvec.so.1 and libm.so.6 files in a separate subfolder. These contain the necessary vectorized math functions. In a "hello codec" application example, we reference these in the link target depending on what distro, gcc, and glibc versions are found by the Makefile. More or less, for anything with glibc v2.35 or higher the high performance libs are referenced, otherwise slower libs are referenced.
To deal with missing symbols -- the subject of this thread -- we used a modification of Ortwin Anermeier's solution, in turn based on anight's solution, but without using the -Wl,--wrap=xxx option.
The .map file looks like:
GLIBC_2.35 {
hypot;
:
: (more function symbols as needed)
};
GLIBC_2.32 {
exp10;
:
: (more function symbols as needed)
};
:
: (more version nodes as needed)
and in a "stublib" .so we have:
#define _GNU_SOURCE
#include <math.h>
asm(".symver hypot_235, hypot#GLIBC_2.35");
asm(".symver exp10_232, exp10f#GLIBC_2.32");
/* ... more as needed */
double hypot_235(double x, double y) { return hypot(x, y); }
double exp10_232(double x) { return exp10(x); }
/* ... more as needed */
-lstublib.so is then included in the app build as the last link item, even after -lm.
This answer and this one also offer clues, but they not handling the general case of a .so flexible enough to be used on a wide variety of systems.

Embedding binary blobs using gcc mingw

I am trying to embed binary blobs into an exe file. I am using mingw gcc.
I make the object file like this:
ld -r -b binary -o binary.o input.txt
I then look objdump output to get the symbols:
objdump -x binary.o
And it gives symbols named:
_binary_input_txt_start
_binary_input_txt_end
_binary_input_txt_size
I then try and access them in my C program:
#include <stdlib.h>
#include <stdio.h>
extern char _binary_input_txt_start[];
int main (int argc, char *argv[])
{
char *p;
p = _binary_input_txt_start;
return 0;
}
Then I compile like this:
gcc -o test.exe test.c binary.o
But I always get:
undefined reference to _binary_input_txt_start
Does anyone know what I am doing wrong?
In your C program remove the leading underscore:
#include <stdlib.h>
#include <stdio.h>
extern char binary_input_txt_start[];
int main (int argc, char *argv[])
{
char *p;
p = binary_input_txt_start;
return 0;
}
C compilers often (always?) seem to prepend an underscore to extern names. I'm not entirely sure why that is - I assume that there's some truth to this wikipedia article's claim that
It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support
But it strikes me that if underscores were prepended to all externs, then you're not really partitioning the namespace very much. Anyway, that's a question for another day, and the fact is that the underscores do get added.
From ld man page:
--leading-underscore
--no-leading-underscore
For most targets default symbol-prefix is an underscore and is defined in target's description. By this option it is possible to disable/enable the default underscore symbol-prefix.
so
ld -r -b binary -o binary.o input.txt --leading-underscore
should be solution.
I tested it in Linux (Ubuntu 10.10).
Resouce file:
input.txt
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 [generates ELF executable, for Linux]
Generates symbol _binary__input_txt_start.
Accepts symbol _binary__input_txt_start (with underline).
i586-mingw32msvc-gcc (GCC) 4.2.1-sjlj (mingw32-2) [generates PE executable, for Windows]
Generates symbol _binary__input_txt_start.
Accepts symbol binary__input_txt_start (without underline).
Apparently this feature is not present in OSX's ld, so you have to do it totally differently with a custom gcc flag that they added, and you can't reference the data directly, but must do some runtime initialization to get the address.
So it might be more portable to make yourself an assembler source file which includes the binary at build time, a la this answer.

Resources