how to prevent linker from discarding a function? - c

I have a function in my C code that is being called implicitly, and it is getting dumped by the linker. How can I prevent this phenomenon?
I'm compiling using gcc and the linker flag -gc-sections, and I don't want to exclude the whole file from the flag. I tried using attributes: "used" and "externally_visible" and neither has worked.
void __attribute__((section(".mySec"), nomicromips, used)) func(){
...
}
In the map file I can see that the function was compiled but not linked. Am I using it wrong? Is there any other way to do it?

You are misunderstanding the used attribute
used
This attribute, attached to a function, means that code must be emitted for the function even if it appears that the function is not referenced...
i.e. the compiler must emit the function definition even if the function appears
to be unreferenced. The compiler will never conclude that a function is unreferenced
if it has external linkage. So in this program:
main1.c
static void foo(void){}
int main(void)
{
return 0;
}
compiled with:
$ gcc -c -O1 main1.c
No definition of foo is emitted at all:
$ nm main1.o
0000000000000000 T main
because foo is not referenced in the translation unit, is not external,
and so may be optimised out.
But in this program:
main2.c
static void __attribute__((used)) foo(void){}
int main(void)
{
return 0;
}
__attribute__((used)) compels the compiler to emit the local definition:
$ gcc -c -O1 main2.c
$ nm main2.o
0000000000000000 t foo
0000000000000001 T main
But this does nothing to inhibit the linker from discarding a section
in which foo is defined, in the presence of -gc-sections, even if foo is external, if that section is unused:
main3.c
void foo(void){}
int main(void)
{
return 0;
}
Compile with function-sections:
$ gcc -c -ffunction-sections -O1 main3.c
The global definition of foo is in the object file:
$ nm main3.o
0000000000000000 T foo
0000000000000000 T main
But after linking:
$ gcc -Wl,-gc-sections,-Map=mapfile main3.o
foo is not defined in the program:
$ nm a.out | grep foo; echo Done
Done
And the function-section defining foo was discarded:
mapfile
...
...
Discarded input sections
...
...
.text.foo 0x0000000000000000 0x1 main3.o
...
...
As per Eric Postpischil's comment, to force the linker to retain
an apparently unused function-section you must tell it to assume that the program
references the unused function, with linker option {-u|--undefined} foo:
main4.c
void __attribute__((section(".mySec"))) foo(void){}
int main(void)
{
return 0;
}
If you don't tell it that:
$ gcc -c main4.c
$ gcc -Wl,-gc-sections main4.o
$ nm a.out | grep foo; echo Done
Done
foo is not defined in the program. If you do tell it that:
$ gcc -c main4.c
$ gcc -Wl,-gc-sections,--undefined=foo main4.o
$ nm a.out | grep foo; echo Done
0000000000001191 T foo
Done
it is defined. There's no use for attribute used.
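Newer toolchains offer one more option besides -u. As far as I know, GCC 11 and later (with Binutils 2.36 or later, on ELF targets) accept a retain attribute that marks the function's section so that -gc-sections leaves it alone. A minimal sketch, assuming such a toolchain; the KEEP_IN_LINK macro name is made up for this example, and on older compilers it falls back to plain used, which still needs -u or a linker-script KEEP:

```c
/* KEEP_IN_LINK is a hypothetical helper macro, not part of GCC. */
#if defined(__has_attribute)
#  if __has_attribute(retain)
     /* retain asks the linker not to garbage-collect the section. */
#    define KEEP_IN_LINK __attribute__((used, retain))
#  endif
#endif
#ifndef KEEP_IN_LINK
   /* Fallback for older compilers: the function is emitted, but the
      linker may still discard its section under -gc-sections. */
#  define KEEP_IN_LINK __attribute__((used))
#endif

/* Same shape as the function in the question; nothing calls it directly. */
KEEP_IN_LINK __attribute__((section(".mySec")))
int foo(void)
{
    return 42;
}
```

After linking with -Wl,-gc-sections, checking nm a.out or the map file shows whether your toolchain really kept foo.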

Apart from -u, already mentioned, here are two other ways to keep the symbol using GCC.
Create a reference to it without calling it
This approach does not require messing with linker scripts, which means it will work for hosted programs and libraries using the operating system's default linker script.
However it varies with compiler optimization settings and may not be very portable.
For example, in GCC 7.3.1 with LD 2.31.1, you can keep a function without actually calling it, by calling another function on its address, or branching on a pointer to its address.
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
bool function_exists(void *address) {
return (address != NULL);
}
// Somewhere reachable from main
assert(function_exists(foo));
assert(foo != NULL); // Won't work, GCC optimises out the constant expression
assert(&foo != NULL); // works on GCC 7.3.1 but not GCC 10.2.1
Another way is to create a struct containing function pointers, then you can group them all together and just check the address of the struct. I use this a lot for interrupt handlers.
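The struct variant could look roughly like this. It is only a sketch: the handler names, the counters and the isr_table type are invented for illustration, and whether the reference survives still depends on the compiler version and optimization level, as noted above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interrupt handlers; the names are invented for this sketch. */
int timer_ticks;
int uart_bytes;
void timer_isr(void) { timer_ticks++; }
void uart_isr(void)  { uart_bytes++; }

/* One struct groups every handler that must survive linking. */
struct isr_table {
    void (*timer)(void);
    void (*uart)(void);
};

const struct isr_table isr_handlers = { timer_isr, uart_isr };

/* noinline makes it harder for the compiler to fold the check away;
   it is not a guarantee. */
__attribute__((noinline))
bool table_exists(const void *address)
{
    return address != NULL;
}

/* Call this from somewhere reachable from main. */
bool keep_isr_handlers(void)
{
    return table_exists(&isr_handlers);
}
```

A single reference to isr_handlers then pulls in every handler stored in it, so there is only one place to maintain.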
Modify the linker script to keep the section
If you are developing a hosted program or a library, then it's pretty tricky to change the linker script.
Even if you do, it's not very portable; for example, gcc on OSX does not actually use the GNU linker, since OSX uses the Mach-O format instead of ELF.
Your code already shows a custom section though, so it's possible you are working on an embedded system and can easily modify the linker script.
SECTIONS {
/* ... */
.mySec : {
KEEP(*(.mySec));
}
}

Related

How to combine LTO with symbol versioning

I would like to compile a shared library using both symbol versioning and link-time optimization (LTO). However, as soon as I turn on LTO, some of the exported symbols vanish. Here is a minimal example:
Start by defining two implementations of a function fun:
$ cat fun.c
#include <stdio.h>
int fun1(void);
int fun2(void);
__asm__(".symver fun1,fun@v1");
int fun1() {
printf("fun1 called\n");
return 1;
}
__asm__(".symver fun2,fun@@v2");
int fun2() {
printf("fun2 called\n");
return 2;
}
Create a version script to ensure that only fun is exported:
$ cat versionscript
v1 {
global:
fun;
local:
*;
};
v2 {
global:
fun;
} v1;
First attempt, compile without LTO:
$ gcc -o fun.o -Wall -Wextra -O2 -fPIC -c fun.c
$ gcc -o libfun.so.1 -shared -fPIC -Wl,--version-script,versionscript fun.o
$ nm -D --with-symbol-versions libfun.so.1 | grep fun
00000000000006b0 T fun@@v2
0000000000000690 T fun@v1
..exactly as it should be. But if I compile with LTO:
$ gcc -o fun.o -Wall -Wextra -flto -O2 -fPIC -c fun.c
$ gcc -o libfun.so.1 -flto -shared -fPIC -Wl,--version-script,versionscript fun.o
$ nm -D --with-symbol-versions libfun.so.1 | grep fun
..no symbols exported anymore.
What am I doing wrong?
WHOPR Driver Design gives some strong hints to what is going on. The function definitions fun1 and fun2 are not exported according to the version script. The LTO plugin is able to use this information, and since GCC does not peek into the asm directives, it knows nothing about the .symver directive, and therefore removes the function definition.
For now, adding __attribute__ ((externally_visible)) is the workaround for this. You also need to build with -flto-partition=none, so that the .symver directives do not land by accident in a different intermediate assembler file than the function definition (where it will not have the desired effect).
GCC PR 48200 tracks an enhancement request for symbol versioning at the compiler level, which would likely address this issue as well.
It looks like my externally_visible fix works. This is:
#define DLLEXPORT __attribute__((visibility("default"),externally_visible))
DLLEXPORT int fun1(void);
Also see: https://gcc.gnu.org/onlinedocs/gccint/WHOPR.html
But I think your versionscript is wrong.
If I take out the visibility overrides and change your versionscript by adding fun1 and fun2 then it works. Like:
v1 {
global:
fun; fun1;
local:
*;
};
v2 {
global:
fun; fun2;
} v1;
The symbol alias targets have to be visible as well as the alias.
I just hit the same problem, so thank you for asking this. However, I've found it cleaner to use __attribute__((used)). Since gcc does not scan top-level assembler, it can't figure out that fun1 and fun2 are being used, so it removes them. So it looks to me that changing the definition to:
__asm__(".symver fun1,fun@v1");
int __attribute__((used)) fun1() {
printf("fun1 called\n");
return 1;
}
should be sufficient.
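For anyone who wants both attributes from this thread in one place, here is a rough sketch. KEEP_FOR_SYMVER is a made-up macro name, and the .symver directives and the version script from the question are left out so that the snippet builds on its own; it only shows how the attributes attach to the definitions.

```c
#include <stdio.h>

/* KEEP_FOR_SYMVER is a hypothetical helper macro. It uses
   externally_visible only when the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(externally_visible)
#    define KEEP_FOR_SYMVER __attribute__((used, externally_visible))
#  endif
#endif
#ifndef KEEP_FOR_SYMVER
#  define KEEP_FOR_SYMVER __attribute__((used))
#endif

/* In the real library each definition is preceded by its .symver line. */
KEEP_FOR_SYMVER int fun1(void)
{
    printf("fun1 called\n");
    return 1;
}

KEEP_FOR_SYMVER int fun2(void)
{
    printf("fun2 called\n");
    return 2;
}
```

Whether used alone is enough under LTO may depend on the GCC version, so it is worth checking the result with nm -D as in the question.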

Can you compile a shared object to prefer local symbols even if it's being loaded by a program compiled with -rdynamic?

I am building a shared library in C that is dynamically loaded by a program that I do not have source access to. The target platform is a 64-bit Linux platform and we're using gcc to build. I was able to build a reproduction of the issue in ~100 lines, but it's still a bit to read. Hopefully it's illustrative.
The core issue is I have two non-static functions (bar and baz) defined in my shared library. Both need to be non-static as we expect the caller to be able to dlsym them. Additionally, baz calls bar. The program that is using my library also has a function named bar, which wouldn't normally be an issue, but the calling program is compiled with -rdynamic, as it has a function foo that needs to be called in my shared library. The result is my shared library ends up linked to the calling program's version of bar at runtime, producing unintuitive results.
In an ideal world I would be able to include some command line switch when compiling my shared library that would prevent this from happening.
The current solution I have is to rename my non-static functions as funname_local and declare them static. I then define a new function:
funname() { return funname_local(); }, and change any references to funname in my shared library to funname_local. This works, but it feels cumbersome, and I'd much prefer to just tell the linker to prefer symbols defined in the local compilation unit.
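To make that workaround concrete, it could look roughly like this for bar and baz. The return values are added only so the behaviour can be checked; the functions in the actual library return void.

```c
#include <stdio.h>

/* The real implementation is static, so no symbol from the main program
   can be bound in its place. */
static int bar_local(void)
{
    printf("bar (shared.so)\n");
    return 1;
}

/* Exported wrapper so the caller can still dlsym "bar". */
int bar(void)
{
    return bar_local();
}

/* Code inside the library calls bar_local, never bar. */
int baz(void)
{
    printf("baz (shared.so)\n");
    return bar_local() + 1;
}
```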
internal.c
#include <stdio.h>
#include "internal.h"
void
bar(void)
{
printf("I should only be callable from the main program\n");
}
internal.h
#if !defined(__INTERNAL__)
#define __INTERNAL__
void
bar(void);
#endif /* defined(__INTERNAL__) */
main.c
#include <dlfcn.h>
#include <stdio.h>
#include "internal.h"
void
foo(void)
{
printf("It's important that I am callable from both main and from any .so "
"that we dlopen, that's why we compile with -rdynamic\n");
}
int
main()
{
void *handle;
void (*fun1)(void);
void (*fun2)(void);
char *error;
if(NULL == (handle = dlopen("./shared.so", RTLD_NOW))) { /* Open library */
fprintf(stderr, "dlopen: %s\n", dlerror());
return 1;
}
dlerror(); /* Clear any existing error */
*(void **)(&fun1) = dlsym(handle, "baz"); /* Get function pointer */
if(NULL != (error = dlerror())) {
fprintf(stderr, "dlsym: %s\n", error);
dlclose(handle);
return 1;
}
*(void **)(&fun2) = dlsym(handle, "bar"); /* Get function pointer */
if(NULL != (error = dlerror())) {
fprintf(stderr, "dlsym: %s\n", error);
dlclose(handle);
return 1;
}
printf("main:\n");
foo();
bar();
fun1();
fun2();
dlclose(handle);
return 0;
}
main.h
#if !defined(__MAIN__)
#define __MAIN__
extern void
foo(void);
#endif /* defined(__MAIN__) */
shared.c
#include <stdio.h>
#include "main.h"
void
bar(void)
{
printf("bar:\n");
printf("It's important that I'm callable from a program that loads shared.so"
" as well as from other functions in shared.so\n");
}
void
baz(void)
{
printf("baz:\n");
foo();
bar();
return;
}
compile:
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -o main main.c internal.c -l dl -rdynamic
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -shared -fPIC -o shared.so shared.c
run:
$ ./main
main:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
baz:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
Have you tried the -Bsymbolic linker option (or -Bsymbolic-functions)? Quoting from the ld man page:
-Bsymbolic
When creating a shared library, bind references to global symbols to the definition within the shared library, if any. Normally, it is possible for a program linked against a shared library to override the definition within the shared library. This option can also be used with the --export-dynamic option, when creating a position independent executable, to bind references to global symbols to the definition within the executable. This option is only meaningful on ELF platforms which support shared libraries and position independent executables.
It seems to solve the problem:
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -shared -fPIC -o shared.so shared.c
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -o main main.c internal.c -l dl -rdynamic
$ ./main
main:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
baz:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
$ gcc -m64 -std=c89 -Wall -Wextra -Werror -pedantic -shared -fPIC -Wl,-Bsymbolic -o shared.so shared.c
$ ./main
main:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
I should only be callable from the main program
baz:
It's important that I am callable from both main and from any .so that we dlopen, that's why we compile with -rdynamic
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
bar:
It's important that I'm callable from a program that loads shared.so as well as from other functions in shared.so
A common solution for this problem is to not actually depend on a global symbol not being overridden. Instead do the following:
Rename the function bar in your library to mylib_bar or something like that
Hide mylib_bar with __attribute__((visibility("hidden"))) or similar
Make bar a weak symbol, referring to mylib_bar like this:
#pragma weak bar = mylib_bar
Make your library call mylib_bar everywhere instead of bar
Now everything works as expected:
When your library calls mylib_bar, this always refers to the definition in the library as the visibility is hidden.
When other code calls bar, this calls mylib_bar by default.
If someone defines his own bar, this overrides bar but not mylib_bar, leaving your library untouched.
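Put together, the steps above might look like the sketch below. It uses the attribute spelling of the weak alias instead of the pragma, assumes GCC or Clang on an ELF target, and returns int only so the result can be checked.

```c
#include <stdio.h>

/* Internal implementation: hidden visibility keeps it out of the dynamic
   symbol table, so calls from inside the library always land here. */
__attribute__((visibility("hidden")))
int mylib_bar(void)
{
    printf("bar (library version)\n");
    return 42;
}

/* Public name: a weak alias for mylib_bar. Another definition of bar can
   take precedence over this one without affecting mylib_bar. */
int bar(void) __attribute__((weak, alias("mylib_bar")));

/* Library code calls mylib_bar, never bar. */
int baz(void)
{
    return mylib_bar();
}
```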

gcc does not include extern variables in symbol table when optimisation is used

My program contains many externally defined variables. When I compile it with -O0 flag, I see them in the symbol table, but not when I use -O1 or -O2. How can I force the compiler to export them?
foo.c:
extern const int my_symbol;
void my_fn()
{
void *x = &my_symbol;
// but x is not used, that's probably why it is optimised out
}
nm foo.o (with O0):
U my_symbol
nm foo.o (with O2):
<my_symbol absent>
If your foo.c only (essentially) has
extern const int my_symbol;
then compile it with -O1 or -O2, that symbol will be optimized out. However, if you use that symbol in foo.c, for example
extern const int my_symbol;
extern int my_flag;
void foo(void)
{
if (my_symbol)
my_flag = 1;
}
All of those symbols will exist in foo.o even if you compile it with -O1 or -O2.
With -O1 and -O2, the compiler tries to reduce code size and execution time. One of the optimizations used for that is to 'throw everything overboard' that is not required for execution. In your example x is never used, so the compiler drops it along with the only reference to my_symbol, and an object file's symbol table lists an undefined symbol only when something still refers to it.
(-O0 means "do no optimizations".)
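If you need the reference to stay in the object file without really using the value, one option with GCC is a pointer marked used. This is a sketch only: keep_my_symbol and read_my_symbol are invented names, and the definition at the bottom is a stand-in so the snippet links by itself (in the question my_symbol lives in another file).

```c
/* Declared as in the question. */
extern const int my_symbol;

/* The used attribute forces the compiler to emit this pointer even when
   optimizing, and its initialiser is a reference to my_symbol. */
static const int *const keep_my_symbol __attribute__((used)) = &my_symbol;

/* Small accessor so the kept reference can be exercised. */
int read_my_symbol(void)
{
    return *keep_my_symbol;
}

/* Stand-in definition for this sketch; drop it when my_symbol is
   defined elsewhere. */
const int my_symbol = 7;
```

Without the stand-in definition, nm foo.o should still list my_symbol as undefined at -O2, because the pointer keeps a relocation against it.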

How do I link object files in C? Fails with "Undefined symbols for architecture x86_64"

So I'm trying to use a function defined in another C file (file1.c) in my file (file2.c). I'm including the header of file1 (file1.h) in order to do this.
However, I keep getting the following error whenever I try to compile my file using gcc:
Undefined symbols for architecture x86_64:
"_init_filenames", referenced from:
_run_worker in cc8hoqCM.o
"_read_list", referenced from:
_run_worker in cc8hoqCM.o
ld: symbol(s) not found for architecture x86_64
I've been told I need to "link the object files together" in order to use the functions from file1 in file2, but I have no clue what that means :(
I assume you are using gcc, to simply link object files do:
$ gcc -o output file1.o file2.o
To get the object-files simply compile using
$ gcc -c file1.c
this yields file1.o and so on.
If you want to compile and link your source files into an executable in one step, do
$ gcc -o output file1.c file2.c
The existing answers already cover the "how", but I just wanted to elaborate on the "what" and "why" for others who might be wondering.
What a compiler (gcc) does: The term "compile" is a bit of an overloaded term because it is used at a high-level to mean "convert source code to a program", but more technically means to "convert source code to object code". A compiler like gcc actually performs two related, but arguably distinct functions to turn your source code into a program: compiling (as in the latter definition of turning source to object code) and linking (the process of combining the necessary object code files together into one complete executable).
The original error that you saw is technically a "linking error", and is thrown by "ld", the linker. Unlike (strict) compile-time errors, there is no reference to source code lines, as the linker is already in object space.
By default, when gcc is given source code as input, it attempts to compile each and then link them all together. As noted in the other responses, it's possible to use flags to instruct gcc to just compile first, then use the object files later to link in a separate step. This two-step process may seem unnecessary (and probably is for very small programs) but it is very important when managing a very large program, where compiling the entire project each time you make a small change would waste a considerable amount of time.
You could compile and link in one command:
gcc file1.c file2.c -o myprogram
And run with:
./myprogram
But to answer the question as asked, simply pass the object files to gcc:
gcc file1.o file2.o -o myprogram
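To see why the header alone is not enough, here is a sketch that squeezes the three pieces into one listing. The function names are hypothetical; in a real project each part sits in the file named in its comment.

```c
/* --- file1.h: a declaration only. It tells the compiler the function
       exists, but provides no code. --- */
int helper_double(int x);

/* --- file2.c: includes file1.h and calls the function. This compiles
       to an object file with an undefined reference to helper_double. --- */
int worker(int x)
{
    return helper_double(x) + 1;
}

/* --- file1.c: the definition. The linker needs the object file built
       from this to resolve the reference above. --- */
int helper_double(int x)
{
    return 2 * x;
}
```

Leaving the object file for file1.c out of the link is what produces the "Undefined symbols" error from the question.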
Put foo1.c, foo2.c, foo3.c and the makefile in one folder,
then type make in bash.
if you do not want to use the makefile, you can run the command
gcc -c foo1.c foo2.c foo3.c
then
gcc -o output foo1.o foo2.o foo3.o
foo1.c
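#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* funk2 to funk5 are defined in foo2.c and foo3.c */
void funk1(void);
void funk2(void);
void funk3(void);
void funk4(void);
void funk5(void);
void funk1(void) {
printf ("\nfunk1\n");
}
int main(void) {
char *arg2 = NULL; /* getline allocates the buffer when this is NULL */
size_t nbytes = 0;
while ( 1 ) {
printf ("\n:> ");
if( getline (&arg2 , &nbytes , stdin) == -1 ) {
break; /* end of input or read error */
}
printf ("\nargv2 = %s\n" , arg2);
if( strcmp (arg2 , "1\n") == 0 ) {
funk1 ();
} else if( strcmp (arg2 , "2\n") == 0 ) {
funk2 ();
} else if( strcmp (arg2 , "3\n") == 0 ) {
funk3 ();
} else if( strcmp (arg2 , "4\n") == 0 ) {
funk4 ();
} else {
funk5 ();
}
}
free (arg2);
return 0;
}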
foo2.c
#include <stdio.h>
void funk2(){
printf("\nfunk2\n");
}
void funk3(){
printf("\nfunk3\n");
}
foo3.c
#include <stdio.h>
void funk4(){
printf("\nfunk4\n");
}
void funk5(){
printf("\nfunk5\n");
}
makefile
outputTest: foo1.o foo2.o foo3.o
	gcc -o output foo1.o foo2.o foo3.o
	make removeO
outputTest.o: foo1.c foo2.c foo3.c
	gcc -c foo1.c foo2.c foo3.c
clean:
	rm -f *.o output
removeO:
	rm -f *.o
Since there's no mention of how to compile a .c file together with a bunch of .o files, and this comment asks for it:
where's the main.c in this answer? :/ if file1.c is the main, how do
you link it with other already compiled .o files? – Tom Brito Oct 12
'14 at 19:45
$ gcc main.c lib_obj1.o lib_obj2.o lib_objN.o -o x0rbin
Here, main.c is the C file with the main() function and the object files (*.o) are precompiled. GCC knows how to handle these together, and invokes the linker accordingly and results in a final executable, which in our case is x0rbin.
You will be able to call functions that are not defined in main.c by declaring them (an extern declaration, usually via a header); their definitions come from the object files (*.o).
You can also link with .obj or other extensions if the object files have the correct format (such as COFF).

Error when compiling with GCC

Every time I compile I get the following error message:
Undefined reference to ( function name )
Let's say I have three files: Main.c, printhello.h, printhello.c. Main.c calls function print_hello(), which returns "Hello World". The function is defined in printhello.c.
Now, here's the following code of printhello.h:
#ifndef PRINTHELLO_H
#define PRINTHELLO_H
void print_hello();
#endif
I am sure this code is fine. I still don't know why is it giving me the error, though. Can you help me?
Undefined references are linker errors. Are you compiling and linking all the source files? Since Main.c calls print_hello(), the linker must see its definition.
gcc Main.c printhello.c -o a.out
The error is, I think, a linker error rather than a compiler error; it is trying to tell you that you've not provided all the functions that are needed to make a complete program.
You need to compile the program like this:
gcc -o printhello Main.c printhello.c
This assumes that your file Main.c is something like:
#include "printhello.h"
int main(void)
{
print_hello();
return 0;
}
and that your file printhello.c is something like:
#include "printhello.h"
#include <stdio.h>
void print_hello(void)
{
puts("Hello World");
}
Your declaration in printhello.h should be:
void print_hello(void);
This explicitly says that the function takes no parameters. The declaration with the empty brackets means "there is a function print_hello() which returns no value and takes an indeterminate (but not variadic) list of arguments", which is quite different. In particular, you could call print_hello() with any number of arguments and the compiler could not reject the program.
Note that C++ treats the empty argument list the same as void print_hello(void); (so it would ensure that calls to print_hello() include no arguments), but C++ is not the same as C.
Another way to do it is to explicitly build object files for the printhello:
gcc -c printhello.c -o printhello.o
gcc -o Main Main.c printhello.o
This has the added benefit of allowing other programs to use the print_hello function.
It seems that the error is from the linker and not the compiler. You need to compile and link both the source files. I think what you are doing is simply including the header file in Main.c and you are not compiling the printhello.c
You need to :
gcc Main.c printhello.c -o myprog
or
construct the object files first
gcc -c printhello.c
gcc -c Main.c
then link them
gcc Main.o printhello.o

Resources