Detecting unresolved symbols in an ELF executable - c

Let's say I have two files:
// shared.c (will be compiled to 'libshared.so')
#include <stdio.h>
int f() { printf("hello\n"); return 0; }
and
// exe.c (will be compiled to 'exe')
#include <stdio.h>
int f();
int main() {
int i;
scanf("%d", &i);
if (i == 5) f();
}
I compile both files as follows:
gcc -shared shared.c -o libshared.so
gcc exe.c -o exe -lshared -L.
When I run exe and type 5, it calls f and then exits. However, if I delete f from shared.c and recompile it, I get a runtime symbol lookup error only if I type 5. Is there a way to check that exe has all its symbols resolved, independent of user input? Preferably without running it.

You can use the ldd -r exe command to list the shared library dependencies. With -r, ldd also performs the relocations and reports any symbols that cannot be resolved.
Here is my output for your example without the f function:
$ LD_LIBRARY_PATH=. ldd -r ./exe
linux-vdso.so.1 (0x00007ffcfa7c3000)
libshared.so => ./libshared.so (0x00007f303a02e000)
libc.so.6 => /lib64/libc.so.6 (0x0000003e26c00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003e26400000)
undefined symbol: f (./exe)
(Don't mind the LD_LIBRARY_PATH=. part. It tells the dynamic linker to look for shared libraries in the current directory.)
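The error depends on the input because function symbols are normally bound lazily, on the first call; ldd -r forces those relocations up front. For a shared object that a program loads itself, the same eager check can be requested through dlopen. A minimal sketch, assuming a glibc-style dlopen (on glibc older than 2.34, link with -ldl); the helper name check_eager_load is made up for illustration:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load `path` with every symbol resolved immediately.
 * Returns 0 on success, -1 if the object is missing or has unresolved symbols. */
int check_eager_load(const char *path)
{
    /* RTLD_NOW asks the dynamic linker to resolve all undefined symbols
     * before dlopen() returns, instead of lazily on first call. */
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "cannot load %s: %s\n", path, dlerror());
        return -1;
    }
    dlclose(handle);
    return 0;
}
```

For the executable itself, setting the environment variable LD_BIND_NOW=1 for a run asks the dynamic linker to resolve all symbols at startup, so a missing f would be reported before main is entered rather than only when 5 is typed.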

@tohava
When you compile the executable and link it against the shared object, ld (the linker) checks whether every referenced symbol is available in the shared objects your executable depends on, and reports an error if any symbol is unresolved.
So I am not sure how you got a runtime error after removing f() from the shared library and rebuilding the executable. (I did the exercise myself and got the linker error.)

Related

System function overloading in nested shared library

Simple scenario
Application uses write function from libc and links to shared library.
Shared library defines write function.
The original write function from libc will be overridden by the version from the shared library.
Nested scenario
Application uses write function from libc.
Shared library 1 does not define its own write but depends on shared library 2.
Shared library 2 defines write function.
No. write function will NOT be replaced with a version from the second shared library.
I want to understand why this is so, and how to make it work with nested shared library dependencies.
Here are the exact code examples:
main.c
#include <unistd.h>
int main() {
write(1,"Hello\n",6);
return 0;
}
shared-lib-1.c
#include <unistd.h>
__attribute__((constructor))
void shared_lib_1_constructor(void) {
char str[] = "shared-lib-1-constructor\n";
write(1, str, sizeof(str));
}
shared-lib-2.c
#include <stdlib.h>
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count) {
exit(1);
return 0;
}
__attribute__((constructor))
void shared_lib_2_constructor(void) {
char str[] = "shared-lib-2-constructor\n";
write(1, str, sizeof(str));
}
build.sh
#!/usr/bin/env sh
set -v
set -e
gcc -g -fPIC -shared shared-lib-2.c -o libshared-lib-2.so
gcc -g -fPIC -shared -Wl,-rpath . -L`pwd` -lshared-lib-2 shared-lib-1.c -o libshared-lib-1.so
gcc -g -L`pwd` -Wl,-rpath . -lshared-lib-1 main.c
Execution of the a.out gives:
[smirnov@localhost tmp]$ ./a.out
shared-lib-2-constructor
shared-lib-1-constructor
Hello
write was not overridden by shared-lib-2. The replacement was completely ignored, even by the code inside shared-lib-2.
How does it work with nested libraries? Moving the write definition into shared-lib-1 does work: it overrides the glibc version and the application exits.
Running application with LD_DEBUG=all shows order of resolving:
36441: Initial object scopes
36441: object=./a.out [0]
36441: scope 0: ./a.out ./libshared-lib-1.so /lib64/libc.so.6 ./libshared-lib-2.so /lib64/ld-linux-x86-64.so.2
...
36441: calling init: ./libshared-lib-2.so
36441:
36441: symbol=write; lookup in file=./a.out [0]
36441: symbol=write; lookup in file=./libshared-lib-1.so [0]
36441: symbol=write; lookup in file=/lib64/libc.so.6 [0]
36441: binding file ./libshared-lib-2.so [0] to /lib64/libc.so.6 [0]: normal symbol `write'
Why is libc placed between shared-lib-1 and shared-lib-2?
It seems that the dynamic linker resolves in the following order:
all dependencies of a.out, but not recursively, only the first level
all subdependencies, but not recursively
and so on...
The only solution I've found is to use
LD_DYNAMIC_WEAK=1 ./a.out
Is there a way to fix the behavior in advance?
write function will NOT be replaced with a version from the second shared library.
This is not what I expected, and it took me a while to figure out why it's happening.
The issue is that libc.so.6 doesn't define write function (if it did, everything would have worked the same whether the dependency is direct or not).
Instead, libc.so.6 defines a versioned write@@GLIBC_2.2.5 symbol. It is that symbol versioning (a GNU extension) which interferes with your desired outcome.
Compare:
gcc main.c ./libshared-lib-1.so && nm -D a.out | grep ' write'
U write@GLIBC_2.2.5
gcc main.c ./libshared-lib-2.so && nm -D a.out | grep ' write'
U write
When performing symbol resolution, the loader looks up symbol version in the search scope (and a.out is at the front). With libshared-lib-1.so, the loader finds that it should be looking for write version GLIBC_2.2.5 (which can only be found in libc.so.6), and ignores the un-versioned definition in libshared-lib-2.so.
Is there a way to fix the behavior in advance?
If you can control the application link line, it's best to link a.out with libshared-lib-2.so:
gcc main.c ./libshared-lib-1.so ./libshared-lib-2.so
./a.out
# no output as expected.
If you can't control the application link line, you should be able to provide a versioned write@GLIBC_2.2.5 using the linker --version-script flag and an appropriate version script. But my initial attempt at doing that failed, and I am not sure this approach can be made to work.

I don't know how to link multiple files

I'm learning how to use header files and I have a problem while linking these 3 files:
f.c:
#include "f.h"
int getfavoritenumber(void)
{
return 3;
}
f.h:
#ifndef _f_H_
#define _f_H_
int getfavoritenumber(void);
#endif
main.c:
#include <stdio.h>
#include "f.h"
int main (void)
{
printf("%d\n", getfavoritenumber());
return 0;
}
Compiling with gcc main.c -o f, I get this error:
Undefined symbols for architecture x86_64:
"_getfavoritenumber", referenced from:
_main in main-7be23f.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
But if I include the f.c file with gcc main.c f.c -o f it works.
So, when compiling, should I include each C file that I used in my project, or am I missing something? Because adding each single file to gcc is very annoying.
When you have multiple source files that together make an executable, they must be each compiled and then linked together. This is done by specifying each source file when the compiler is invoked as you've discovered.
Note also that you can either do the compiling and linking in one step as you've done, or you can separate them as follows:
gcc -c main.c
gcc -c f.c
gcc -o f main.o f.o
Managing multiple source files and their dependencies is where using a makefile comes into play.
You need to understand that there are three stages from C source code to a runnable binary. Each and every source file must go through these stages; header (.h) files are not compiled on their own.
First stage: source to assembly. gcc -S -o source.s source.c
Second stage: assembly to a compiled binary object, which cannot be run directly. gcc -c source.s -o source.o
Third stage: all the compiled objects are merged into a single binary, where one of the input objects contains the entry function int main(). This is the file we can run on the operating system. gcc -o OutputFile source1.o source2.o source3.o ...
After these three stages we can run the program with ./OutputFile. You can skip generating the assembly file and produce the object file directly.
Yes, you must include all the source files of your project in the compilation. You can use automation tools like automake or a Linux shell script to do this job.

-rdynamic for select symbols only?

Scenario:
Executable loads shared object at run time via dlopen.
The shared object references some symbol (a function) that is actually compiled into the main executable.
This works fine if I add -rdynamic to gcc when linking the executable.
-rdynamic exports all non-static symbols of the executable. My shared object only needs a select few.
Question: Is there a way to achieve the effect of -rdynamic, but restricted to the few select symbols that I know are needed by my shared object?
Edit:
At least two people misunderstood the question, so I'll try to clarify:
This question is about exporting a symbol from the main executable.
This question is not about exporting a symbol from a dynamic library.
Here is a minimal example:
func.h, the common header file
#include <stdio.h>
void func(void);
main.c, the main executable code:
#include <dlfcn.h>
#include "func.h"
// this function is later called by plugin
void func(void) {
printf("func\n");
}
int main() {
void * plugin_lib = dlopen("./plugin.so", RTLD_NOW);
printf("dlopen -> %p, error: %s\n", plugin_lib, dlerror());
// find and call function "plugin" in plugin.so
void (*p)(void); // declares p as pointer to function
p = dlsym(plugin_lib, "plugin");
p();
return 0;
}
plugin.c, code for the plugin that is loaded at runtime:
#include "func.h"
void plugin()
{
printf("plugin\n");
func();
}
If I compile with
$ gcc -o main main.c -ldl
$ gcc -shared -fPIC -o plugin.so plugin.c
Then plugin.so cannot be loaded, because it references the symbol func, which cannot be resolved:
$ ./main
dlopen -> (nil), error: ./plugin.so: undefined symbol: func
Segmentation fault (core dumped)
I can convince the main executable to export all its global symbols by compiling with -rdynamic:
$ gcc -rdynamic -o main main.c -ldl
$ ./main
dlopen -> 0x75e030, error: (null)
plugin
func
But this fills the dynamic symbol table unnecessarily with all symbols.
(This dynamic symbol table can be inspected with nm -D main.)
The question is, how can I add only "func" to the dynamic symbol table of the main executable, and not everything.
Unfortunately it's harder to achieve this for executables. You need to generate a list of symbols that you want to export and then add -Wl,--dynamic-list=symfile.txt to LDFLAGS.
Here's an example of how it's done in Clang (and here's the script they use to generate the symbols file).
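For the question's example, the symbol list would hold a single entry. The following is a minimal sketch, assuming GNU ld and the file name symfile.txt used above; the helper run_plugin is hypothetical and also adds the NULL checks whose absence caused the segmentation fault in the question's main.c:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Assumed contents of symfile.txt (GNU ld dynamic-list syntax):
 *     { func; };
 * Assumed link line:
 *     gcc -Wl,--dynamic-list=symfile.txt -o main main.c -ldl
 */

/* The one symbol the plugin needs from the executable. */
void func(void)
{
    printf("func\n");
}

/* Hypothetical helper: load `path` and call its no-argument function `entry`.
 * Returns 0 on success, -1 on any failure, instead of crashing on NULL. */
int run_plugin(const char *path, const char *entry)
{
    void *lib = dlopen(path, RTLD_NOW);
    if (lib == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    /* POSIX-style conversion of dlsym()'s void * into a function pointer. */
    void (*fn)(void);
    *(void **)(&fn) = dlsym(lib, entry);
    if (fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(lib);
        return -1;
    }
    fn();
    dlclose(lib);
    return 0;
}
```

Calling run_plugin("./plugin.so", "plugin") from main would then replace the unchecked dlopen/dlsym sequence; nm -D main can be used to confirm that func is the only function added to the dynamic symbol table.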
You could do it with the visibility attribute of GCC.
Declare the function you need to export with the __attribute__ ((visibility ("default"))) attribute. Then compile your whole library, passing the -fvisibility=hidden argument to GCC.
For full explanation on this, refer to the following GCC documentation page.
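Applied to an executable rather than a library, the visibility approach would presumably still need -rdynamic, because hidden symbols are never exported while default-visibility ones are exported only on request. A minimal sketch under that assumption; the macro name EXPORTED and the function internal_helper are made up for illustration:

```c
#include <stdio.h>

/* Assumed build: gcc -fvisibility=hidden -rdynamic -o main main.c -ldl
 * With hidden as the default, only symbols marked EXPORTED reach the
 * dynamic symbol table, even though -rdynamic is given. */
#define EXPORTED __attribute__((visibility("default")))

/* Needed by the plugin, so explicitly exported. */
EXPORTED void func(void)
{
    printf("func\n");
}

/* Hypothetical internal function: hidden when built as assumed above. */
int internal_helper(int x)
{
    return x + 1;
}
```

The trade-off compared with a dynamic list is that the export decision lives in the source next to each function instead of in a separate file.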

When I dlopen a shared object, its symbol is not overridden by the main program's symbol. Why?

libp2.c
#include <stdio.h>
void pixman()
{
printf("pixman in libp1\n");
}
libc2.c
#include <stdio.h>
void pixman();
void cairo()
{
printf("cairo2\n");
pixman();
}
main.c
#include <stdio.h>
#include <dlfcn.h>
void pixman()
{
printf("pixman in main\n");
}
int main()
{
pixman();
void* handle=NULL;
void (*callfun)();
handle=dlopen("/home/zpeng/test/so_test/libc2.so",RTLD_LAZY);
callfun = (void(*)())dlsym(handle, "cairo");
callfun();
...
}
compile
gcc -c libp2.c -fPIC -olibp2.o
rm libp2.a
ar -rs libp2.a libp2.o
gcc -shared -fPIC libc2.c ./libp2.a -o libc2.so
gcc main.c -ldl -L. -g
the result:
pixman in main
cairo2
pixman: libp2
Why is the last line not "pixman in main"?
I looked at the symbol processing (LD_DEBUG=symbols); it begins with:
21180: symbol=pixman; lookup in file=./a.out
21180: symbol=pixman; lookup in file=/lib64/libdl.so.2
21180: symbol=pixman; lookup in file=/lib64/tls/libc.so.6
21180: symbol=pixman; lookup in file=/lib64/ld-linux-x86-64.so.2
21180: symbol=pixman; lookup in file=/home/zpeng/test/so_test/libc2.so
If I add -lc2 or -rdynamic to the gcc command for main, the output becomes:
pixman in main
cairo2
pixman in main
My questions:
Why is the symbol looked up in a.out without a result, so that the search continues to libc2.so, when -rdynamic and -lc2 are not used?
Why is the last line not "pixman in main"?
That's because shared libraries have their own global offset table or GOT. When you use the cairo function in libc2.so, the pixman function that will be called is the same function that was resolved when compiling the .so file in the first place.
That is:
# creates object file only -- contains first pixman implementation
gcc -c libp2.c -fPIC -olibp2.o
# just turns the object file into an archive
ar -rs libp2.a libp2.o
# creates the .so file -- all symbols in libc2.c are resolved here
# and you passed in the .a file for that purpose. The .a file containing the
# first pixman implementation gets put in libc2.so.
gcc -shared -fPIC libc2.c ./libp2.a -o libc2.so
After this, anyone using libc2.so will get the copy stored in libc2.so. The lookup order you post is for a.out I believe and it's right. It looks for pixman in a.out, then libc2.so, and so on.
Why is the symbol looked up in a.out without a result, so that the search continues to libc2.so, when -rdynamic and -lc2 are not used?
The -rdynamic option adds ALL symbols to the dynamic symbol table, not just the ones the linker thinks are used (-lc2 has the same effect). When you load all those symbols you have a conflict: the pixman function. The main.c implementation is used in this case. As others have pointed out, this will probably generate a warning.
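Whether a given name is reachable through the main program's dynamic symbol tables can also be checked from inside the program. A minimal sketch, assuming a POSIX dlopen/dlsym (on glibc older than 2.34, link with -ldl); the helper name visible_from_main is hypothetical:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: 1 if `name` can be found through the main program's
 * lookup scope, 0 otherwise. */
int visible_from_main(const char *name)
{
    /* A NULL path yields a handle for the main program; dlsym() on it
     * searches the executable and then the objects loaded with it. */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return 0;
    int found = dlsym(self, name) != NULL;
    dlclose(self);
    return found;
}
```

In the question's main.c, visible_from_main("pixman") would be expected to return 0 for the plain build and 1 when linking with -rdynamic, since dlsym only consults dynamic symbol tables.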
You need to compile the sources that get archived into the .a file with -fvisibility=hidden, to indicate that, although they are global functions, they are not meant to be used outside the resulting library but are instead meant to resolve symbols inside the library. That will cause the symbols in the .a file to appear with the qualifier " t " in nm -a instead of " T ", which is used for symbols available to other libraries.
The call is simply bound to a LOCAL symbol.
Since libp2.c does not explicitly mark the function with __attribute__((visibility("default"))), the compiler binds this call to the LOCAL symbol in .symtab instead of going through .dynsym.
appendix1: more about ELF header: readelf -s xxx.lib
appendix2: keyword of ld argument -Bsymbolic-functions

C program linking with shared library without setting LD_LIBRARY_PATH

I was reading Introduction to GCC, and it says that if a library is available as both .a and .so, gcc prefers the shared library. By default the loader searches for shared libraries only in a predefined set of system directories, such as /usr/local/lib and /usr/lib. If the library is not located in one of these directories, it must be added to the load path, or you need to use the -static option to force the use of the .a library. However, I tried the following:
vim hello.c:
#include <gmp.h>
#include <stdio.h>
int main() {
mpz_t x;
mpz_init(x);
return 0;
}
gcc hello.c -I/opt/include -L/opt/lib -lgmp (my GMP library is in /opt)
./a.out
And it runs. The book says it should have the following error:
./a.out: error while loading shared libraries:
libgdbm.so.3: cannot open shared object file:
No such file or directory
(well, the book uses GDBM as example but I used GMP, but this won't matter right?)
However, I did not set LD_LIBRARY_PATH=/opt/lib, and as you can see I did not use -static option either, but a.out still runs.
Can you all tell me why and show me how to get the error described in the book? Yes I want the error so I will understand what I misunderstood.
From your response to my comment:
linux-gate.so.1 => (0xb7746000)
libgmp.so.10 => /usr/lib/i386-linux-gnu/libgmp.so.10 (0xb76c5000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7520000)
/lib/ld-linux.so.2 (0xb7747000)
So, your program is picking up the lib from /usr/lib.
What you can try to do is rename the lib in your /opt/lib, and link against the new name.
mv /opt/lib/libgmp.so /opt/lib/libgmp-test.so
gcc hello.c -I/opt/include -L/opt/lib -lgmp-test
Then try running the program. Also, compare the output of ldd on the new a.out with what you got before.

Resources