Why does gcc/clang know to link to libc by default?

When I run clang/gcc to compile a .c file, I don't need to explicitly link to libc. But it still works as libc and two additional libraries are automatically linked. Why does gcc/clang know to link automatically? Where is this behavior mentioned?
$ cat main.c
/* vim: set noexpandtab tabstop=2: */
#include <stdio.h>
int main() {
puts("Hello World!");
return 0;
}
$ clang -o main.exe main.c # or gcc
$ ./main.exe
Hello World!
$ nm -D /lib/x86_64-linux-gnu/libc-2.27.so | grep -w puts
00000000000809c0 W puts
$ ldd main.exe
linux-vdso.so.1 (0x00007ffe743ba000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f397ce7b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f397d26c000)

Why does gcc/clang know to link automatically?
The GCC developers built this into GCC as a convenience. Which libraries are linked by default is partly affected by the language being compiled, which is deduced from the file names and may be controlled with the -x switch.
Where is this behavior mentioned?
This page in the GCC documentation mentions there are some libraries linked in by default and says you can disable or modify this behavior with -nostdlib and other switches, but I do not see an explicit list of the libraries that are linked in by default. It might vary by system/platform as well as by language. You can use the -v switch to ask GCC to show you the commands it is executing, and the link command (using ld) should reveal the libraries.
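As a rough illustration of reading that link command, the sketch below pulls the -l options out of a linker command line. The sample string is an abridged, made-up command in the style of what gcc -v prints, not output captured from a real system, and default_libraries is a hypothetical helper name.

```python
import shlex

def default_libraries(link_command):
    """Return the library names given as -lNAME, in command-line order."""
    libs = []
    for arg in shlex.split(link_command):
        # Only plain -lNAME options count; -L (search dir) and
        # -plugin-opt=...=-lgcc do not start with "-l".
        if arg.startswith("-l") and len(arg) > 2:
            libs.append(arg[2:])
    return libs

# Abridged, hypothetical link command modelled on gcc -v output.
sample = ("collect2 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o main.exe "
          "crt1.o crti.o crtbegin.o main.o "
          "-lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc crtend.o crtn.o")

print(default_libraries(sample))  # ['gcc', 'gcc_s', 'c', 'gcc']
```

On a real system you would paste the collect2 or ld line from gcc -v in place of the sample string; the exact set of libraries will likely differ by platform and language.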

Related

Shared library not found when compiling a C program

So, I have a simple program which looks like so:
#include <amqp.h>
#include <amqp_framing.h>
int main(int argc, char const * const *argv) {
amqp_connection_state_t conn;
conn = amqp_new_connection();
amqp_destroy_connection(conn);
return 0;
}
This program depends on rabbitmq-c library. I compiled it with no errors. So, when I run
$ ls /rabbitmq-c/_install/include/
I get all its header files, that I need:
amqp.h
amqp_framing.h
amqp_tcp_socket.h
And when I run
$ ls /rabbitmq-c/_build/librabbitmq/
I see all needed ".so" files:
CMakeFiles
Makefile
cmake_install.cmake
config.h
librabbitmq.a
librabbitmq.so
librabbitmq.so.4
librabbitmq.so.4.4.1
And finally I compile my own program like so:
$ gcc -I/rabbitmq-c/_install/include/ -g -Wall -c main.c
$ gcc -L/rabbitmq-c/_build/librabbitmq/ -g -Wall -o rabbit main.o -lrabbitmq
It compiles with no errors. However, when I do:
$ ldd ./rabbit
I get this message:
librabbitmq.so.4 => not found
So, what am I missing and how can I fix it?
When you link a shared library into an executable, the linker records the library name (in this case librabbitmq.so.4) in the executable. It is the job of the dynamic linker (ld.so) to locate the libraries and combine them for execution.
To locate the libraries, the dynamic linker constructs a search path (similar to PATH). This includes:
1. LD_LIBRARY_PATH
2. Hard-coded directories added to the executable.
3. Default folders (/lib, /usr/lib, etc.).
In the above case, it looks like neither #1 nor #2 was used, and the library is not in a default location. This can be fixed using #1 or #2:
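# Option 1.
# The dynamic linker (and therefore ldd) consults LD_LIBRARY_PATH at run time.
# The link step still needs -L to find the library for -lrabbitmq.
export LD_LIBRARY_PATH=/rabbitmq-c/_build/librabbitmq
gcc -L/rabbitmq-c/_build/librabbitmq/ -g -Wall -o rabbit main.o -lrabbitmq
ldd ./rabbit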
# Option #2
# Inject SO directory into the executable with -Wl,-rpath,...
gcc -L/rabbitmq-c/_build/librabbitmq/ -Wl,-rpath,/rabbitmq-c/_build/librabbitmq/ -g -Wall -o rabbit main.o -lrabbitmq
ldd ./rabbit
Consult man ld.so for the full details.
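To make the lookup concrete, here is a deliberately simplified model of the three-step search described in this answer. It is a sketch, not a reimplementation of ld.so: the real dynamic linker also consults the ld.so cache and distinguishes DT_RPATH from DT_RUNPATH (see man ld.so). The function name and its parameters are invented for illustration.

```python
import os

def find_shared_object(soname, ld_library_path="", rpath=(),
                       default_dirs=("/lib", "/usr/lib"),
                       exists=os.path.exists):
    """Return the first path where soname is found, or None ("not found")."""
    # Search order follows the answer's list: environment variable,
    # directories recorded in the executable, then the defaults.
    candidates = [d for d in ld_library_path.split(":") if d]
    candidates += list(rpath)
    candidates += list(default_dirs)
    for directory in candidates:
        path = os.path.join(directory, soname)
        if exists(path):
            return path
    return None

# Pretend filesystem containing only the library from the question.
fake_fs = {"/rabbitmq-c/_build/librabbitmq/librabbitmq.so.4"}

# No LD_LIBRARY_PATH, no rpath: the lookup fails, as ldd reported.
print(find_shared_object("librabbitmq.so.4", exists=fake_fs.__contains__))
# With an rpath entry (option #2) the library is located.
print(find_shared_object("librabbitmq.so.4",
                         rpath=["/rabbitmq-c/_build/librabbitmq/"],
                         exists=fake_fs.__contains__))
```

The exists parameter is only there so the model can be exercised against a pretend filesystem; leave it at its default to probe real directories.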
From personal experience, when dealing with 'one-off' libraries it is better to use the rpath (#2) approach. Adding many locations to LD_LIBRARY_PATH can easily result in a long, hard-to-manage LD_LIBRARY_PATH. Using LD_LIBRARY_PATH works best when a wrapper script is created to launch the program:
File: rabbit-run (same folder as executable)
#!/bin/sh
# Prepend rabbitmq SO location to current LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/rabbitmq-c/_build/librabbitmq${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
# Execute the binary, from the same location of the launcher
exec "${0%/*}/rabbit" "$@"
If your binary can't find "librabbitmq.so.4", the shared object is not being found by ld.so (the dynamic linker).
As a first step, run ldconfig. Does this solve your problem?
If yes, you are done.
If not, you have to tell ldconfig where to look for "librabbitmq.so.4".
Either move it into a known directory (for example one listed in LD_LIBRARY_PATH) or add its directory to the ld.so configuration:
echo '/rabbitmq-c/_build/librabbitmq' > '/etc/ld.so.conf.d/name_this_file_yourself.conf'
ldconfig
This should fix your issue.

a linker issue when learning static library [duplicate]

When I try to build the following program:
#include <stdio.h>
int main(void)
{
printf("hello world\n");
return 0;
}
On OS X 10.6.4, with the following flags:
gcc -static -o blah blah.c
It returns this:
ld: library not found for -lcrt0.o
collect2: ld returned 1 exit status
Has anyone else encountered this, or is it something that no one else has been affected by yet? Any fixes?
Thanks
This won’t work. From the man page for gcc:
This option will not work on Mac OS X unless all libraries (including libgcc.a) have also been compiled with -static. Since neither a static version of libSystem.dylib nor crt0.o are provided, this option is not useful to most people.
Per Nate's answer, a completely static application is apparently not possible - see also man ld:
-static Produces a mach-o file that does not use the dyld. Only used building the kernel.
The problem in linking with static libraries is that, if both a static and a dynamic version of a library are found in the same directory, the dynamic version will be taken in preference. Three ways of avoiding this are:
Do not attempt to find them via the -L and -l options; instead, specify the full paths, to the libraries you want to use, on the compiler or linker command line.
$ g++ -Wall -Werror -o hi /usr/local/lib/libboost_unit_test_framework.a hi.cpp
Create a separate directory, containing symbolic links to the static libraries, use the -L option to have this directory searched first, and use the -l option to specify the libraries you want to use.
$ g++ -Wall -Werror -L ./staticBoostLib -l boost_unit_test_framework -o hi hi.cpp
Instead of creating a link of the same name in a different directory, create a link of a different name in the same directory, and specify that name in a -l argument.
$ g++ -Wall -Werror -l boost_unit_test_framework_static -o hi hi.cpp
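The preference rule behind these three workarounds can be sketched as a small model. It assumes the per-directory behaviour described above (shared library first, then static archive, one directory at a time) and uses ".so" as the shared suffix; on macOS the suffix would be ".dylib". resolve_dash_l and the file names are illustrative only.

```python
def resolve_dash_l(name, search_dirs, files, shared_suffix=".so"):
    """Model how -lNAME picks a file: per directory, shared before static."""
    for directory in search_dirs:
        for suffix in (shared_suffix, ".a"):
            candidate = f"{directory}/lib{name}{suffix}"
            if candidate in files:
                return candidate
    return None

# Pretend filesystem: both variants installed side by side, plus a
# separate directory that holds only (a link to) the static archive.
files = {
    "/usr/local/lib/libboost_unit_test_framework.so",
    "/usr/local/lib/libboost_unit_test_framework.a",
    "./staticBoostLib/libboost_unit_test_framework.a",
}

# Default search: the shared library wins.
print(resolve_dash_l("boost_unit_test_framework", ["/usr/local/lib"], files))
# Second workaround: a static-only directory searched first.
print(resolve_dash_l("boost_unit_test_framework",
                     ["./staticBoostLib", "/usr/local/lib"], files))
```

The first call returns the .so path and the second returns the archive in ./staticBoostLib, which is why putting a static-only directory first on the -L list changes what gets linked.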
You may also try the LLVM LLD linker. I made prebuilt versions for my two major OSes: https://github.com/VerKnowSys/Sofin-llds
It allows me to link, for example, Qemu properly, which is impossible with the ld preinstalled by Apple.
The last option is to build GCC yourself with libstdc++ (don't).

How to make a linux shared object (library) runnable on its own?

Noticing that gcc -shared creates an executable file, I just got the weird idea to check what happens when I try to run it ... well the result was a segfault for my own lib. So, being curious about that, I tried to "run" the glibc (/lib/x86_64-linux-gnu/libc.so.6 on my system). Sure enough, it didn't crash but provided me some output:
GNU C Library (Debian GLIBC 2.19-18) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
Compiled on a Linux 3.16.7 system on 2015-04-14.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.
So my question here is: what is the magic behind this? I can't just define a main symbol in a library -- or can I?
I wrote a blog post on this subject where I go more in depth because I found it intriguing. You can find my original answer below.
You can specify a custom entry point to the linker with the -Wl,-e,entry_point option to gcc, where entry_point is the name of the library's "main" function.
void entry_point()
{
printf("Hello, world!\n");
}
The linker doesn't expect something linked with -shared to be run as an executable, and must be given some more information for the program to be runnable. If you try to run the library now, you will encounter a segmentation fault.
The .interp section is a part of the resulting binary that is needed by the OS to run the application. It's set automatically by the linker if -shared is not used. You must set this section manually in the C code if building a shared library that you want to execute by itself. See this question.
The interpreter's job is to find and load the shared libraries needed by a program, prepare the program to run, and then run it. For the ELF format (ubiquitous for modern *nix) on Linux, the ld-linux.so program is used. See its man page for more info.
The line below puts a string in the .interp section using GCC attributes. Put this in the global scope of your library to explicitly tell the linker that you want to include a dynamic linker path in your binary.
const char interp_section[] __attribute__((section(".interp"))) = "/path/to/ld-linux";
The easiest way to find the path to ld-linux.so is to run ldd on any normal application. Sample output from my system:
jacwah@jacob-mint17 ~ $ ldd $(which gcc)
linux-vdso.so.1 => (0x00007fff259fe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faec5939000)
/lib64/ld-linux-x86-64.so.2 (0x00007faec5d23000)
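Another way to check which interpreter a binary requests is to read its PT_INTERP program header, which is where the contents of .interp end up in a normally linked executable. The sketch below handles only 64-bit little-endian ELF images and does no validation beyond the magic bytes; read_interp is a hypothetical helper name, not part of any library.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path

def read_interp(data):
    """Return the interpreter path from a 64-bit LE ELF image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF image")
    # ELF64 header fields: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Program header fields: p_type at 0, p_offset at 8, p_filesz at 32.
        (p_type,) = struct.unpack_from("<I", data, base)
        (p_offset,) = struct.unpack_from("<Q", data, base + 8)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

Calling read_interp(open(path, "rb").read()) on an ordinary dynamically linked program should return the same loader path that ldd prints, while a library built with plain -shared and no .interp section would be expected to give None.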
Once you've specified the interpreter your library should be executable! There's just one slight flaw: it will segfault when entry_point returns.
When you compile a program with main, it's not the first function to be called when executing it. main is actually called by another function called _start. This function is responsible for setting up argv and argc and other initialisation. It then calls main. When main returns, _start calls exit with the return value of main.
There's no return address on stack in _start as it's the first function to be called. If it tries to return, an invalid read occurs (ultimately causing a segmentation fault). This is exactly what is happening in our entry point function. Add a call to exit as the last line of your entry function to properly clean up and not crash.
example.c
#include <stdio.h>
#include <stdlib.h>
const char interp_section[] __attribute__((section(".interp"))) = "/path/to/ld-linux";
void entry_point()
{
printf("Hello, world!\n");
exit(0);
}
Compile with gcc example.c -shared -fPIC -Wl,-e,entry_point.
When linking with -shared, gcc omits the start files, so some objects (such as std::cout) are not initialized. As a result, std::cout << "Abc" << std::endl will cause a segfault.
Approach 1
(simplest way to create executable library)
To fix this, change the linker options. The simplest way is to run gcc with the -v (verbose) option to build an executable and look at the linker command line. In that command line, remove -z now and -pie (if present) and add -shared. The sources must still be compiled with -fPIC (not -fPIE).
Let's try it. Suppose we have the following x.cpp:
#include <iostream>
// The next line is required. It names the same interpreter that gcc passes
// to the linker (-dynamic-linker) when building an ordinary executable:
extern "C" const char interp_section[] __attribute__((section(".interp"))) = "/lib64/ld-linux-x86-64.so.2";
// some "library" function
extern "C" __attribute__((visibility("default"))) int aaa() {
    std::cout << "AAA" << std::endl;
    return 1234;
}
// use main in a common way
int main() {
    std::cout << "Abc" << std::endl;
}
First compile this file with g++ -c x.cpp -fPIC. Then link it, dumping the command line, with g++ x.o -o x -v.
We get a working executable, but one that can't be dynamically loaded as a shared library. Check this with the Python script check_x.py:
import ctypes
d = ctypes.cdll.LoadLibrary('./x')
print(d.aaa())
Running $ ./x will be successful. Running $ python check_x.py will fail with OSError: ./x: cannot dynamically load position-independent executable.
While linking, g++ calls the collect2 linker wrapper, which calls ld. You can see the command line for collect2 in the output of the last g++ command, like this:
/usr/lib/gcc/x86_64-linux-gnu/11/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/11/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper -plugin-opt=-fresolution=/tmp/ccqDN9Df.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro -o x /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/11 -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/11/../../.. x.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/11/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/crtn.o
Find -pie -z now there and replace it with -shared. After running this command you get a new x executable, which works both as an executable and as a shared library:
$ ./x
Abc
$ python3 check_x.py
AAA
1234
This approach has disadvantages: the replacement is hard to automate. Also, before calling collect2, GCC creates a temporary file for the LTO (link-time optimization) plugin. This temporary file will be missing when you run the command manually.
Approach 2
(applicable way to create executable library)
The idea is to replace GCC's linker with our own wrapper, which corrects the arguments for collect2. We will use the following Python script collect3.py as the linker:
#!/usr/bin/python3
import subprocess, sys, os
marker = '--_wrapper_make_runnable_so'
def sublist_index(haystack, needle):
    # +1 so that a match at the very end of haystack is also found
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i+len(needle)] == needle: return i
def remove_sublist(haystack, needle):
    idx = sublist_index(haystack, needle)
    if idx is None: return haystack
    return haystack[:idx] + haystack[idx+len(needle):]
def fix_args(args):
    #print("!!BEFORE REPLACE ", *args)
    if marker not in args:
        return args
    args = remove_sublist(args, [marker])
    args = remove_sublist(args, ['-z', 'now'])
    args = remove_sublist(args, ['-pie'])
    args.append('-shared')
    #print("!!AFTER REPLACE ", *args)
    return args
# get search paths for programs directly from gcc
def findPaths(prefix = "programs: ="):
    for line in subprocess.run(['gcc', '-print-search-dirs'], stdout=subprocess.PIPE).stdout.decode('utf-8').split('\n'):
        if line.startswith(prefix): return line[len(prefix):].split(':')
# find the real linker (collect2) in those search paths
def findLinker(linker_name = 'collect2'):
    for p in findPaths():
        candidate = os.path.join(p, linker_name)
        #print("!!CHECKING LINKER ", candidate)
        if os.path.exists(candidate) : return candidate
if __name__=='__main__':
    args = sys.argv[1:]
    args = fix_args(args)
    sys.exit(subprocess.call([findLinker(), *args]))
This script fixes up the arguments and calls the real linker. To switch the linker, create the file specs.txt with the following content:
*linker:
<full path to>/collect3.py
To tell our fake linker that we want to correct arguments we will use the additional argument --_wrapper_make_runnable_so. So, the complete command line will be the following:
g++ -specs=specs.txt -Wl,--_wrapper_make_runnable_so x.o -o x
(this assumes you want to link an existing x.o).
After this you can both run the target x and use it as a dynamic library.

Linking a C program directly with ld fails with undefined reference to `__libc_csu_fini`

I'm trying to compile a C program under Linux. However, out of curiosity, I'm trying to execute some steps by hand. I use:
the gcc frontend to produce assembler code
then run the GNU assembler to get an object file
and then link it with the C runtime to get a working executable.
Now I'm stuck with the linking part.
The program is a very basic "Hello world":
#include <stdio.h>
int main() {
printf("Hello\n");
return 0;
}
I use the following command to produce the assembly code:
gcc hello.c -S -masm=intel
I'm telling gcc to quit after compiling and dump the assembly code with Intel syntax.
Then I use the GNU assembler to produce the object file:
as -o hello.o hello.s
Then I try using ld to produce the final executable:
ld hello.o /usr/lib/libc.so /usr/lib/crt1.o -o hello
But I keep getting the following error message:
/usr/lib/crt1.o: In function `_start':
(.text+0xc): undefined reference to `__libc_csu_fini'
/usr/lib/crt1.o: In function `_start':
(.text+0x11): undefined reference to `__libc_csu_init'
The symbols __libc_csu_fini/init seem to be a part of glibc, but I can't find them anywhere! I tried linking against libc statically (against /usr/lib/libc.a) with the same result.
What could the problem be?
/usr/lib/libc.so is a linker script which tells the linker to pull in the shared library /lib/libc.so.6, and a non-shared portion, /usr/lib/libc_nonshared.a.
__libc_csu_init and __libc_csu_fini come from /usr/lib/libc_nonshared.a. They're not being found because references to symbols in non-shared libraries need to appear before the archive that defines them on the linker line. In your case, /usr/lib/crt1.o (which references them) appears after /usr/lib/libc.so (which pulls them in), so it doesn't work.
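The ordering rule can be illustrated with a toy model of a single left-to-right pass. It treats each archive as one member and tracks only the symbols named in this question, so it is a sketch of the principle and not of ld's actual algorithm; the tuples and the function name are made up for the example.

```python
def unresolved_after_link(inputs):
    """inputs: (name, is_archive, defines, needs) tuples in command-line order."""
    defined, undefined = set(), set()
    for name, is_archive, defines, needs in inputs:
        # An archive is only pulled in if it defines something that is
        # already known to be undefined at this point of the scan.
        if is_archive and not (defines & undefined):
            continue
        defined |= defines
        undefined = (undefined | needs) - defined
    return undefined

crt1 = ("crt1.o", False, {"_start"},
        {"main", "__libc_csu_init", "__libc_csu_fini"})
hello = ("hello.o", False, {"main"}, set())
nonshared = ("libc_nonshared.a", True,
             {"__libc_csu_init", "__libc_csu_fini"}, set())

# Order from the question: archive is scanned before crt1.o needs it.
print(sorted(unresolved_after_link([hello, nonshared, crt1])))
# ['__libc_csu_fini', '__libc_csu_init']

# crt1.o first: the archive now satisfies the pending references.
print(sorted(unresolved_after_link([crt1, hello, nonshared])))
# []
```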
Fixing the order on the link line will get you a bit further, but then you'll probably get a new problem, where __libc_csu_init and __libc_csu_fini (which are now found) can't find _init and _fini. In order to call C library functions, you should also link /usr/lib/crti.o (after crt1.o but before the C library) and /usr/lib/crtn.o (after the C library), which contain initialisation and finalisation code.
Adding those should give you a successfully linked executable. It still won't work, because it uses the dynamically linked C library without specifying what the dynamic linker is. You'll need to tell the linker that as well, with something like -dynamic-linker /lib/ld-linux.so.2 (for 32-bit x86 at least; the name of the standard dynamic linker varies across platforms).
If you do all that (essentially as per Rob's answer), you'll get something that works in simple cases. But you may come across further problems with more complex code, as GCC provides some of its own library routines which may be needed if your code uses certain features. These will be buried somewhere deep inside the GCC installation directories...
You can see what gcc is doing by running it with either the -v option (which will show you the commands it invokes as it runs), or the -### option (which just prints the commands it would run, with all of the arguments quoted, but doesn't actually run anything). The output will be confusing unless you know that it usually invokes ld indirectly via one of its own components, collect2 (which is used to glue in C++ constructor calls at the right point).
I found another post which contained a clue: -dynamic-linker /lib/ld-linux.so.2.
Try this:
$ gcc hello.c -S -masm=intel
$ as -o hello.o hello.s
$ ld -o hello -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o hello.o -lc /usr/lib/crtn.o
$ ./hello
Hello
$
Assuming that a normal invocation of gcc -o hello hello.c produces a working build, run this command:
gcc --verbose -o hello hello.c
and gcc will tell you how it's linking things. That should give you a good idea of everything that you might need to account for in your link step.
In Ubuntu 14.04 (GCC 4.8), the minimal linking command is:
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
/usr/lib/x86_64-linux-gnu/crt1.o \
/usr/lib/x86_64-linux-gnu/crti.o \
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/ \
-lc -lgcc -lgcc_s \
hello.o \
/usr/lib/x86_64-linux-gnu/crtn.o
Although they may not be necessary, you should also link to -lgcc and -lgcc_s, since GCC may emit calls to functions present in those libraries for operations which your hardware does not implement natively, e.g. long long int operations on 32-bit. See also: Do I really need libgcc?
I had to add:
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/ \
because the default linker script does not include that directory, and that is where libgcc.a was located.
As mentioned by Michael Burr, you can find the paths with gcc -v. More precisely, you need:
gcc -v hello_world.c |& grep 'collect2' | tr ' ' '\n'
This is how I fixed it on ubuntu 11.10:
apt-get remove libc-dev
Say yes to remove all the packages but copy the list to reinstall after.
apt-get install libc-dev
If you're running a 64-bit OS, your glibc(-devel) may be broken. By looking at this and this you can find these 3 possible solutions:
add lib64 to LD_LIBRARY_PATH
link with -lc_nonshared
reinstall glibc-devel
Since you are doing the link process by hand, you are forgetting to link the C runtime startup code.
To avoid getting into the specifics of where and what you should link for your platform, after getting your Intel asm file, use gcc to generate (assemble and link) your executable.
Simply running gcc hello.c -o hello should work.
Take it:
$ echo 'main(){puts("ok");}' > hello.c
$ gcc -c hello.c -o hello.o
$ ld hello.o -o hello.exe /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/crtn.o \
-dynamic-linker /lib/ld-linux.so.2 -lc
$ ./hello.exe
ok
The path /usr/lib/crt*.o applies when glibc was configured with --prefix=/usr.

How to link to a different libc file?

I want to supply the shared libraries along with my program rather than using the target system's due to version differences.
ldd says my program uses these shared libs:
linux-gate.so.1 => (0xf7ef0000) (made by kernel)
libc.so.6 => /lib32/libc.so.6 (0xf7d88000) (libc-2.7.so)
/lib/ld-linux.so.2 (0xf7ef1000) (ld-2.7.so)
I have successfully linked ld-xxx.so by compiling with:
gcc -std=c99 -D_POSIX_C_SOURCE=200112L -O2 -m32 -s -Wl,-dynamic-linker,ld-2.7.so myprogram.c
But I have not managed to link libc-xxx.so successfully. How can I do that?
I found out how to do it:
rpath specifies where the provided libraries are located. This folder should contain: libc.so.6, libdl.so.2, libgcc_s.so.1 and maybe more. Check with strace to find out which libraries your binary file uses.
ld.so is the provided dynamic linker; the linker's -I option (--dynamic-linker) records its path in the executable
gcc -Xlinker -rpath=/default/path/to/libraries -Xlinker -I/default/path/to/libraries/ld.so program.c
Passing -nodefaultlibs or -nostdlib to gcc will tell it to not pass the default libraries as arguments to ld. You will then be able to explicitly specify the libc you want to link against. See the gcc(1) man page for more details and caveats regarding both options.

Resources