Why do you have to link the math library in C? - c

If I include <stdlib.h> or <stdio.h> in a C program, I don't have to link these when compiling, but I do have to link to <math.h>, using -lm with GCC, for example:
gcc test.c -o test -lm
What is the reason for this? Why do I have to explicitly link the math library, but not the other libraries?

The functions in stdlib.h and stdio.h have implementations in libc.so (or libc.a for static linking), which is linked into your executable by default (as if -lc were specified). GCC can be instructed to avoid this automatic link with the -nostdlib or -nodefaultlibs options.
The math functions in math.h have implementations in libm.so (or libm.a for static linking), and libm is not linked in by default. There are historical reasons for this libm/libc split, none of them very convincing.
Interestingly, the C++ runtime libstdc++ requires libm, so if you compile a C++ program with GCC (g++), you will automatically get libm linked in.

Remember that C is an old language and that FPUs are a relatively recent phenomenon. I first saw C on 8-bit processors where it was a lot of work to do even 32-bit integer arithmetic. Many of these implementations didn't even have a floating point math library available!
Even on the first 68000 machines (Mac, Atari ST, Amiga), floating point coprocessors were often expensive add-ons.
To do all that floating point math, you needed a pretty sizable library. And the math was going to be slow. So you rarely used floats. You tried to do everything with integers or scaled integers. When you had to include math.h, you gritted your teeth. Often, you'd write your own approximations and lookup tables to avoid it.
Trade-offs existed for a long time. Sometimes there were competing math packages called "fastmath" or such. What's the best solution for math? Really accurate but slow stuff? Inaccurate but fast? Big tables for trig functions? It wasn't until coprocessors were guaranteed to be in the computer that most implementations became obvious. I imagine that there's some programmer out there somewhere right now, working on an embedded chip, trying to decide whether to bring in the math library to handle some math problem.
That's why math wasn't standard. Many or maybe most programs didn't use a single float. If FPUs had always been around and floats and doubles were always cheap to operate on, no doubt there would have been a "stdmath".

Because of ridiculous historical practice that nobody is willing to fix. Consolidating all of the functions required by C and POSIX into a single library file would not only avoid this question getting asked over and over, but would also save a significant amount of time and memory when dynamic linking, since each .so file linked requires the filesystem operations to locate and find it, and a few pages for its static variables, relocations, etc.
An implementation where all functions are in one library and the -lm, -lpthread, -lrt, etc. options are all no-ops (or link to empty .a files) is perfectly POSIX conformant and certainly preferable.
Note: I'm talking about POSIX because C itself does not specify anything about how the compiler is invoked. Thus you can just treat gcc -std=c99 -lm as the implementation-specific way the compiler must be invoked for conformant behavior.

Because time() and some other functions are builtin defined in the C library (libc) itself and GCC always links to libc unless you use the -ffreestanding compile option. However math functions live in libm which is not implicitly linked by gcc.

An explanation is given here:
So if your program is using math functions and including math.h, then you need to explicitly link the math library by passing the -lm flag. The reason for this particular separation is that mathematicians are very picky about the way their math is being computed and they may want to use their own implementation of the math functions instead of the standard implementation. If the math functions were lumped into libc.a it wouldn't be possible to do that.
[Edit]
I'm not sure I agree with this, though. If you have a library which provides, say, sqrt(), and you pass it before the standard library, a Unix linker will take your version, right?

There's a thorough discussion of linking to external libraries in An Introduction to GCC - Linking with external libraries. If a library is a member of the standard libraries (like stdio), then you don't need to specify to the compiler (really the linker) to link them.
After reading some of the other answers and comments, I think the libc.a reference and the libm reference that it links to both have a lot to say about why the two are separate.
Note that many of the functions in 'libm.a' (the math library) are defined in 'math.h' but are not present in libc.a. Some are, which may get confusing, but the rule of thumb is this--the C library contains those functions that ANSI dictates must exist, so that you don't need the -lm if you only use ANSI functions. In contrast, `libm.a' contains more functions and supports additional functionality such as the matherr call-back and compliance to several alternative standards of behavior in case of FP errors. See section libm, for more details.

As ephemient said, the C library libc is linked by default and this library contains the implementations of stdlib.h, stdio.h and several other standard header files. Just to add to it, according to "An Introduction to GCC" the linker command for a basic "Hello World" program in C is as below:
ld -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o
/usr/lib/crti.o /usr/libgcc-lib /i686/3.3.1/crtbegin.o
-L/usr/lib/gcc-lib/i686/3.3.1 hello.o -lgcc -lgcc_eh -lc
-lgcc -lgcc_eh /usr/lib/gcc-lib/i686/3.3.1/crtend.o /usr/lib/crtn.o
Notice the option -lc in the third line that links the C library.

If I put stdlib.h or stdio.h, I don't have to link those but I have to link when I compile:
stdlib.h, stdio.h are the header files. You include them for your convenience. They only forecast what symbols will become available if you link in the proper library. The implementations are in the library files, that's where the functions really live.
Including math.h is only the first step to gaining access to all the math functions.
Also, you don't have to link against libm if you don't use it's functions, even if you do a #include <math.h> which is only an informational step for you, for the compiler about the symbols.
stdlib.h, stdio.h refer to functions available in libc, which happens to be always linked in so that the user doesn't have to do it himself.

It's a bug. You shouldn't have to explicitly specify -lm any more. Perhaps if enough people complain about it, it will be fixed. (I don't seriously believe this, as the maintainers who are perpetuating the distinction are evidently very stubborn, but I can hope.)

I think it's kind of arbitrary. You have to draw a line somewhere (which libraries are default and which need to be specified).
It gives you the opportunity to replace it with a different one that has the same functions, but I don't think it's very common to do so.
I think GCC does this to maintain backwards compatibility with the original cc executable. My guess for why cc does this is because of build time -- cc was written for machines with far less power than we have now. A lot of programs don't have any floating-point math, and they probably took every library that wasn't commonly used out of the default. I'm guessing that the build time of the Unix OS and the tools that go along with it were the driving force.

I would guess that it is a way to make applications which don't use it at all perform slightly better. Here's my thinking on this.
x86 OSes (and I imagine others) need to store FPU state on context switch. However, most OSes only bother to save/restore this state after the app attempts to use the FPU for the first time.
In addition to this, there is probably some basic code in the math library which will set the FPU to a sane base state when the library is loaded.
So, if you don't link in any math code at all, none of this will happen, therefore the OS doesn't have to save/restore any FPU state at all, making context switches slightly more efficient.
Just a guess though.
The same base premise still applies to non-FPU cases (the premise being that it was to make apps which didn't make use libm perform slightly better).
For example, if there is a soft-FPU which was likely in the early days of C. Then having libm separate could prevent a lot of large (and slow if it was used) code from unnecessarily being linked in.
In addition, if there is only static linking available, then a similar argument applies that it would keep executable sizes and compile times down.

stdio is part of the standard C library which, by default, GCC will link against.
The math function implementations are in a separate libm file that is not linked to by default, so you have to specify it -lm. By the way, there is no relation between those header files and library files.

All libraries like stdio.h and stdlib.h have their implementation in libc.so or libc.a and get linked by the linker by default. The libraries for libc.so are automatically linked while compiling and is included in the executable file.
But math.h has its implementations in libm.so or libm.a which is separate from libc.so. It does not get linked by default and you have to manually link it while compiling your program in GCC by using the -lm flag.
The GNU GCC team designed it to be separate from the other header files, while the other header files get linked by default, but math.h file doesn't.
Here read the item number 14.3, you could read it all if you wish:
Reason why math.h is needs to be linked
Look at this article: Why do we have to link math.h in GCC?
Have a look at the usage:
Using the library

Note that -lm may not always need to be specified even if you use some C math functions.
For example, the following simple program:
#include <stdio.h>
#include <math.h>
int main() {
printf("output: %f\n", sqrt(2.0));
return 0;
}
can be compiled and run successfully with the following command:
gcc test.c -o test
It was tested on GCC 7.5.0 (on Ubuntu 16.04) and GCC 4.8.0 (on CentOS 7).
The post here gives some explanations:
The math functions you call are implemented by compiler built-in functions
See also:
Other Built-in Functions Provided by GCC
How to get the gcc compiler to not optimize a standard library function call like printf?

Related

Why would gcc change the order of functions in a binary?

Many questions about forcing the order of functions in a binary to match the order of the source file
For example, this post, that post and others
I can't understand why would gcc want to change their order in the first place?
What could be gained from that?
Moreover, why is toplevel-reorder default value is true?
GCC can change the order of functions, because the C standard (e.g. n1570 or newer) allows to do that.
There is no obligation for GCC to compile a C function into a single function in the sense of the ELF format. See elf(5) on Linux
In practice (with optimizations enabled: try compiling foo.c with gcc -Wall -fverbose-asm -O3 foo.c then look into the emitted foo.s assembler file), the GCC compiler is building intermediate representations like GIMPLE. A big lot of optimizations are transforming GIMPLE to better GIMPLE.
Once the GIMPLE representation is "good enough", the compiler is transforming it to RTL
On Linux systems, you could use dladdr(3) to find the nearest ELF function to a given address. You can also use backtrace(3) to inspect your call stack at runtime.
GCC can even remove functions entirely, in particular static functions whose calls would be inline expanded (even without any inline keyword).
I tend to believe that if you compile and link your entire program with gcc -O3 -flto -fwhole-program some non static but unused functions can be removed too....
And you can always write your own GCC plugin to change the order of functions.
If you want to guess how GCC works: download and study its source code (since it is free software) and compile it on your machine, invoke it with GCC developer options, ask questions on GCC mailing lists...
See also the bismon static source code analyzer (some work in progress which could interest you), and the DECODER project. You can contact me by email about both. You could also contribute to RefPerSys and use it to generate GCC plugins (in C++ form).
What could be gained from that?
Optimization. If the compiler thinks some code is like to be used a lot it may put that code in a different region than code which is not expected to execute often (or is an error path, where performance is not as important). And code which is likely to execute after or temporally near some other code should be placed nearby, so it is more likely to be in cache when needed.
__attribute__((hot)) and __attribute__((cold)) exist for some of the same reasons.
why is toplevel-reorder default value is true?
Because 99% of developers are not bothered by this default, and it makes programs faster. The 1% of developers who need to care about ordering use the attributes, profile-guided optimization or other features which are likely to conflict with no-toplevel-reorder anyway.

Why do some system libraries require a -l option while others do not?

I recently took a class where we used pthreads and when compiling we were told to add -lpthread. But how come when using other #include <> statements for system header files, it seems that the linking of the object implementation code happens automatically? For example, if I just want to get the header file #include <stdio.h>, I don't need a -l option in compilation, the linking of that .o implementation file file just happens.
For this file
#include <stdio.h>
int main() {
return 0;
}
run
gcc -v -o simple simple.c
and you will see what, actually, gcc does. You will see that it links with libraries behind your back. This is why you don't specify system libraries explicitly.
Basic Answer: -lpthreads tells the compiler/linker to link to the pthreads library.
Longer answer
Headers tell the compiler that a certain function is going to be available (sometimes that function is defined in the header, perhaps inline) when the code is later linked. So the compiler marks the function essentially available but later during the linking phase the linker has to connect the function call to an actual function. A lot of function in the system headers are part of the "C Runtime Library" your linker automatically uses but others are provided by external libraries (such as pthreads). In the case where it is provided by an external library you have to use the '-lxxx' so the compiler/linker knows which external library to include in the process so it gets the address of the functions correctly.
Hope that helps
A C header file does not contain the implementations of the functions. It only contains function prototypes, so that the compiler could generate proper function calls.
The actual implementation is stored in a library. The most commonly used functions (such as printf and malloc) are implemented in the Standard C Library (LibC), which is implicitly linked to any executable file unless you request not to link it. The threads support is implemented in a separate library that has to be linked explicitly by passing the -pthread option to the linker.
NB: You can pass the option -pthread to the compiler that will also link the appropriate library.
The compiler links the standard C library libc.a implicitly so a -lc argument is not necessary. Pthreads is a system library but not a part of the C standard library, so must be explicitly linked.
If you were to specify -nolibc you would then need to explicitly link libc (or some alternative C library).
If all system libraries were linked implicitly, gcc would have to be have a different implementation for each system (for example pthreads is not a system library on Windows), and if a system introduced a new library, gcc would have to change in lock step. Moreover the link time would increase as each library were searched in some unknown sequence to resolve symbols. The C standard library is the one library the compiler can rely on to be provided in any particular implementation, so implicit linking is generally safe and a simple convenience.
Some library files are are searched by default, simply for convenience, basically the C standard library. Linkers have options to disable this convenience and have no default libraries (for if you are doing something unusual and don't want them). Non-default libraries, you have to tell linker to use them.
There could be a mechanism to tell what libraries are needed in the source code (include files are plain text which is just "copy-pasted" at #include line), there's no technical difficulty. But there just isn't (as far as I know, not even as non-standard compiler extension).
There are some solutions to the problem though, such as pkg-config for unixy platforms, which tackle the problem of include and library files by providing the compiler and linker options for a library easily.
the -l option is for linking against an external library in this case libpthread, against system libraries is linked by default
http://www.network-theory.co.uk/docs/gccintro/gccintro_17.html
C++: How to add external libraries

GCC linked library for compile

Why do we have to tell gcc which library to link against when that information is already in source file in form of #include?
For example, if I have a code which uses threads and has:
#include <pthread.h>
I still have to compile it with -pthread option in gcc:
gcc -pthread test.c
If I don't give -pthread option it will give errors finding thread function definitions.
I am using this version:
gcc --version
gcc (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4
This may be one of the most common things that trip up beginners to C.
In C there are two different steps to building a program, compilation and linking. For the purposes of your question, these steps connect your code to two different types of files, headers and libraries.
The #include <pthread.h> directive in your C code is handled by the compiler. The compiler (actually preprocessor) literally pastes in the contents of pthread.h into your code before turning your C file into an object file.
pthread.h is a header file, not a library. It contains a list of the functions that you can expect to find in the library, what arguments they take and what they return. A header can exist without a library and vice-versa. The header is a text file, often found in /usr/include on Unix-derived systems. You can open it just like any C file to read the contents.
The command line gcc -lpthread test.c does both compilation and linking. In the old days, you would first do something like cc test.c, then ld -lpthread test.o. As you can see, -lpthread is actually an option to the linker.
The linker does not know anything about text files like C code or headers. It only works with compiled object files and existing libraries. The -l flag tells it which libraries to look in to find the functions you are using.
The name of the header has nothing to do with the name of the library. Here it's really just by the accident. Most often there are many headers provided by the library.
Especially in C++ there is usually one header per class and the library usually provides classes implementations from the same namespace. In C the headers are organized that they contain some common subset of functions - math.h contains mathematical operations, stdio.h provides IO functions etc.
They are two separate things. .h files holds the declarations, sometimes the inline function also. As we all know, every functions should have an implementation/definition to work. These implementations are kept seperately. -lpthread, for example is the library which holds the implementation of the functions declared in headers in binary form.
Separating the implementation is what people want when you don't want to share your commercial code with others
So,
gcc -pthread test.c
tell gcc to look for definitions declared in pthread.h in the libpthread. -pthread is expanded to libpthread by linker automatically
there are/were compilers that you told it where the lib directory was and it simply scanned all the files hoping to find a match. then there are compilers that are the other extreme where you have to tell it everything to link in. the key here is include simply tells the compiler to look for some definitions or even simpler to include some external file into this file. this does not necessarily have any connection to a library or object, there are many includes that are not tied to such things and it is a bad assumption. next the linker is a different step and usually a different program from the compiler, so not only does the include not have a one to one relationship with an object or library, the linker is not the compiler.

Why do I have to explicitly link with libm? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why do you have to link the math library in C?
When I write a program that uses functions from the math.h library, why is it that I have to explicitly link to libm even though they are part of the C standard library?
For instance, when I want to use the sin() function I need to #include <math.h> but I also need to pass -lm to GCC. But for any other library from the standard library, I don't have to do that. Why the difference?
In the old days, linkers were slow and separating the mostly unused math code from the rest made the compilation process go faster. The difference is not so great today, so you can add the -lm option to your default compiler configuration.
Note that the header <math.h> (or any other header) does not contain code. It contains information about the code, specifically how to call functions. The code itself is in a library. I mean, your program does not use the "<math.h> library", it uses the math library and uses the prototypes declared in the <math.h> header.
It's the same reason you have to explicitly link to libpthread on most implementations. When something new and scary is added to the standard library, it usually first gets implemented as a separate add-on library that overrides some of the symbols in the old standard library implementation with versions that conform to the new requirements, while also adding lots of new interfaces. I wouldn't be surprised if some historical implementations had separate versions of printf in libm for floating point printing, with a "light" version in the main libc lacking floating point. This kind of implementation is actually mentioned and encouraged for tiny systems in the ISO C rationale document, if I remember correctly.
Of course in the long-term, separating the standard library out like this leads to a lot more problems than benefits. The worst part is probably the increased load time and memory usage for dynamic-linked programs.
Actually, the reason you normally don't need to link against libm for most math functions is that these are inlined by your compiler. Your program would fail to link on a platform where this is not the case.

Undefined reference to `sin` [duplicate]

This question already has answers here:
Undefined reference to sqrt (or other mathematical functions)
(5 answers)
Closed 4 years ago.
I have the following code (stripped down to the bare basics for this question):
#include<stdio.h>
#include<math.h>
double f1(double x)
{
double res = sin(x);
return 0;
}
/* The main function */
int main(void)
{
return 0;
}
When compiling it with gcc test.c I get the following error, and I can't work out why:
/tmp/ccOF5bis.o: In function `f1':
test2.c:(.text+0x13): undefined reference to `sin'
collect2: ld returned 1 exit status
However, I've written various test programs that call sin from within the main function, and those work perfectly. I must be doing something obviously wrong here - but what is it?
You have compiled your code with references to the correct math.h header file, but when you attempted to link it, you forgot the option to include the math library. As a result, you can compile your .o object files, but not build your executable.
As Paul has already mentioned add "-lm" to link with the math library in the step where you are attempting to generate your executable.
In the comment, linuxD asks:
Why for sin() in <math.h>, do we need -lm option explicitly; but,
not for printf() in <stdio.h>?
Because both these functions are implemented as part of the "Single UNIX Specification". This history of this standard is interesting, and is known by many names (IEEE Std 1003.1, X/Open Portability Guide, POSIX, Spec 1170).
This standard, specifically separates out the "Standard C library" routines from the "Standard C Mathematical Library" routines (page 277). The pertinent passage is copied below:
Standard C Library
The Standard C library is automatically searched by
cc to resolve external references. This library supports all of the
interfaces of the Base System, as defined in Volume 1, except for the
Math Routines.
Standard C Mathematical Library
This library supports
the Base System math routines, as defined in Volume 1. The cc option
-lm is used to search this library.
The reasoning behind this separation was influenced by a number of factors:
The UNIX wars led to increasing divergence from the original AT&T UNIX offering.
The number of UNIX platforms added difficulty in developing software for the operating system.
An attempt to define the lowest common denominator for software developers was launched, called 1988 POSIX.
Software developers programmed against the POSIX standard to provide their software on "POSIX compliant systems" in order to reach more platforms.
UNIX customers demanded "POSIX compliant" UNIX systems to run the software.
The pressures that fed into the decision to put -lm in a different library probably included, but are not limited to:
It seems like a good way to keep the size of libc down, as many applications don't use functions embedded in the math library.
It provides flexibility in math library implementation, where some math libraries rely on larger embedded lookup tables while others may rely on smaller lookup tables (computing solutions).
For truly size constrained applications, it permits reimplementations of the math library in a non-standard way (like pulling out just sin() and putting it in a custom built library.
In any case, it is now part of the standard to not be automatically included as part of the C language, and that's why you must add -lm.
I still have the problem with -lm added:
gcc -Wall -lm mtest.c -o mtest.o
mtest.c: In function 'f1':
mtest.c:6:12: warning: unused variable 'res' [-Wunused-variable]
/tmp/cc925Nmf.o: In function `f1':
mtest.c:(.text+0x19): undefined reference to `sin'
collect2: ld returned 1 exit status
I discovered recently that it does not work if you specify -lm first. The order matters. You must specify -lm last, like this:
gcc mtest.c -o mtest.o -lm
That links without problems.
So, you must specify the libraries at the end.
You need to link with the math library, libm:
$ gcc -Wall foo.c -o foo -lm
I had the same problem, which went away after I listed my library last: gcc prog.c -lm

Resources