Various glibc and Linux kernel versions compatibility - c

When building a compiler, one must specify Linux headers version and minumum supported kernel version, in addition to glibc version. And then there is actual kernel version and glibc version (with its own kernel headers version and minumum supported kernel version) on the target machine. I'm rather confused trying to understand how these versions go together.
Example 1: Assume I have system with glibc 2.13 built against kernel headers 3.14. Does that make any sense? How is it possible for glibc 2.13 (released in 2011) to use new kernel features from 3.14 (released in 2014)?
Example 2: Assume I have a compiler with glibc version newer than 2.13. Will compiled programs work on system with glibc 2.13? And if compiler's glibc version is older than 2.13?
Example 3: From https://sourceware.org/glibc/wiki/FAQ#What_version_of_the_Linux_kernel_headers_should_be_used.3F I understand that it's OK to use older kernel if it satisfies "minumum kernel version" used when compiling glibc. But I don't understand the passage The other way round (compiling the GNU C library with old kernel headers and running on a recent kernel) does not necessarily work as expected. For example you can't use new kernel features if you used old kernel headers to compile the GNU C library.. Is it the only thing that can happen to me? Won't it break something in glibc if the kernel is newer than at compile-time?
Example 4: Do more subtle differences in glibc settings (for example, linking an executable against glibc version 2.X compiled against kernel headers 3.Y with minimum supported kernel version 2.6.A and executing in on system with the same glibc 2.X, but compiled against kernel headers 3.Z with minumum supported kernel version 2.6.B) influence anything? I suspect they're not, but would like to be sure.
So many questions :) Thanks!

You can not easily (for whatever definition of the word) use newer kernel features with older versions of glibc. If you really need to, you can invoke system calls directly (using the syscall() library function) and dig whatever constant values and datastructures necessary from user-space kernel headers (the stuff which in the newer kernel is held under include/uapi). On the other hand, kernel developers usually promise not to break legacy features in newer kernels, so older glibc versions keep working as expected (well, almost).
Older programs still work with newer versions of glibc because glibc supports versioning of symbols (see here for some details: https://www.kernel.org/pub/software/libs/glibc/hjl/compat/). If your program is dynamically linked with newer version of glibc without special provisions (as described in the link above) you would not be able to run it with an older version of glibc libraries (dynamic linker will complain about unresolved symbols, as proper symbol versions will not be available).

Related

Link failure with versioned symbols (memcpy & secure_getenv)

I am seeing undefined symbols when trying to link shared libraries with a program on Redhat Linux.
We are running Linux kernel 3.10.0, gcc 4.8.2 with libc-2.17.so, and libblkid 2.23.2
When I build the application I am writing I get two undefined symbols from libblkid: memcpy#GLIBC_2.14 and secure_getenv#GLIBC_2.17. (A very similar build works on other machines, ostensibly using the same versions of everything).
Note, for secure_getenv libblkid wants the same version as the libc library itself.
Looking at the symbols defined in libc-2.17.so I find memcpy##GLIBC_2.14, memcpy#GLIBC_2.2.5, secure_getenv, and secure_getenv#GLIBC_2.2.5. According to my understanding the double # in the first memcpy version is simply supposed to mark it as the default version. And, for some reason even in this libc with versioned symbols the first secure_getenv appears to be unversioned.
So, why does a requirement for memcpy#GLIBC_2.14 not match the defaulted memcpy##GLIBC_2.14?
And logically I would expect the base version of secure_getenv in libc-2.17 to match a requirement for version 2.17.
So, what is going on here? What is making it fail on my development machine and not others? How do I fix this? (As the make works on other machines this appears to be something specific to my build environment, but what?)
You probably have compat-glibc installed, as indicated by the -L/usr/lib/x86_64-redhat-linux6E/lib64 argument. compat-glibc on Red Hat Enterprise Linux 7 provides glibc 2.12 only, so it cannot be used to link against system libraries.

I'm confused with C libraries

Ok here's the thing.
Most people learn about the C standard library simultaneously as they first get in contact with the C language and I wasn't an exception either. But as I am studying linux now, I tend to get confused with C libraries. well first, I know that you get a nice old C standard lib as you install gcc on your linux distro as a static lib. After that, you get a new stable version of glibc pretty soon as you connect to the internet.
I started to look into glibc API and here's where I got messed up. glibc seems to support vast amount of lib basically starting from POSIX C Standard lib (which implements the standard C lib(including C99 as I know of)) to it's own extensions based on the POSIX standard C lib.
Does this mean that glibc actually modified or added functions in the POSIX C Standard lib? or even add whole new header set? Cause I see some functions that are not in the standard C lib but actually included in the standard C header (such as strnlen() in
Also referring to what I mentioned about a 'glibc making whole new header set', is because I'm starting to see some header files that seems pretty unique such as linux/blahblah.h or sys/syscalls.h <= (are these the libs that only glibc support?)
Next Ques is that I actually heard linux is built based on C language. Does this mean linux compiles itself with it's own gcc compiler???????
For the first question, glibc follows both standard C and POSIX, from About glibc
The GNU C Library is primarily designed to be a portable and high performance C library. It follows all relevant standards including ISO C11 and POSIX.1-2008. It is also internationalized and has one of the most complete internationalization interfaces known.
For the second question, yes, you can compile Linux using gcc. Even gcc itself can be compiled using gcc, it's called bootstrapping.
Glibc implements the POSIX, ANSI and ISO C standards, and adds its own 'fluff', which it calls "glibc extensions". The reason that they are all "mixed together" is because they wrote the library as one package, there is no separate POSIX-only glibc.
<linux/blah> is not part of glibc. It is a set headers written specifically for the operating system, by people outside of glibc, to give the programmer access to the Linux kernel API. It is "part" of the Linux kernel and is installed with it, and is used for kernel hacking. <sys/blah> is part of glibc, and is specific to Linux. It gives access to a fairly abstracted Linux system API.
As for your second question, yes. Linux is written in C, as it is (according to Linus) the only programming language for kernel and system programming. The way this is done is through a technique called bootstrapping, where a small compiler is built (usually manually in ASM) and builds the entire kernel or the entirety of GCC.
There is one more thing to be aware of: one of the purposes of the libc is to abstract from the actual system kernel. As such, the libc is the one part of your app that is kernel specific. If you had a different kernel with different syscalls, you would need to have a specially compiled libc. AFAIK, the libc is therefore usually linked as a shared library.
On linux, we usually have the glibc installed, because linux systems usually are GNU/Linux systems with a GNU toolchain on top of the linux kernel.
And yes, the glibc does expand the standards in certain spots: The asprintf() function for instance originated as a gnu-addition. It almost made it into the C11 standard subsequently, but until it becomes part of them, it's use will require a glibc-based system, or statically linking with the glibc.
By default, the glibc headers do not define these gnu additions. You can switch them on by defining the preprocessor macro GNU_SOURCE before including the appropriate headers, or by specifying -std=gnu11 to the gcc call.

Understanding glibc

I'd like to distribute my program as a binary, not in source code form. I have two test systems: An older Linux (openSUSE 11.2 with glibc 2.10) and a recent one (LinuxMint 13 with glibc 2.15). Now when I compile my program on the LinuxMint system with glibc 2.15 and then try to start the binary on the openSUSE system with glibc 2.10 I get the following two errors:
./a.out: /lib/libc.so.6: version 'GLIBC_2.15' not found (required by ./a.out)
./a.out: /lib/libc.so.6: version 'GLIBC_2.11' not found (required by ./a.out)
What is confusing me here is this: Why do I get the "glibc 2.11 not found" error here as well? I would expect the program to require glibc 2.15 now because it has been compiled with glibc 2.15. Why is the program looking for glibc 2.11 as well? Does this mean that my program will run on both glibc versions, i.e. 2.15 AND 2.11? So it requires at least 2.11? Or will it require 2.15 in any case?
Another question: Is the assumption correct that glibc is upwards compatible but not downwards? E.g. is a program compiled with glibc 2.10 guaranteed to work flawlessly with any future version of glibc? If that is the case, what happens if a constant like PATH_MAX is changed in the future? Currently it is set to 4096 and I'm allocating my buffers for the realpath() POSIX function using the PATH_MAX constant. Now if this constant is raised to 8192 in the future, there could be problems because my program allocates only 4096 bytes. Or did I misunderstand something here?
Thanks for explanations!
Libc uses a symbol versioning. It's rather advanced wizardry, but basically each symbol is attached tag according to version where it appeared. If it's semantics changes, there are two versions, one for the old semantics with the version where it first appeared and another with the new semantics and the version where it appeared. The loader will only complain about symbols your program actually requests. Some of them happened to be introduced in 2.15 and some of them happened to be introduced in 2.11.
The whole point of this is to keep glibc backward compatible. It's really important, because there is a lot of packaged software, all of it is linked dynamically and recompiling all of it would take really long. There is also lot of software that does not have sources available and the old build of libc might not work with new kernel or new something else.
So yes, glibc is backward compatible. Just make sure you compile against the oldest version you need it to run with, because it is not (and can't be) forward compatible.
Ad PATH_MAX: If a change like that is done, glibc would simply export new versions of symbols that use the new value and old versions of symbols with suitable safegurads to work with code compiled against the old value. That's the main point of all that wizardry.

What is the role of libc(glibc) in our linux app?

When we debug a program using gdb, we usually see functions with strange names defined in libc(glibc?). My questions are:
Is libc/glibc the standard implementation of some standard C/C++ functions like strcpy,strlen,malloc?
Or, is it not only of the first usage as described above, but also an wrapper of Unix/Linux system calls like open,close,fctl? If so, why can't we issue syscalls directly, without libc?
Does libc only consist of one lib (.a or .so) file, or many lib files (in this case, libc is the general name of this set of libs)? Where do these lib file(s) reside?
What is the difference between libc and glibc?
libc implements both standard C functions like strcpy() and POSIX functions (which may be system calls) like getpid(). Note that not all standard C functions are in libc - most math functions are in libm.
You cannot directly make system calls in the same way that you call normal functions because calls to the kernel aren't normal function calls, so they can't be resolved by the linker. Instead, architecture-specific assembly language thunks are used to call into the kernel - you can of course write these directly in your own program too, but you don't need to because libc provides them for you.
Note that in Linux it is the combination of the kernel and libc that provides the POSIX API. libc adds a decent amount of value - not every POSIX function is necessarily a system call, and for the ones that are, the kernel behaviour isn't always POSIX conforming.
libc is a single library file (both .so and .a versions are available) and in most cases resides in /usr/lib. However, the glibc (GNU libc) project provides more than just libc - it also provides the libm mentioned earlier, and other core libraries like libpthread. So libc is just one of the libraries provided by glibc - and there are other alternate implementations of libc other than glibc.
With regard to the first two, glibc is both the C standard library (e.g, "standard C functions") and a wrapper for system calls. You cannot issue system calls directly because the compiler doesn't know how -- glibc contains the "glue" which is necessary to issue system calls, which is written in assembly. (It is possible to reimplement this yourself, but it's far more trouble than it's worth.)
(The C++ standard library is a separate thing; it's called libstdc++.)
glibc isn't a single .so (dynamic library) file -- there are a bunch, but libc and libm are the most commonly-used two. All of the static and dynamic libraries are stored in /lib.
libc is a generic term used to refer to all C standard libraries -- there are several. glibc is the most commonly used one; others include eglibc, uclibc, and dietlibc.
It's the "standard library". It's exactly like "MSVCRTL" in the Windows world.
The Gnu standard library ("glibc") is the implementation of libc most commonly (almost universally?) found on Linux systems. Here are the relevant files on an old SusE Linux system:
ls -l /lib =>
-rwxr-xr-x 1 root root 1383527 2005-06-14 08:36 libc.so.6
ls -l /usr/lib =>
-rw-r--r-- 1 root root 2580354 2005-06-14 08:20 libc.a
-rw-r--r-- 1 root root 204 2005-06-14 08:20 libc.so
This link should answer any additional questions you might have (including references to the full and complete GLibc source code):
http://sourceware.org/glibc/wiki/HomePage
You can check the detailed information about "libc" and "glibc" from the man pages on your linux sytem by typing "man libc" on the shell, copied as below;
LIBC(7) Linux Programmer's Manual LIBC(7)
NAME
libc - overview of standard C libraries on Linux
DESCRIPTION
The term "libc" is commonly used as a shorthand for the "standard C library", a library of standard functions that can be
used by all C programs (and sometimes by programs in other languages). Because of some history (see below), use of the
term "libc" to refer to the standard C library is somewhat ambiguous on Linux.
glibc
By far the most widely used C library on Linux is the GNU C Library ⟨http://www.gnu.org/software/libc/⟩, often referred
to as glibc. This is the C library that is nowadays used in all major Linux distributions. It is also the C library
whose details are documented in the relevant pages of the man-pages project (primarily in Section 3 of the manual). Doc‐
umentation of glibc is also available in the glibc manual, available via the command info libc. Release 1.0 of glibc was
made in September 1992. (There were earlier 0.x releases.) The next major release of glibc was 2.0, at the beginning of
1997.
The pathname /lib/libc.so.6 (or something similar) is normally a symbolic link that points to the location of the glibc
library, and executing this pathname will cause glibc to display various information about the version installed on your
system.
Linux libc
In the early to mid 1990s, there was for a while Linux libc, a fork of glibc 1.x created by Linux developers who felt
that glibc development at the time was not sufficing for the needs of Linux. Often, this library was referred to
(ambiguously) as just "libc". Linux libc released major versions 2, 3, 4, and 5 (as well as many minor versions of those
releases). For a while, Linux libc was the standard C library in many Linux distributions.
However, notwithstanding the original motivations of the Linux libc effort, by the time glibc 2.0 was released (in 1997),
it was clearly superior to Linux libc, and all major Linux distributions that had been using Linux libc soon switched
back to glibc. Since this switch occurred long ago, man-pages no longer takes care to document Linux libc details. Nev‐
ertheless, the history is visible in vestiges of information about Linux libc that remain in some manual pages, in par‐
ticular, references to libc4 and libc5.

Relink a shared library to a different version of libc

I have a linux shared library (.so) compiled with a specific version of libc (GLIBC2.4) and I need to use it on a system with different version of libc. I do not have sources for the library in question so I cannot recompile for the new system. Is it somehow possible to change the dependencies in that library to a different libc?
If you need the .so on a system with an older glibc, you would need the source code and recompile/relink it with the older glibc. The alternative is to install the required glibc on the old system in a non-default location and adjust the LD_LIBRARY_PATH for the executable that needs this .so
If there's a newer glibc rather, it should normally not be a problem as glibc tend to be backwards compatible.
Unless your library really uses interfaces that changed (unlikely), you can just hexedit the references to versions in the resulting .so file. They're all text anyway.
Best you can do is compile the old glibc version for your system and then build your application with that glibc and your shared library. Ugly though ...

Resources