Understanding glibc - c

I'd like to distribute my program as a binary, not in source code form. I have two test systems: An older Linux (openSUSE 11.2 with glibc 2.10) and a recent one (LinuxMint 13 with glibc 2.15). Now when I compile my program on the LinuxMint system with glibc 2.15 and then try to start the binary on the openSUSE system with glibc 2.10 I get the following two errors:
./a.out: /lib/libc.so.6: version 'GLIBC_2.15' not found (required by ./a.out)
./a.out: /lib/libc.so.6: version 'GLIBC_2.11' not found (required by ./a.out)
What is confusing me here is this: Why do I get the "glibc 2.11 not found" error here as well? I would expect the program to require glibc 2.15 now because it has been compiled with glibc 2.15. Why is the program looking for glibc 2.11 as well? Does this mean that my program will run on both glibc versions, i.e. 2.15 AND 2.11? So it requires at least 2.11? Or will it require 2.15 in any case?
Another question: Is the assumption correct that glibc is upwards compatible but not downwards? E.g. is a program compiled with glibc 2.10 guaranteed to work flawlessly with any future version of glibc? If that is the case, what happens if a constant like PATH_MAX is changed in the future? Currently it is set to 4096 and I'm allocating my buffers for the realpath() POSIX function using the PATH_MAX constant. Now if this constant is raised to 8192 in the future, there could be problems because my program allocates only 4096 bytes. Or did I misunderstand something here?
Thanks for explanations!

Libc uses symbol versioning. It's rather advanced wizardry, but basically each symbol carries a tag naming the glibc version in which it appeared. If a symbol's semantics change, two versions are exported: one with the old semantics, tagged with the version where the symbol first appeared, and another with the new semantics, tagged with the version where the change appeared. The loader only complains about the versions your program actually requests; some of the symbols you use happen to have been introduced in 2.15 and others in 2.11.
The whole point of this is to keep glibc backward compatible. That matters because there is a lot of packaged software, all of it linked dynamically, and recompiling all of it would take a very long time. There is also a lot of software whose sources are no longer available, and an old build of libc might not work with a new kernel or other updated components.
So yes, glibc is backward compatible. Just make sure you compile against the oldest version you need it to run with, because it is not (and can't be) forward compatible.
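If rebuilding on an older system is not practical, one workaround (not mentioned above, and somewhat fragile) is to pin individual symbols to an older version with the .symver directive. A minimal sketch, assuming an x86-64 glibc that exports memcpy@GLIBC_2.2.5 alongside the default memcpy@@GLIBC_2.14:

#include <stdio.h>
#include <string.h>

/* Bind our memcpy references to the old GLIBC_2.2.5 version instead of
   the default memcpy@@GLIBC_2.14, so the binary does not pick up a
   GLIBC_2.14 requirement from this call. */
__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

int main(int argc, char **argv)
{
    char dst[64];
    const char *src = (argc > 1) ? argv[1] : "hello";

    if (strlen(src) < sizeof dst) {
        memcpy(dst, src, strlen(src) + 1);  /* resolves to memcpy@GLIBC_2.2.5 */
        puts(dst);
    }
    return 0;
}

You can check which versions the resulting binary actually requires with objdump -T on it.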
Regarding PATH_MAX: if a change like that were made, glibc would simply export new versions of the symbols that use the new value, and keep the old versions, with suitable safeguards, for code compiled against the old value. That's the main point of all that wizardry.
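As an aside on the realpath()/PATH_MAX point (a technique the asker can use today, not something the answer above relies on): POSIX.1-2008 and glibc let you pass NULL as the second argument to realpath(), in which case the library allocates a buffer of the right size itself, so a future change of PATH_MAX would not matter. A minimal sketch:

#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : ".";

    /* NULL second argument: glibc allocates a buffer big enough for the
       resolved path, so no PATH_MAX-sized buffer is needed on our side. */
    char *resolved = realpath(path, NULL);
    if (resolved == NULL) {
        perror("realpath");
        return 1;
    }
    printf("%s\n", resolved);
    free(resolved);
    return 0;
}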

Related

Link failure with versioned symbols (memcpy & secure_getenv)

I am seeing undefined symbols when trying to link shared libraries with a program on Redhat Linux.
We are running Linux kernel 3.10.0, gcc 4.8.2 with libc-2.17.so, and libblkid 2.23.2
When I build the application I am writing I get two undefined symbols from libblkid: memcpy@GLIBC_2.14 and secure_getenv@GLIBC_2.17. (A very similar build works on other machines, ostensibly using the same versions of everything.)
Note, for secure_getenv libblkid wants the same version as the libc library itself.
Looking at the symbols defined in libc-2.17.so I find memcpy@@GLIBC_2.14, memcpy@GLIBC_2.2.5, secure_getenv, and secure_getenv@GLIBC_2.2.5. According to my understanding the double @ in the first memcpy version is simply supposed to mark it as the default version. And, for some reason, even in this libc with versioned symbols the first secure_getenv appears to be unversioned.
So, why does a requirement for memcpy@GLIBC_2.14 not match the defaulted memcpy@@GLIBC_2.14?
And logically I would expect the base version of secure_getenv in libc-2.17 to match a requirement for version 2.17.
So, what is going on here? What is making it fail on my development machine and not others? How do I fix this? (As the make works on other machines this appears to be something specific to my build environment, but what?)
You probably have compat-glibc installed, as indicated by the -L/usr/lib/x86_64-redhat-linux6E/lib64 argument. compat-glibc on Red Hat Enterprise Linux 7 provides glibc 2.12 only, so it cannot be used to link against system libraries.

Various glibc and Linux kernel versions compatibility

When building a compiler, one must specify the Linux headers version and the minimum supported kernel version, in addition to the glibc version. And then there is the actual kernel version and glibc version (with its own kernel headers version and minimum supported kernel version) on the target machine. I'm rather confused trying to understand how these versions go together.
Example 1: Assume I have system with glibc 2.13 built against kernel headers 3.14. Does that make any sense? How is it possible for glibc 2.13 (released in 2011) to use new kernel features from 3.14 (released in 2014)?
Example 2: Assume I have a compiler with glibc version newer than 2.13. Will compiled programs work on system with glibc 2.13? And if compiler's glibc version is older than 2.13?
Example 3: From https://sourceware.org/glibc/wiki/FAQ#What_version_of_the_Linux_kernel_headers_should_be_used.3F I understand that it's OK to use an older kernel if it satisfies the "minimum kernel version" used when compiling glibc. But I don't understand the passage "The other way round (compiling the GNU C library with old kernel headers and running on a recent kernel) does not necessarily work as expected. For example you can't use new kernel features if you used old kernel headers to compile the GNU C library." Is that the only thing that can happen to me? Won't it break something in glibc if the kernel is newer than at compile time?
Example 4: Do more subtle differences in glibc settings (for example, linking an executable against glibc version 2.X compiled against kernel headers 3.Y with minimum supported kernel version 2.6.A, and executing it on a system with the same glibc 2.X but compiled against kernel headers 3.Z with minimum supported kernel version 2.6.B) influence anything? I suspect they don't, but I would like to be sure.
So many questions :) Thanks!
You cannot easily (for whatever definition of the word) use newer kernel features with older versions of glibc. If you really need to, you can invoke system calls directly (using the syscall() library function) and dig whatever constant values and data structures you need out of the user-space kernel headers (the stuff which in newer kernels lives under include/uapi). On the other hand, kernel developers usually promise not to break legacy features in newer kernels, so older glibc versions keep working as expected (well, almost).
Older programs still work with newer versions of glibc because glibc supports versioning of symbols (see here for some details: https://www.kernel.org/pub/software/libs/glibc/hjl/compat/). If your program is dynamically linked against a newer version of glibc without special provisions (as described in the link above), you will not be able to run it with an older version of the glibc libraries: the dynamic linker will complain about unresolved symbols, because the required symbol versions will not be available.
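To make the syscall() suggestion concrete, here is a minimal sketch that uses a newer kernel facility, getrandom(2) (kernel 3.17+), directly; this is what you would do on a glibc too old to have the getrandom() wrapper (added only in glibc 2.25). It assumes kernel headers new enough to define SYS_getrandom:

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>   /* SYS_getrandom: needs kernel headers >= 3.17 */

int main(void)
{
    unsigned char buf[16];

    /* Bypass the (missing) libc wrapper and ask the kernel directly. */
    long n = syscall(SYS_getrandom, buf, sizeof buf, 0);
    if (n < 0) {
        perror("getrandom");
        return 1;
    }
    printf("kernel returned %ld random bytes\n", n);
    return 0;
}

If SYS_getrandom is not defined on your system, that is exactly the "dig the constants out of the newer uapi headers" case described above.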

How can I compile a Linux executable for a different machine?

I've written a Linux program in C, and I'm trying to get it to run on a server system. It looks like everything should work, but when I try it, I get this:
/lib64/libc.so.6: version `GLIBC_2.14' not found (required by <program>)
/lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./libdbi.so.1)
(Where <program> is my program's name.)
So far as I can tell, my program only requires that version of GLIBC because libdbi does. I've tried compiling libdbi from source, and it still attempts to link to that version of GLIBC.
I don't own the server system (it's a shared system I run a website on, and have SSH access to), so I can't make any changes to it -- that's why the library file is in the same directory, and I've set LD_LIBRARY_PATH=.. Unfortunately I also don't have access to a compiler on it -- when I try to run GCC, I'm told "permission denied". It's run by a big corporation, and I'm only one customer; the chances of them making any changes at my request are essentially zero.
Is there any way to compile the program on my system so that it will work on the server?
Before I asked, I found these similar questions:
Compile C program in Linux with different glibc library: the link in the answer goes to a 404 page, and from what I've been able to determine, apgcc isn't available on Debian distributions.
Relink a shared library to a different version of libc: seems to say that this problem doesn't exist, because "glibc tend to be backwards compatible" (except they apparently aren't in this case).
How to compile Linux C program to run on another Linux machine?: suggests a chroot or virtual machine, which I've done before elsewhere, but how can I tell it to use a libc without that old GLIBC version?
is binary executable file portable: suggests static-linking, but libdbi dynamically-links to its driver files, so that apparently can't be done -- I get several errors referring to missing functions like ldopen.
There are others, but they seem to be variations on those.
I'd be willing to use a non-free solution (like one that I saw in another answer I can't find now) if I turn this into a commercial product, but for a single use it seems like massive overkill, not to mention the expense.
Is there any way to simply tell libdbi to link to a later GLIBC version, maybe? If not, is there any solution I've overlooked?
Big corporation or not, if you are paying for the service in any way, or are being paid to develop against a requirement, the least they owe you is a careful description of the runtime environment so you can duplicate it on a development machine.
Then you must set out to systematically duplicate this environment. Since you're using libdbi you should be thorough. Database connections can exercise big chunks of the system API, so you want to have exactly the same version of Linux, gcc (even if you can't run it, you need to know the version other parts of the system were compiled with), and other tools and libraries. If you don't, you won't be able to have much confidence that your development machine tests translate to good behavior on the target.
A virtual machine is a good way to create a specialized development environment without messing up your existing one.
You must compile it on a machine that has the same version of glibc as the target machine, or an older version. Shared library compatibility works in that direction only.
Find out what version of Linux the server uses, get a copy of it, and install it in a VM.
VirtualBox is good for this.
You can use this environment for testing code as well as for this particular compilation problem.
You have the following options:
Compile your code on the server machine (which likely has gcc installed)
Compile your program with statically linked libraries (option -static for gcc)
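A tiny sketch of the second option (the file name and build commands below are just an example): with -static the executable carries its own copy of the C library and imposes no GLIBC_x.y version requirement on the server, at the cost of a much larger binary. Note that it will not help with libraries that dlopen() their plugins at run time, which is apparently what libdbi does with its drivers.

/* hello.c -- trivial test case for the -static option.
 *
 * Dynamic build (inherits the build machine's glibc version requirements):
 *     gcc -o hello hello.c
 * Static build (no libc.so.6 / GLIBC_x.y dependency at run time):
 *     gcc -static -o hello hello.c
 */
#include <stdio.h>

int main(void)
{
    puts("hello from a statically linked binary");
    return 0;
}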

Copying over glibc library

I downloaded the glibc source code, modified some portion of the standard library and then used LD_PRELOAD to use that modified standard library (in the form of an .so file) with my program. However, when I copied that .so file to another computer and tried to run the same program using LD_PRELOAD there, I got a segmentation fault.
Notice that both computers have x86-64 processors. Moreover, both computers have gcc 4.4 installed. Although the computer in which it is not running has also gcc 4.1.2 installed besides gcc 4.4. However, one is running Ubuntu 10.04 (where I compiled), while the other is running CentOS 5. Is that the cause of the segmentation fault? How can I solve this problem? Notice that I don't have administrative rights on the computer with CentOS 5.
When you LD_PRELOAD the C library, I believe you're loading it in addition to the default C library. When they're the exact same version, all the symbols match, and yours takes precedence. So it works. When they're different versions, you may well have a mix, on a per-symbol basis.
Also, the NSS (name service switch, e.g., all the stuff from /etc/nsswitch.conf) API is not stable. These modules are separate from the main libc.so, but are dynamically loaded when a program e.g., does a user id to username mapping. Loading the wrong version (because you copied libc.so over) will do all kinds of badness.
Further, Ubuntu may be using eglibc and CentOS glibc. So you could be looking at a different fork of glibc.
If your LD_PRELOAD library includes only the symbols you actually need to override, and overrides them as minimally as possible (for example, by calling through to the original function where you can), then it has a much higher chance of being portable.
For an example of how to do this, see (for example) fakeroot.
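In the same spirit, a minimal interposition sketch (the file and library names below are made up): it overrides a single libc function, fopen(), and forwards to the real implementation via dlsym(RTLD_NEXT, ...), instead of preloading an entire replacement libc.

/* trace_fopen.c -- build with something like:
 *     gcc -shared -fPIC -o libtrace_fopen.so trace_fopen.c -ldl
 * and use with:
 *     LD_PRELOAD=./libtrace_fopen.so ./your_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Override only fopen(); every other symbol still comes from the
   system's own libc, so nothing else can end up mismatched. */
FILE *fopen(const char *path, const char *mode)
{
    FILE *(*real_fopen)(const char *, const char *) =
        (FILE *(*)(const char *, const char *))dlsym(RTLD_NEXT, "fopen");

    if (real_fopen == NULL)
        return NULL;

    fprintf(stderr, "fopen(\"%s\", \"%s\")\n", path, mode);
    return real_fopen(path, mode);
}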
If you're changing so much of libc that your only choice is to override all of it, then (a) you're doing something very weird; (b) you probably want to use LD_LIBRARY_PATH, not LD_PRELOAD; see the ld.so(8) manpage for details.
It is likely that your libc is not portable between kernel versions.

Binary compatibility between Linux distributions

Sorry if this is an obvious question, but I've found surprisingly few references on the web ...
I'm working with an API written in C by one of our business partners and supplied to us as a .so binary file, built on Fedora 11. We've been testing out the API on a Fedora 11 development machine with no problems. However, when I try to link against the API on our customer's target platform, which happens to be SuSE Enterprise 10.2, I get a "File format not recognized" error.
Commands that are also part of the binutils package, such as objdump or nm, give me the same file format error. The "file" command shows me:
ELF 64-bit LSB shared object, AMD x86-64, version 1 (SYSV), not stripped
and the "ldd" command shows:
ldd: warning: you do not have execution permission for `./libuscuavactivity.so.1.1'
./libuscuavactivity.so.1.1: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by ./libuscuavactivity.so.1.1)
[dependent library list]
I'm guessing this is due to incompatibility between the C libraries on the two platforms, the problem being that the code was compiled against a newer version of glibc etc. than the one available on SuSE 10.2. I'm posting this question on the off chance that there is a way to compile the code on our partner's Fedora 11 platform in such a way that it will run on SuSE 10.2 as well.
I think the trick is to build on a flavour of Linux with the oldest kernel and C library versions of any of the platforms you wish to support. In my job we build on Debian 4, which allows us to officially support Debian 4 and above, Red Hat 3, 4, and 5, and SuSE 10, plus various other distros (SELinux etc.) in an unofficial fashion.
I suspect that by building on a nice new version of Linux, it becomes difficult to support people on older machines.
(edit) I should mention that we use the default compiler that comes with Debian 4, which I think is GCC 4.1.2. Installing newer compiler versions tends to make compatibility much worse.
Windows has its problems with compatibility between different releases, service packs, installed SDKs, and DLLs in general (DLL Hell, anyone?). Linux is not immune to the same kinds of issues.
The compatibility issues I have seen include:
Runtime library changes
Link library changes
Kernel changes
Compiler technology changes (e.g., pre- and post-EGCS gcc versions; this might be your issue).
Packager issues (RPM vs. APT)
In your particular case, I'd have them do a "gcc -v" on their system and report to you the gcc version number. Compare that to what you are using.
You might have to get hold of that version of the compiler to build your half with.
You can use the Linux Application Checker tool ([1], [2], [3]) in order to solve compatibility problems of an application between Linux distributions. It will check your file formats and all dependent libraries. It supports almost all popular Linux distributions, including all versions of SuSE and Fedora.
This is just a personal opinion, but when distributing something in binary-only form on Linux, you have a few options:
Build the gamut of .debs and .rpms for every distro under the sun, with a nominal ".tar.gz full of binaries" package for anything you've missed. The first part is ideal but cumbersome. The latter part will lead you to points 2 and 3.
Do as some are suggesting and find the oldest distro you can find and build there. My own opinion is this is sort of a ridiculous idea. See point 3.
Distribute binaries, and statically link where ever you can. Especially for libstdc++, which appears to be your problem here. There are seemingly very many incompatible versions of libstdc++ floating around, which makes it a compatibility nightmare. If you can't link statically, you can also put *.so files alongside your binary, and use stuff like LD_PRELOAD or LD_LIBRARY_PATH to make them link preferentially at runtime. Note that if you take this route you may have to comply with LGPL etc. since you are now distributing other people's work alongside your project.
Of course, distributing your project in source form is always preferred on Linux. :-)
If the message is "file format not recognized" then the problem is most likely the one mentioned by elmarco in a comment -- namely, a different architecture. It might (I'm not sure) be a dynamic linker version mismatch, but that would mean the .so file was built with an ancient dynamic linker. I do not believe any incompatibility in libc could cause this -- such incompatibilities can cause link failures and runtime problems (the latter very rarely), but not this.
I don't know about SuSE, but I know Fedora likes to stay on the bleeding edge, so you may very well be right about library versions. Why don't you ask and see if you can get the source code and build it on your SuSE machine?
