/usr/lib64/libhdf5* vs. /usr/lib64/openmpi/lib/libhdf5* - package

On our RHEL 6.6 machines we have the following two packages installed
hdf5-1.8.5.patch1-7.el6.x86_64 (provides /usr/lib64/libhdf5*)
hdf5-openmpi-1.8.5.patch1-7.el6.x86_64 (provides /usr/lib64/openmpi/lib/libhdf5*)
These seemingly provide what I would think are duplicate libraries (i.e. libhdf5.so.6.0.4), but doing an md5sum reveals that they are not identical.
1) Is this a bad practice / actual problem? One of our users claims having such duplicate libraries creates a dependency nightmare for him.
2) Assuming it is a problem, how do we "fix" it? Removing one or the other might break things for other people who are depending on the package we delete.

It shouldn't be a problem. If you are writing parallel code, you link to the parallel/OpenMPI version.
This Fedora page notes that they are built from the same source, so that strongly implies they have been tested in the configuration presented and shouldn't conflict.

These are not duplicate libraries and it is neither a bad practice nor an actual problem. HDF5 could be built with or without support for MPI. When built with MPI support, the HDF5 library can only be linked with applications that are also built against the same MPI library. That's why there are separate HDF5 packages:
hdf5-1.8.5 - non-MPI-enabled build to be used within non-MPI applications
hdf5-openmpi-1.8.5 - MPI-enabled build, uses Open MPI
hdf5-mpich-1.8.5 - MPI-enabled build, uses MPICH
The actual shared objects are installed in different places so that they could co-exist on the same system.

Related

How to dump an executable SBCL image that uses osicat

I have a simple common lisp server program, that uses the osicat library to interface with the posix filesystem. I need to do this because the system creates symbolic links to files, and uses the POSIX stat metadata, and neither of those things are straightforward to do in portable lisp.
I am managing the dependencies with quicklisp, and I have all of this pinned to a working distribution. The app is portable between CCL and SBCL, and I tend to build it in the first and deploy it using the latter. I declare the dependencies for the app with an asdf defsystem, and I can use quicklisp to load it for easy development from local projects.
For deployment I was just using some ansible playbooks that replicated a developer environment on a remote (.e. setting up quicklisp, pushing code into local projects, running out of a user home directory) which was hacky, but mostly ok. More recently, as it's becoming more stable I have been compiling it using sb-ext:save-lisp-and-die, using a simple compile script. This means I get an executable that I can run more like a server, with service management scripts, and an anonymous user account.
This has been working very well, and so I recently moved this step to the next level, and I'm building .deb packages with my compile script, so I can bundle up everything into a relocatable binary. This also sort of works, but the resultant binaries are not relocatable from the original build host. They refuse to start up, and it appears that they try to dynamically load a shared library for osicat
Unhandled SIMPLE-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING
Mar 15 12:47:14 annie [479]: {10005C05B3}>:
Mar 15 12:47:14 annie [479]: Error opening shared object "libosicat.so":
Mar 15 12:47:14 annie [479]: libosicat.so: cannot open shared object file: No such file or directory.
it looks like the image expects to find this in the original build tree's quicklisp archives
(ERROR "Error opening ~:[runtime~;shared object ~:*~S~]:~% ~A." "/home/builder/buil...quicklisp/dists/quicklisp/software/osicat-20180228-git/posix/libosicat.so
(SB-SYS:DLOPEN-OR-LOSE #S(SB-ALIEN::SHARED-OBJECT :PATHNAME #P"
so poking around the source, I realise that when quicklisp fetches osicat and exercise its build operation, it compiles this DLL to wrap it's interface with the system libaries, rather than just ffi to them directly - possibly because it's using cffi groveller, I don't really know much about cffi (yet). This is fine, but rather than linking to a .so using the system linker it's trying to dlopen it from a fixed path, which is not very portable, and kind of breaks the usefulness of save-image
I'm a bit stumped at this point, but before I go diving any much further into QL and cffi builds, I wondered if there's some build or compile configuration I'm missing that would make it bootstrap in a more 'static' fashion or influence the production of the wrapped library. Ideally I just want a single blob I can wrap in an installer, and link it against system libraries, but if I have to deploy some additional artefacts that's probably alright. I don't know how to make the autogenerated shared objects occur at a more controlled path.
At that point though, I may as well write a .so for my posix calls and distribute this alongside the app and try and FFI to it more directly. That would be a bit of a pain, so I would prefer to not do this.
You're right, when a dumped image is starting up, it is trying to reload the shared libraries. Which, as you are experiencing, is not working if the image is not starting on the machine it was dumped on.
This is almost what static-program-op set out to solve. A simple system definition like this should help you compile a static program:
(defsystem "foo"
:defsystem-depends-on ("cffi-grovel")
:build-operation "static-program-op" ; "asdf" package is implied
:build-pathname "foo" ; path of the generated binary
:entry-point "foo:main" ; function to use as the entry point
;; ... everything else ...
)
If your system depends on grovel files (defined by :cffi-wrapper-file, :c-file or :o-file), such as the ones provided by osicat, then it will statically link those to your dumped image.
However, it's not perfect.
Essentially, there are still some issues. Some are being fixed upstream by CFFI itself (e.g. not reloading shared libraries of the statically embedded libraries), others are a bit harder. (E.g. SBCL default compilation options don't let you use static-program-op by default. This is being fixed in Debian builds of SBCL, but other distributions are being less responsive.)
This is obviously a problem that the community at large has met, and there are several libraries that are around to help:
The first one, that has been around for a while, is Deploy. The approach it takes is that it embeds the dumped image and the libraries into an archive, and rearranges things for the binary to load them from wherever it is extracted to.
The second one, which I am biased towards because I made it, is linux-packaging. It takes the approach of fixing static-program-op by extending it, but requires you to build a custom SBCL. However, it generates distribution packages like .deb and .rpm, in order to be able to specify dependencies for system shared libraries (e.g. if you depend on sqlite, it will figure out which package provides it and add it as a dependency in the .deb). I highly recommend looking at the .gitlab-ci.yml for examples.
I recommend reading the webpages of both of those libraries to make your choice, they both have their advantages and drawbacks. <joke>Obviously, linux-packaging is superior.</joke>
Maybe you can use sb-posix:symlink and sb-posix:fstat on SBCL instead, removing the osicat dependency by feature toggle.

Options for distributing a c program with library-dependencies under linux

I developed a C program requiring some dynamic libraries, most notably libmysqlclient.so, which I intent to run on some remote-hosts. It seems like I have the following Options for distribution:
Compile the program static.
Install the required dependencies on the remote host
Distribute the dependencies with the program.
The first option is problematic as I need glibc-version at runtime anyway (since I use glibc and libnss for now).
I'm not sure about the second option: Is there a mechanism which checks if a installed library-version is sufficient for a program to run (beside libxyz.so.VERSION). Can I somehow check ABI-compatibility at startup?
Regarding the last Option: would I distribute ALL shared-libraries with the binary, or just the one which are presumably not installed (e.g libmysqlclient, but not libm).
Apart form this, am I likely to encounter ABI-compatibility problems if I use a different compiler for the binary then the one the dependencies were build with (e.g binary clang, libraries gcc)?
Version checking is distribution-specific. Usually, you would package your application in a .deb or .rpm file using the target distribution's packaging tools, and ship that to users. This means that you have to build your application once for each supported distribution, but there really is no way around that anyway because different distributions have slightly different versions of libmysqlclient. These distribution build tools generate some dependency version information automatically, and in other cases, some manual help is needed.
As a starting point, it's a good idea to look at the distribution packaging for something that relies on the MySQL/MariaDB client library and copy that. Maybe inspircd in Debian is a good example.
You can reduce the amount of builds you need to create and test somewhat by building on the oldest distribution versions you want to support. But some caveats apply; distributions vary in the degree of backwards compatibility they provide.
Distributing dependencies with the program is very problematic because popular libraries such as libmysqlclient are also provided by the base operating system, and if you use LD_LIBRARY_PATH to inject your own version, this could unintentionally extend to other programs as well (e.g., those you launch from your own program). The latter risk is still present even if you use DT_RUNPATH (via the -rpath linker option), although it is somewhat reduced.
A different option is to link just application-specific support libraries statically, and link base operating system libraries dynamically. (This is what some software collections do.) This does not seem to be such a great choice for libmysqlclient, though, because there might be an expectation that its feature set is identical to the distribution (regarding the TLS library and available configuration options), and with static linking, this is difficult to achieve.

Avoiding too specific dependencies

I am using a shared C library on Linux that is distributed in binary form. The problem is that the dependencies are set to require exactly the versions available on the development machine. For example, each release requires the (at the time) latest glibc and only the exact version of libreadline on their system.
I have contacted the developers and they don't know what to do about this. As far as I can tell, they are not consciously using the latest features, so the library should continue to work with older dependencies. I think they are using gcc on Linux, but they are also using a complex make system to control other compilers to build for Windows and Unix.
How and to what extent can you manage the build process so that a library requires dependencies just of a sufficient version and will also accept later versions?
This was a related question.
Edit: To be clear, I want to know how to build programs so they will accept dependencies with a specific version number or later numbers. Whether the developers compile it or I do, I want to be able to distribute a binary that does not require exactly the versions of dependencies present in the build environment.
Edit 2: After rephrasing the question, I realized this has been covered many times before. Some of the best Q&A:
Deploying Yesod to Heroku, can't build statically
Compile with older libc
Linking against an old version of libc
How can I link to a specific glibc version?
It's not very confidence inspiring. They should be building on a stable baseline release, it could just be a virtual install. Some versions of Linux, copy a build environment so packages aren't linked to updated library versions.
The openSUSE build service, lets devolopers build binary packages, for a wide variety of http://openbuildservice.org/about/
IIRC readline is a GPL program and checking at http://cnswww.cns.cwru.edu/php/chet/readline/rltop.html#Availability suggests it is GPL v 3 so they may be in violation of the GPL, if they are using libreadline functions and should provide you with the source to their library. I am not sure if you are meaning rpm/apt package dependencies, or their library is actually calling libreadline.
You can always extract files from rpm or apt packages, if necessary so avoiding software manager issues, caused by poor packaging.

What is the common code reuse strategy in C

Context: C language, 8 bit microprocessor
We have identified components which can be reused between projects (products). But I can not find which is the best infrastructure to handle the reusable components.
Two possibilities I found up to now:
Static libraries
Shared files in subversion
Both shared libraries and shared source let you share the common code among projects. Libraries present a better of the two alternatives, so you should use them if they are available on your platform. This lets you guard the source of the library from inadvertent modifications, which could happen if the code from source control is changed locally.
The only problem with sharing code through libraries may be lack of support for source-level debugging of library code by some of the tools in your embedded tool chain (e.g. debuggers attached to in-circuit emulators). In this case reusing code through the source may be acceptable. If possible, you should guard the source from modification through the file system access controls.
If you have reusable components, libraries are the way to go.
It's easier to maintain and you have a clear interface. It's also easier to incorporate into new projects.
You can easily do individual unit tests on library code
Lesser risk to copy and paste code.
Programmers are more aware that this code is shared when they have to use it from a library.
Several good arguments have been made for the library approach.
However, there's at least one good argument for re-building (perhaps from the same source repository) each time you build a dependent project, and that would be the ability to apply target- project- or development stage- unique compile settings to all of the code, including the shared portion.
At my company, we used both approaches at the same time:
We do two checkouts: one for the project, the other for the library.
When the project needs to be compiled (via Makefile), we compile the library first.
The library is then linked as if it was a binary-only library.
When we release a project, we check whether the other projects still compile against the new library.
When we release a project, we tag the library along with the project.
This way you get the best of both worlds:
common code is shared: all projects benefit from bug fixes and improvements
source code is always fully available for understanding and debugging
source code availability encourages library maintenance (fixings, improvements, and experiments)
the library boundaries impose a more API-like approach: clearer interface and project embedding
you can pass compile-time flags to the library to build a different flavors
you can always go back in time if needed without library-vs-project mismatching hassles
if you are in a hurry, you can put off the library check.
The only drawback to this approach is that developers have not know what they are doing. If they modify the library, they should know that the change will impact on all projects. But you are already using a version control system and, if you use branches and the communication within your team is good, there should be no problem at all.

Which dependencies can be assumed to be installed on a build machine?

We have a project that is going to require linking against libcurl and libxml2, among other libraries. We seem to have essentially two strategies for managing these depencies:
Ask each developer to install those libraries under the "usual" locations, e.g. /usr/lib, or
Include the sources to these libraries under a dedicated folder in the project's source tree.
Approach 1 requires everyone to make sure those libraries are installed on their system, but appears to be the approach used by many open source projects. On such projects, the build will detect that those libraries are missing and will fail.
Approach 2 might make the project tree unmanageably large in some instances and make the compilation time much longer. In addition, this approach can obviously be taken too far. I wouldn't put the compiler under the project tree, for instance (right?).
What are the best practices wrt external dependencies? Can/should one require of every developer to have certain libraries installed to build the project? Or is considered better to include all the dependencies in the project tree?
Don't bother about their exact location in your code. Locating them should be handled by the used compiler/linker (or the user by setting variables) if they're common. For very uncommon dependencies (or ones with customized/modified files) you might want to include them in your source (if possible due to licensing etc.).
If you'd like it more convenient, you should use some script (e.g. configure or CMake) to setup/create the build files being used. CMake for example allows you to set different packages (libcurl and libxml2 in your example) as optional and required. When building the project it will try to locate those, if that fails it will ask the user. This IS an additional step and might make building a bit more cumbersome but it will also make downloading faster (smaller source) as well as updating easier (as all you have to do is rebuild your program).
So in general I'd follow approach 1, if there's special/rare/customized stuff being used, approach 2.
The normal way is to have the respective dependencies and have the developer install them. Later, if the project is packeted into .deb or .rpm, these packets will require the respective libraries to be installed, the source packets will have the -devel packets as dependencies.
Best practice is not to include the external libraries in your source tree - instead, include a text file called INSTALL in your project root, which gives instructions on building the project and includes a list of the library dependencies, including minimum versions.

Resources