The term has several definition according to Wikipedia, however what I'm really interested in is creating a program that has all its needed dependencies included within the source folder, so the end user doesn't need to install additional libraries for the app to install. For example, how Mac apps has all its dependencies all within the program itself already...
or is there a function that autotools does this? I'm programming in the Linux environment...
Are you talking about the source code of your application, or about your application binary?
The answer I'd give for both the cases depends on what libraries you're using.
If you're using libraries that you can find anywhere, that are somehow standard and/or that are quite big, you shouldn't attach them to your application, just require them both to build and to run your application.
Anyway don't be much concerned about your source code: little people will build your application, and they probably know something about programming and how a Linux system works; it won't be a big deal to require many (also not-so-common) dependences to build your application.
For what concerns the binary version it could be a little more problematic, since it will be used by end users who often don't know anything about libraries and programming stuff: you could choose to statically link the smallest and most uncommon libraries to your binary, in order to have less dependences.
You could do it, if you link statically, but it'd be somewhat unusual, and depending on what your program is supposed to do, you might be limiting yourself.
The alternative, if this is not just a one-off project, is to create a Linux Standard Base compatible RPM package and restrict yourself to linking against the libraries and symbols that LSB defines.
Run ldd on your program to discover all dependencies, then copy these to your directory, and add a program-wrapper script that issues
#!/bin/sh
LD_LIBRARY_PATH="${0##*/}:$LD_LIBRARY_PATH" exec "${0##*/}/real-program" "$#";
Duplicating the Mac OS X .app behavior on a plain POSIX system is difficult because it is very hard to guarantee that a process can find it's own executable (there are several way that will almost always work...). Mac OS X provides a OS service for this, but Linux (for instance) does not.
Once you've accomplished that feat, this becomes possible. Though, as others have mentioned, it loses the ability to share resource demands (disk space, RAM space, cache space) with other programs that use the same libraries because you'd be using static copies, or dynamically loading your own copy from the .app-like bundle.
Related
First off, I know that any such library would need to have at least some interface shimming to interact with system calls or board support packages or whatever (For example; newlib has a limited and well defined interface to the OS/BSP that doesn't assume intimate access to the OS). I can als see where there will be need for something similar for interacting with the compiler (e.g. details of some things left as "implementation defined" by the standard).
However, most libraries I've looked into end up being way more wedded to the OS and compiler than that. The OS assumptions seems to assume access to whatever parts they want from a specific OS environment and the compiler interactions practically in collusion with the compiler implementation it self.
So, the basic questions end up being:
Would it be possible (and practical) to implement a full C standard library that only has a very limited and well defined set of prerequisites and/or interface for being built by any compiler and run on any OS?
Do any such implementation exist? (I'm not asking for a recommendation of one to use, thought I would be interested in examples that I can make my own evaluations of.)
If either of the above answers is "no"; then why? What fundamentally makes it impossible or impractical? Or why hasn't anyone bothered?
Background:
What I'm working on that has lead me to this rabbit hole is an attempt to make a fully versioned, fully hermetic build chain. The goal being that I should be able to build a project without any dependencies on the local environment (beyond access to the needs source control repo and say "a valid posix shell" or "language-agnostic-build-tool-of-choice is installed and runs"). Given those dependencies, the build should do the exact same thing, with the same compiler and libraries, regardless of which versions of which compilers and libraries are or are not installed. Universal, repeatable, byte identical output is the target I'm wanting to move in the direction of.
Causing the compiler-of-choice to be run from a repo isn't too hard, but I've yet to have found a C standard library that "works right out of the box". They all seem to assume some magic undocumented interface between them and the compiler (e.g. __gnuc_va_list), or at best want to do some sort of config-probing of the hosting environment which would be something between useless and counterproductive for my goals.
This looks to be a bottomless rabbit hole that I'm reluctant to start down without first trying to locate alternatives.
I have a simple common lisp server program, that uses the osicat library to interface with the posix filesystem. I need to do this because the system creates symbolic links to files, and uses the POSIX stat metadata, and neither of those things are straightforward to do in portable lisp.
I am managing the dependencies with quicklisp, and I have all of this pinned to a working distribution. The app is portable between CCL and SBCL, and I tend to build it in the first and deploy it using the latter. I declare the dependencies for the app with an asdf defsystem, and I can use quicklisp to load it for easy development from local projects.
For deployment I was just using some ansible playbooks that replicated a developer environment on a remote (.e. setting up quicklisp, pushing code into local projects, running out of a user home directory) which was hacky, but mostly ok. More recently, as it's becoming more stable I have been compiling it using sb-ext:save-lisp-and-die, using a simple compile script. This means I get an executable that I can run more like a server, with service management scripts, and an anonymous user account.
This has been working very well, and so I recently moved this step to the next level, and I'm building .deb packages with my compile script, so I can bundle up everything into a relocatable binary. This also sort of works, but the resultant binaries are not relocatable from the original build host. They refuse to start up, and it appears that they try to dynamically load a shared library for osicat
Unhandled SIMPLE-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING
Mar 15 12:47:14 annie [479]: {10005C05B3}>:
Mar 15 12:47:14 annie [479]: Error opening shared object "libosicat.so":
Mar 15 12:47:14 annie [479]: libosicat.so: cannot open shared object file: No such file or directory.
it looks like the image expects to find this in the original build tree's quicklisp archives
(ERROR "Error opening ~:[runtime~;shared object ~:*~S~]:~% ~A." "/home/builder/buil...quicklisp/dists/quicklisp/software/osicat-20180228-git/posix/libosicat.so
(SB-SYS:DLOPEN-OR-LOSE #S(SB-ALIEN::SHARED-OBJECT :PATHNAME #P"
so poking around the source, I realise that when quicklisp fetches osicat and exercise its build operation, it compiles this DLL to wrap it's interface with the system libaries, rather than just ffi to them directly - possibly because it's using cffi groveller, I don't really know much about cffi (yet). This is fine, but rather than linking to a .so using the system linker it's trying to dlopen it from a fixed path, which is not very portable, and kind of breaks the usefulness of save-image
I'm a bit stumped at this point, but before I go diving any much further into QL and cffi builds, I wondered if there's some build or compile configuration I'm missing that would make it bootstrap in a more 'static' fashion or influence the production of the wrapped library. Ideally I just want a single blob I can wrap in an installer, and link it against system libraries, but if I have to deploy some additional artefacts that's probably alright. I don't know how to make the autogenerated shared objects occur at a more controlled path.
At that point though, I may as well write a .so for my posix calls and distribute this alongside the app and try and FFI to it more directly. That would be a bit of a pain, so I would prefer to not do this.
You're right, when a dumped image is starting up, it is trying to reload the shared libraries. Which, as you are experiencing, is not working if the image is not starting on the machine it was dumped on.
This is almost what static-program-op set out to solve. A simple system definition like this should help you compile a static program:
(defsystem "foo"
:defsystem-depends-on ("cffi-grovel")
:build-operation "static-program-op" ; "asdf" package is implied
:build-pathname "foo" ; path of the generated binary
:entry-point "foo:main" ; function to use as the entry point
;; ... everything else ...
)
If your system depends on grovel files (defined by :cffi-wrapper-file, :c-file or :o-file), such as the ones provided by osicat, then it will statically link those to your dumped image.
However, it's not perfect.
Essentially, there are still some issues. Some are being fixed upstream by CFFI itself (e.g. not reloading shared libraries of the statically embedded libraries), others are a bit harder. (E.g. SBCL default compilation options don't let you use static-program-op by default. This is being fixed in Debian builds of SBCL, but other distributions are being less responsive.)
This is obviously a problem that the community at large has met, and there are several libraries that are around to help:
The first one, that has been around for a while, is Deploy. The approach it takes is that it embeds the dumped image and the libraries into an archive, and rearranges things for the binary to load them from wherever it is extracted to.
The second one, which I am biased towards because I made it, is linux-packaging. It takes the approach of fixing static-program-op by extending it, but requires you to build a custom SBCL. However, it generates distribution packages like .deb and .rpm, in order to be able to specify dependencies for system shared libraries (e.g. if you depend on sqlite, it will figure out which package provides it and add it as a dependency in the .deb). I highly recommend looking at the .gitlab-ci.yml for examples.
I recommend reading the webpages of both of those libraries to make your choice, they both have their advantages and drawbacks. <joke>Obviously, linux-packaging is superior.</joke>
Maybe you can use sb-posix:symlink and sb-posix:fstat on SBCL instead, removing the osicat dependency by feature toggle.
I developed a C program requiring some dynamic libraries, most notably libmysqlclient.so, which I intent to run on some remote-hosts. It seems like I have the following Options for distribution:
Compile the program static.
Install the required dependencies on the remote host
Distribute the dependencies with the program.
The first option is problematic as I need glibc-version at runtime anyway (since I use glibc and libnss for now).
I'm not sure about the second option: Is there a mechanism which checks if a installed library-version is sufficient for a program to run (beside libxyz.so.VERSION). Can I somehow check ABI-compatibility at startup?
Regarding the last Option: would I distribute ALL shared-libraries with the binary, or just the one which are presumably not installed (e.g libmysqlclient, but not libm).
Apart form this, am I likely to encounter ABI-compatibility problems if I use a different compiler for the binary then the one the dependencies were build with (e.g binary clang, libraries gcc)?
Version checking is distribution-specific. Usually, you would package your application in a .deb or .rpm file using the target distribution's packaging tools, and ship that to users. This means that you have to build your application once for each supported distribution, but there really is no way around that anyway because different distributions have slightly different versions of libmysqlclient. These distribution build tools generate some dependency version information automatically, and in other cases, some manual help is needed.
As a starting point, it's a good idea to look at the distribution packaging for something that relies on the MySQL/MariaDB client library and copy that. Maybe inspircd in Debian is a good example.
You can reduce the amount of builds you need to create and test somewhat by building on the oldest distribution versions you want to support. But some caveats apply; distributions vary in the degree of backwards compatibility they provide.
Distributing dependencies with the program is very problematic because popular libraries such as libmysqlclient are also provided by the base operating system, and if you use LD_LIBRARY_PATH to inject your own version, this could unintentionally extend to other programs as well (e.g., those you launch from your own program). The latter risk is still present even if you use DT_RUNPATH (via the -rpath linker option), although it is somewhat reduced.
A different option is to link just application-specific support libraries statically, and link base operating system libraries dynamically. (This is what some software collections do.) This does not seem to be such a great choice for libmysqlclient, though, because there might be an expectation that its feature set is identical to the distribution (regarding the TLS library and available configuration options), and with static linking, this is difficult to achieve.
Is it possible to write code in C, then statically build it and make a binary out of it like an ELF/PE then remove its header and all unnecessary meta-data so to create a raw binary and at last be able to put this raw binary in any other kind of OS specific like (ELF > PE) or (PE > ELF)?!
have you done this before?
is it possible?
what are issues and concerns?
how this would be possible?!
and if not, just tell me why not?!!?!
what are my pitfalls in understanding the static build?
doesn't it mean that it removes any need for 3rd party and standard as well as os libs and headers?!
Why cant we remove the meta of for example ELF and put meta and other specs needed for PE?
Mention:
I said, Cross OS not Cross Hardware
[Read after reading below!]
As you see the best answer, till now (!) just keep going and learn cross platform development issues!!! How crazy is this?! thanks to philosophy!!!
I would say that it's possible, but this process must be crippled by many, many details.
ABI compatibility
The first thing to think of is Application Binary Interface compatibility. Unless you're able to call your functions the same way, the code is broken. So I guess (though I can't check at the moment) that compiling code with gcc on Linux/OS X and MinGW gcc on Windows should give the same binary code as far as no external functions are called. The problem here is that executable metadata may rely on some ABI assumptions.
Standard libraries
That seems to be the largest hurdle. Partly because of C preprocessor that can inline some procedures on some platforms, leaving them to run-time on others. Also, cross-platform dynamic interoperation with standard libraries is close to impossible, though theoretically one can imagine a code that uses a limited subset of the C standard library that is exposed through the same ABI on different platforms.
Static build mostly eliminates problems of interaction with other user-space code, but still there is a huge issue of interfacing with kernel: it's int $0x80 calls on x86 Linux and a platform-specifc set of syscall numbers that does not map to Windows in any direct way.
OS-specific register use
As far as I know, Windows uses register %fs for storing some OS-wide exception-handling stuff, so a binary compiled on Linux should avoid cluttering it. There might be other similar issues. Also, C++ exceptions on Windows are mostly done with OS exceptions.
Virtual addresses
Again, AFAIK Windows DLLs have some predefined address they're must be loaded into in virtual address space of a process, whereas Linux uses position-independent code for shared libraries. So there might be issues with overlapping areas of an executable and ported code, unless the ported position-dependent code is recompiled to be position-independent.
So, while theoretically possible, such transformation must be very fragile in real situations and it's impossible to re-plant the whole static build code - some parts may be transferred intact, but must be relinked to system-specific code interfacing with other kernel properly.
P.S. I think Wine is a good example of running binary code on a quite different system. It tricks a Windows program to think it's running in Windows environment and uses the same machine code - most of the time that works well (if a program does not use private system low-level routines or unavailable libraries).
I'm a bit naive when it comes to application development in C. I've been writing a lot of code for a programming language I'm working on and I want to include stuff from ICU (for internationalization and unicode support).
The problem is, I'm just not sure if there are any conventions for including a third party library. for something like readline where lots of systems are probably going to have it installed already, it's safe to just link to it (I think). But what about if I wanted to include a version of the library in my own code? Is this common or am I thinking about this all wrong?
If your code requires 3rd party libraries, you need to check for them before you build. On Linux, at least with open-source, the canonical way to do this is to use Autotools to write a configure script that looks for both the presence of libraries and how to use them. Thankfully this is pretty automated and there are tons of examples. Basically you write a configure.ac (and/or a Makefile.am) which are the source files for autoconf and automake respectively. They're transformed into configure and Makefile.in, and ./configure conditionally builds the Makefile with any configure-time options you specify.
Note that this is really only for Linux. I guess the canonical way to do it on Windows is with a project file for an IDE...
If it is a .lib and it has no runtime linked libraries it gets complied into you code. If you need to link to dynamic libraries you will have to assure they are there provide a installer or point the user to where they can obtain them.
If you are talking about shipping your software off to end users and are worried about dependencies - you have to provide them correct packages/installers that include the dependencies needed to run your software, or otherwise make sure the user can get them (subject to local laws, export laws, etc, etc, etc, but that's all about licensing).
You could build your software and statically link in ICU and whatever else you use, or you can ship your software and the ICU shared libraries.
It depends on the OS you're targeting. For Linux and Unix system, you will typically see dynamic linking, so the application will use the library that is already installed on the system. If you do this, that means it's up to the user to obtain the library if they don't already have it. Package managers in Linux will do this for you if you package your application in the distro's package format.
On Windows you typically see static linking, which means the application bundles the library and it will use that specific version. many different applications may use the same library but include their own version. So you can have many copies of the library floating around on your system.
The problem with shipping a copy of the library with your code is that you don't get the benefit of the library's maintainers' bug fixes for free. Obscure, small, and unsupported libraries are generally worth linking statically. Otherwise I'd just add the dependency and ensure that whatever packages you ship indicate it appropriately.