Find a directory in shared library search path - c

I want to dlopen() every shared library in a specific directory. To do that,
what is the cleanest way to retrieve Linux's library search path? Or is there a quicker way to find a specific directory in that path?
A POSIX solution would be better.

POSIX does not support a mechanism to find out the directories on the shared library search path (it does not mandate LD_LIBRARY_PATH, for example), so any solution is inherently somewhat platform specific.
Linux presents some problems because the values to be used could be based on the contents of /etc/ld.so.conf as well as any runtime value in the LD_LIBRARY_PATH environment variable; other systems present comparable problems. The default locations also vary by system, with /lib and /usr/lib being usual for 32-bit Linux machines but /lib64 and /usr/lib64 being used on at least some 64-bit machines. Other platforms use other locations for 64-bit software. For example, Solaris uses /lib/sparcv9 and /usr/lib/sparcv9 (though the docs mention /lib/64 and /usr/lib/64, they're symlinks to the sparcv9 directories). Solaris also has the environment variables LD_LIBRARY_PATH_64 and LD_LIBRARY_PATH_32. HP-UX and AIX traditionally use variables other than LD_LIBRARY_PATH -- SHLIB_PATH and LIBPATH respectively, IIRC -- though I believe AIX now uses LD_LIBRARY_PATH too. On Solaris, the tool for configuring shared libraries is 'crle' (configure runtime linking environment) and the analog of /etc/ld.so.conf is either /var/ld/ld.config or /var/ld/64/ld.config. And, of course, the extensions on shared libraries vary (.so, .sl, .dylib, .bundle, etc).
So, your solution will be platform-specific. You will need to decide on the default locations, the environment variables to read, the configuration file to read, and the relevant file extension. Given those, it is mainly a SMOP (Simple Matter Of Programming):
For each directory named by any of the sources:
  1. Open the relevant sub-directory (opendir()).
  2. Read each file name (readdir()) in turn:
     - Use dlopen() on the path of the relevant files.
     - Do whatever analysis is relevant to you.
     - Use dlclose().
  3. Use closedir().
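The loop described above can be sketched in C roughly as follows. This is a minimal sketch, not a definitive implementation: the names scan_library_dir and has_suffix and the suffix parameter are made up for illustration, and matching on a plain ".so" suffix will miss versioned names such as libfoo.so.1. Keep in mind that dlopen() runs each library's initialisation code, so only point it at directories you trust.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <dlfcn.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if name ends with the given suffix (e.g. ".so"). */
int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/*
 * Try to dlopen() every file in dir whose name ends with suffix.
 * Returns the number of libraries that loaded, or -1 if the directory
 * could not be opened. Hypothetical helper, not a standard API.
 */
int scan_library_dir(const char *dir, const char *suffix)
{
    assert(dir != NULL && suffix != NULL);

    DIR *dp = opendir(dir);
    if (dp == NULL)
        return -1;

    int loaded = 0;
    struct dirent *entry;
    while ((entry = readdir(dp)) != NULL) {
        if (!has_suffix(entry->d_name, suffix))
            continue;

        char path[PATH_MAX];
        int len = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (len < 0 || (size_t)len >= sizeof path)
            continue;                   /* path too long, skip it */

        void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
        if (handle == NULL) {
            fprintf(stderr, "skipping %s: %s\n", path, dlerror());
            continue;
        }
        /* ... whatever analysis is relevant goes here ... */
        loaded++;
        dlclose(handle);
    }
    closedir(dp);
    return loaded;
}
```

A caller would invoke scan_library_dir() once per directory gathered from the platform's sources. On older glibc versions the program has to be linked with -ldl.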
See also the notes in the comment below; the whole topic is modestly fraught with variations from platform to platform.

I'm not sure it's possible to do that and be portable. Since this question is about Linux, portability may not be of paramount importance. Then I do not understand the POSIX constraint. Could you clarify?
You'll probably have to either implement the search functionality detailed in man 8 ld.so, which includes scanning /etc/ld.so.conf in addition to LD_LIBRARY_PATH, or make /lib/ld.so do what you want for you and parse the output. A not-exactly-pretty command line for that could be (the pattern accepts both open() and the openat() calls that newer glibc versions make):
export LD_PRELOAD=THISLIBRARYSODOESNOTEXIST
strace -s 4096 /bin/true 2>&1 | sed -n 's/^open[^"]*"\([^"]*\)\/THISLIBRARYSODOESNOTEXIST".*$/\1\/YOURSUBDIRHERE/gp'
unset LD_PRELOAD
You can then enumerate files with the POSIX calls opendir(3) and readdir(3).
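If you implement the search yourself instead, the first step is splitting the value of LD_LIBRARY_PATH into directories. The following is a minimal sketch of that one step only; split_search_path is a hypothetical helper name, and it deliberately skips empty entries, whereas ld.so treats a zero-length entry as the current working directory. Reading /etc/ld.so.conf and appending the default directories is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split a list such as the value of LD_LIBRARY_PATH (entries separated
 * by ':' or ';') into at most max_dirs newly allocated strings stored
 * in dirs[]. Empty entries are skipped (a simplification, see above).
 * Returns the number of entries stored; the caller frees each one.
 */
size_t split_search_path(const char *list, char **dirs, size_t max_dirs)
{
    assert(dirs != NULL);

    size_t count = 0;
    if (list == NULL)
        return 0;

    while (*list != '\0' && count < max_dirs) {
        size_t len = strcspn(list, ":;");
        if (len > 0) {
            char *copy = strndup(list, len);
            if (copy == NULL)
                break;          /* out of memory: return what we have */
            dirs[count++] = copy;
        }
        list += len;
        if (*list != '\0')
            list++;             /* step over the separator */
    }
    return count;
}
```

Typical use would be something like split_search_path(getenv("LD_LIBRARY_PATH"), dirs, 32), after which each returned directory can be checked for the sub-directory of interest.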

Related

Alternative to dir command to query a directory in C

I am trying to create a program where the user can add different path regexes so that a specific set of operations can be performed on the files that match them.
I tried using opendir() of the dirent.h header file but soon realized that it does not use the concept of regex.
The dir command I am trying to emulate is
dir [regex] /b
I need the output in a (char **) buffer. Piping the output could be a solution but I am looking for a more efficient way to do it.
Is there any predefined function in the standard (C90) library or will we have to create our own implementation?
C does not know about directories. They are operating system specific, usually provided by your OS kernel (look however inside GNU Hurd as an exception, and into unikernels). Read the C11 standard n1570 and forget, in 2020, about the obsolete C89 standard and TurboC. Consider trying some Linux distribution (such as Ubuntu or Debian or others). Most of them provide GCC or Clang (or the non-optimizing TinyCC compiler) and are very developer-friendly. My recommendation: use GCC as gcc -Wall -Wextra -g. Choose a good enough build automation tool (maybe GNU make) with an appropriate source-code editor (such as GNU emacs or vim or geany or others). Learn how to debug small programs and use the GDB debugger and the git version control tool.
POSIX does know about directories (it is an API specification written in English, also defining regex(3)). See here, and read the Linux man pages. And also the WinAPI.
On Linux, see mkdir(2), chdir(2), readdir(3), getcwd(3), unlink(2), stat(2), open(2), nftw(3), path_resolution(7) etc.; you may want to study the source code of a Linux kernel and of some common C library for it, such as GNU glibc or musl-libc. Budget several months of full-time effort for that. They are open source, so with some conditions you are allowed to study, improve and reuse their source code. See also http://linuxfromscratch.org/
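On a POSIX system, readdir(3) and regex(3) can be combined to get the matching names into a char ** buffer. What follows is a minimal sketch under that assumption, not a drop-in replacement for dir /b: list_matching is a hypothetical name, the pattern is a POSIX extended regular expression rather than a dir-style wildcard, and the names come back in whatever order readdir() reports them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Collect the entry names in dir that match the POSIX extended regular
 * expression pattern. On return *names is a malloc()ed array of *count
 * malloc()ed strings (NULL when nothing matched). Returns 0 on success
 * and -1 on error; names stored before an error remain valid, and the
 * caller frees each string and then the array.
 */
int list_matching(const char *dir, const char *pattern,
                  char ***names, size_t *count)
{
    assert(names != NULL && count != NULL);
    *names = NULL;
    *count = 0;

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;                      /* invalid pattern */

    DIR *dp = opendir(dir);
    if (dp == NULL) {
        regfree(&re);
        return -1;
    }

    int status = 0;
    struct dirent *entry;
    while ((entry = readdir(dp)) != NULL) {
        if (regexec(&re, entry->d_name, 0, NULL, 0) != 0)
            continue;                   /* name does not match */

        char **grown = realloc(*names, (*count + 1) * sizeof **names);
        if (grown != NULL)
            *names = grown;
        char *copy = grown != NULL ? strdup(entry->d_name) : NULL;
        if (copy == NULL) {
            status = -1;                /* out of memory */
            break;
        }
        (*names)[(*count)++] = copy;
    }

    closedir(dp);
    regfree(&re);
    return status;
}
```

For example, list_matching(".", "\\.txt$", &names, &n) would collect the names ending in .txt in the current directory.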
Notice also popen. You probably don't want to use it and would prefer using more primitive system calls (see syscalls(2) for their list on Linux). You could use a library like Glib (from GTK).
Remember that C programs (of the freestanding kind) could run on the bare metal (e.g. Arduino). In those cases, speaking of directories does not make any sense. See also osdev.org for more, and observe that the Linux kernel is written in C (with a tiny amount of assembler code).
GrassHopper was an OS written mostly in C without any files or directories. See also old discussions archived on tunes.org and tccboot.
Use the function findfirst to start querying a directory and then findnext to iterate. The functions find all files in one directory matching a given pattern, so you possibly need to append \*.* to your directory name to list all files in that directory.
Refer to the Turbo C documentation for details.

Resolve shared library path on Windows and *nix systems

When loading a shared library given its name, the system searches for the actual file (e.g. a .dll) in some directories, based on a search order, or in a cache.
How can I programmatically get the resolved path of DLL given its name, but without actually loading it? E.g. on Windows, for kernel32 or kernel32.dll it would probably return C:\windows\system32\kernel32.dll whereas given foo it could be C:\Program Files\my\app\foo.dll.
If that can't be done, is there another way to determine whether a certain library belongs to the system? E.g. user32.dll or libc.so.6 are system libraries but avcodec-55.dll or myhelperslib.so are not.
I'm interested in solutions that work on Windows, Linux and Mac OS.
On Windows, LoadLibraryEx has the LOAD_LIBRARY_AS_DATAFILE flag which opens the DLL without performing the operations you refer to as "actually loading it".
This can be combined with any of the search order flags (Yeah, there is more than just one search order).
Unfortunately, you cannot use GetModuleFileName. Use GetMappedFileName instead.
The LoadLibraryEx documentation also says specifically not to use the SearchPath function to locate DLLs and not to use the DONT_RESOLVE_DLL_REFERENCES flag mentioned in comments.
For Linux, there's an existing tool, ldd, for which source code is available. It does actually load the shared libraries, but with a special environment variable, LD_TRACE_LOADED_OBJECTS, set that by convention causes them to skip doing anything. Because this is just a convention, beware that malicious files can perform actions when loaded by ldd (see CVE-2009-5064).
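If loading the library is acceptable, glibc can also report where it found the file from inside a program. The sketch below is an assumption-laden illustration rather than the approach described above: it uses dlopen() together with the GNU extension dlinfo(RTLD_DI_LINKMAP), so it is specific to glibc-style systems, and it really does load the library (initialisation code runs), so it carries the same risk as ldd. resolve_library_path is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

/*
 * Load the library called name using the normal search rules and copy
 * the path the dynamic loader resolved it to into buf.
 * Returns 0 on success, -1 if the library could not be loaded or the
 * path does not fit in buf.
 */
int resolve_library_path(const char *name, char *buf, size_t size)
{
    assert(name != NULL && buf != NULL && size > 0);

    void *handle = dlopen(name, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    int status = -1;
    struct link_map *map = NULL;
    if (dlinfo(handle, RTLD_DI_LINKMAP, &map) == 0 && map != NULL) {
        int len = snprintf(buf, size, "%s", map->l_name);
        if (len >= 0 && (size_t)len < size)
            status = 0;
    }
    dlclose(handle);
    return status;
}
```

Whether the resolved path counts as a "system" location is then a policy decision for the caller, for instance by comparing its directory against the platform's default library directories.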

What is the point of using `-L` when there is `LD_LIBRARY_PATH`?

After reading this question, my first reaction was that the user is not seeing the error because he specifies the location of the library with -L.
However, apparently, the -L option only influences where the linker looks, and has no influence over where the loader looks when you try to run the compiled application.
My question then is what's the point of -L? Since you won't be able to run your binary unless you have the proper directories in LD_LIBRARY_PATH anyway, why not just put them there in the first place, and drop the -L, since the linker looks in LD_LIBRARY_PATH automatically?
It might be the case that you are cross-compiling and the linker is targeting a system other than your own. For instance, MinGW can be used to compile Windows binaries on Linux. Here -L will point to the DLLs needed for linking and LD_LIBRARY_PATH will point to any libraries the linker itself needs in order to run. This allows compiling and linking for different architectures, OS ABIs, or processor types.
It's also helpful when trying to build special targets. It might be the case that one links a static version of a program against a different static library. This is the first step in Linux From Scratch, where one creates a separate mini-environment on the main system to become a chroot jail.
Setting LD_LIBRARY_PATH will affect all the commands you run to build your code (including the compiler itself).
That's not desirable in general (e.g. you might not want your compiler to run debug/instrumented libraries while it compiles - it might even go as far as breaking your compiles).
Use -L to tell the compiler where to look, LD_LIBRARY_PATH to influence runtime linking.
Building the binary and running the binary are two completely independent and unrelated processes. You seem to suggest that the running environment should affect the building environment, i.e. you seem to be assuming that code built in some setup (account, machine) will later be run in the same setup. I find this assumption rather strange. I'd even say that in most cases the building and the running are done in different environments. I would actually prefer my compilers not to derive any assumptions about the future running environment from the environment they are invoked in. Looking at the LD_LIBRARY_PATH of the building environment would be a major no-no.
The other answers are all good, but one thing nobody has mentioned yet is static libraries. Most of the time when you use -L it's with a static library built locally in your build tree that you don't intend to install, and it has nothing to do with LD_LIBRARY_PATH.
Compilers on Solaris support the -R /runtime/path/to/some/libs that adds to the path where libraries are to be searched by the run-time linker. On Linux the same could be achieved with -Wl,-rpath,/runtime/path/to/some/libs. It passes the -rpath /runtime/path/to/some/libs option to ld. GNU ld also supports the -R /path/to/libs for compatibility with other ELF linkers but this should be avoided as -R is normally used to specify symbol files to GNU ld.

Why "/lib/libc.so.1" is mounted on solaris 10?

Why is /lib/libc.so.1 (linker/loader) always mounted on Solaris 10? I have checked both mount and df output, and both show a /lib/libc.so.1 entry.
For both SPARC and x86 architectures, Solaris provides optimized C standard libraries. At boot time, the one best suited to your machine, i.e. the one taking advantage of CPU-specific instructions and features, is lofs-mounted on top of the standard one.
Since Solaris 10, no static libc is provided so this dynamic libc, being the interface between the kernel and the userland, is a mandatory component of every program running on Solaris.
More details here.
One might ask why this is done with a lofs mount and not with a lightweight feature like a symlink.
The reason is that a symlink is persistent, i.e. it survives a reboot. Using a symlink might then render a system unusable should the hardware capabilities change or should the wrong library have been linked for some other reason. Again, all Solaris commands are dynamically linked to libc.so. There has not been a libc.a for a long time.
Using a lofs mount ensures that the first stages of system boot are done using the safe default libc.so and that the optimized one is only selected at the right time. In particular, it allows a safe boot with all services disabled (-m milestone=none) not to be affected by a capabilities change.
libc.so is required to run unix commands like ssh or awk that were written in C and use dynamic (runtime) linking. libc.so is a link to libc.so.1 which is the "base" version of the C library for the implementation of Solaris 10 you are running.
Solaris does not work exactly the way Linux does with versions of libc because there are different versions of the SPARC architecture. The lowest common denominator is sparc 1. I have an UltraSPARC III box and other more modern boxes.
Try the file command on libc.so.1: file /lib/libc.so.1. For the utilities and other code to get the most out of the box, the architecture ("sparc setting") of libc matches the box. Read about and try the isalist and isainfo commands.

histogram function in ansi C program: GSL and/or others?

If I just want to use the gsl_histogram.h library from the GNU Scientific Library (GSL), can I copy it from an existing machine (Mac OS Snow Leopard) that has GSL installed to a different machine (Linux CentOS 5.7) that doesn't have GSL installed, and just use an #include <gsl_histogram.h> statement in my C program? Would this work?
Or, do I have to go through the full install of GSL on the Linux box, even though I only need this one library?
Just copying the header gsl_histogram.h is not enough. A header merely states the interface that is exposed by the library. You would also need to copy binaries such as the *.so and *.a files, but it's hard to tell which ones to copy. So I think you'd better just install it on your machine. It's pretty easy, just use this tutorial to find and install the GSL package.
There are surely a lot of libraries out there. One in particular is Gnuplot. Using it you do not even need to compile any code, though you do need to read a bit of documentation. Luckily there is already a question about how to draw a histogram with Gnuplot on Stack Overflow: Histogram using gnuplot? It is worth noting that Gnuplot is actually a very powerful tool, so time invested in reading its documentation will certainly pay off.
You cannot copy libraries from one OS to another and expect them to work unchanged.
OS X uses the Mach-O object file format while modern Linux systems use the ELF object file format. The usual ld.so(8) linker/loader will not know how to load the Mach-O format object files for your executable to execute. So you would need the Apple-provided ld.so(8) -- or whatever they call their loader. (It's been a while.)
Furthermore, the object files from OS X will be linked against the Apple-supplied libc, and require the corresponding symbols from the Apple-supplied library. You would also need to provide the Apple-provided libc on the Linux system. This C library would try to make system calls using the OS X system call numbers and calling conventions. I guarantee the system call numbers have changed and almost certainly calling conventions are different.
While the Linux kernel's binfmt_misc generic object loader can be used to teach the kernel how to load different object file formats, and the kernel's personality(2) system call can be used to select between different calling conventions, system call numbers, and so on, the amount of work required to make this work is nothing short of immense: the WINE Project has been working on exactly this issue (but with the Windows format COFF and supporting libraries) since 1993.
It would be easier to run:
apt-get install libgsl0-dev
or whatever the equivalent is on your distribution of choice (the package is called libgsl-dev on current Debian and Ubuntu releases, and gsl-devel on CentOS, installed with yum). If your distribution does not make it easily available, it would still be easier to compile and install the library by hand rather than try to make the OS X version work.
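If all you need is simple fixed-width binning and installing GSL really is not an option, the counting itself is small enough to write by hand. The following is a minimal sketch of that alternative; histogram_add is a made-up name and this is not GSL's API or implementation, just an illustration of equal-width bins over a half-open range [min, max).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Add one sample to a histogram of nbins equal-width bins covering
 * [min, max). Returns the index of the bin that was incremented, or
 * -1 if x lies outside the range (in which case nothing is counted).
 */
int histogram_add(unsigned long *bins, size_t nbins,
                  double min, double max, double x)
{
    assert(bins != NULL && nbins > 0 && min < max);

    if (!(x >= min && x < max))
        return -1;                      /* out of range, or NaN */

    size_t i = (size_t)((x - min) / (max - min) * (double)nbins);
    if (i >= nbins)                     /* guard against rounding up */
        i = nbins - 1;
    bins[i]++;
    return (int)i;
}
```

The caller owns the bins array and zero-initialises it, for example unsigned long bins[10] = {0}; and then calls histogram_add() once per sample.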

Resources