access a POSIX function using dlopen - c

POSIX 2008 introduces several file system functions that resolve a path relative to a directory descriptor (I'm speaking about the -at functions, such as openat, renameat, symlinkat, etc.). I doubt that all POSIX platforms support them (although the most recent versions seem to), and I'm looking for a way to determine whether a platform supports such functions. Of course one may use autoconf and friends for compile-time detection, but I'm looking for a way to find out dynamically, at run time, whether the implementation supports the -at functions.
The first thing that comes to mind is a dlopen()/dlsym()/dlclose() combo; at least I've successfully loaded the necessary symbols from the /usr/libc.so.6 shared library. However, libc may be (or is?) named differently on various platforms. Is there a list of standard locations to find libc? At least on Linux, /lib/libc.so appears to be not a symbolic link to the shared library but an ld script. Maybe there is some other way to check at run time whether a POSIX function is supported? Thanks in advance!

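#define _GNU_SOURCE 1 /* glibc exposes RTLD_DEFAULT only when this is defined */
#include <dlfcn.h>
#include <stdio.h>
int main(void)
{
    /* RTLD_DEFAULT is the magic: search the objects already loaded into the process. */
    void *funcaddr = dlsym(RTLD_DEFAULT, "symlinkat");
    printf("funcaddr = %p\n", funcaddr);
    return 0;
}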
Output:
funcaddr = 0x7fb62e44c2c0
Magic explanation: your program is already linked with libc, no need to load it again.
Note that this is actually a GNU libc feature, as hinted by _GNU_SOURCE. POSIX reserves RTLD_DEFAULT "for future use", and then proceeds to define it exactly like GNU libc does. So, strictly speaking, it is not guaranteed to work on all POSIX systems.
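If relying on RTLD_DEFAULT is a concern, a possible alternative is dlopen(NULL, ...), which POSIX defines as returning a handle for the running process image. The sketch below is only an illustration under that assumption; the helper name have_symbol is made up for this example, and on older glibc versions you may need to link with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` resolves in the running process
 * (the executable plus the libraries it was linked with), 0 otherwise. */
int have_symbol(const char *name)
{
    /* A NULL file name asks for a handle to the process's global symbols. */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return 0;

    int found = (dlsym(self, name) != NULL);

    dlclose(self);
    return found;
}
```

A caller could test have_symbol("symlinkat") once at startup and take a fallback path when it returns 0. To actually call the function, cast the pointer returned by dlsym to the matching function-pointer type.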

Related

Redirecting assert fail messages

We have a software project with real-time constraints, largely written in C++ but making use of a number of C libraries, running on a POSIX operating system. To satisfy the real-time constraints, we have moved almost all of our text logging off the stderr pipe and into shared-memory ring buffers.
The problem we have now is that when old code, or a C library, calls assert, the message ends up in stderr and not in our ring buffers with the rest of the logs. We'd like to find a way to redirect the output of assert.
There are three basic approaches here that I have considered:
1.) Make our own assert macro -- basically, don't use #include <cassert>, give our own definition for assert. This would work but it would be prohibitively difficult to patch all of the libraries that we are using that call assert to include a different header.
2.) Patch libc -- modify the libc implementation of __assert_fail. This would work, but it would be really awkward in practice because this would mean that we can't build libc without building our logging infra. We could make it so that at run-time, we can pass a function pointer to libc that is the "assert handler" -- that's something that we could consider. The question is if there is a simpler / less intrusive solution than this.
3.) Patch the libc header so that __assert_fail is marked with __attribute__((weak)). This means that we can override it at link time with a custom implementation, but if our custom implementation isn't linked in, then we link to the regular libc implementation. Actually, I was hoping that this function would already be marked with __attribute__((weak)), and I was surprised to find that apparently it isn't.
My main question is: What are the possible downsides of option (3) -- patching libc so that this line: https://github.com/lattera/glibc/blob/master/assert/assert.h#L67
extern void __assert_fail (const char *__assertion, const char *__file,
unsigned int __line, const char *__function)
__THROW __attribute__ ((__noreturn__));
is marked with __attribute__((weak)) as well ?
Is there a good reason I didn't think of that the maintainers didn't already do this?
How could any existing program that is currently linking and running successfully against libc break after I patch the header in this way? It can't happen, right?
Is there a significant run-time cost to using weak-linking symbols here for some reason? libc is already a shared library for us, and I would think the cost of dynamic linking should swamp any case analysis regarding weak vs. strong resolution that the system has to do at load time?
Is there a simpler / more elegant approach here that I didn't think of?
Some functions in glibc, particularly, strtod and malloc, are marked with a special gcc attribute __attribute__((weak)). This is a linker directive -- it tells gcc that these symbols should be marked as "weak symbols", which means that if two versions of the symbol are found at link time, the "strong" one is chosen over the weak one.
The motivation for this is described on wikipedia:
Use cases
Weak symbols can be used as a mechanism to provide default implementations of functions that can be replaced by more specialized (e.g. optimized) ones at link-time. The default implementation is then declared as weak, and, on certain targets, object files with strongly declared symbols are added to the linker command line.
If a library defines a symbol as weak, a program that links that library is free to provide a strong one for, say, customization purposes.
Another use case for weak symbols is the maintenance of binary backward compatibility.
However, in both glibc and musl libc, it appears to me that the __assert_fail function (to which the assert.h macro forwards) is not marked as a weak symbol.
https://github.com/lattera/glibc/blob/master/assert/assert.h
https://github.com/lattera/glibc/blob/master/assert/assert.c
https://github.com/cloudius-systems/musl/blob/master/include/assert.h
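To make the "weak default" idea described above concrete, here is a minimal sketch using the GCC/Clang attribute. The function name log_sink_name is hypothetical and exists only for this illustration; it is not part of glibc or musl.

```c
#include <stdio.h>
#include <string.h>

/* Weak default definition (hypothetical name). If another object file on the
 * link line provides a strong definition of log_sink_name(), the linker picks
 * that one; otherwise this default is used. */
__attribute__((weak)) const char *log_sink_name(void)
{
    return "stderr (weak default)";
}

/* Reports which implementation ended up in the final link. */
void print_log_sink(void)
{
    printf("logging to: %s\n", log_sink_name());
}
```

Built on its own, this translation unit uses the weak default; adding a second source file that defines log_sink_name() without the attribute would replace it with no change to this file.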
You don't need __attribute__((weak)) on the __assert_fail symbol from glibc. Just write your own implementation of __assert_fail in your program, and the linker should use your implementation, for example:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h> /* abort() */
/* Same signature as the glibc declaration; the definition in the program is found before the one in the shared libc. */
void __assert_fail(const char *assertion, const char *file, unsigned int line, const char *function)
{
    fprintf(stderr, "My custom message\n");
    abort();
}
int main(void)
{
    assert(0);
    printf("Hello World");
    return 0;
}
That's because when resolving symbols by the linker the __assert_fail symbol will already be defined by your program, so the linker shouldn't pick the symbol defined by libc.
If you really need __assert_fail to be defined as a weak symbol inside libc, why not just run objcopy --weaken-symbol=__assert_fail /lib/libc.so /lib/libc_with_weak_assert_fail.so? I don't think you need to rebuild libc from source for that.
If I were you, I would probably opt for opening a pipe(2) and using dup2(2) to make stderr (file descriptor 2) refer to the write end of that pipe. I'd service the read end of the pipe as part of the main poll(2) loop (or whatever the equivalent is in your system) and write the contents to the ring buffer.
This is obviously slower at handling actual output, but from your write-up such output is rare, so the impact ought to be negligible (especially if you already have a poll or select that this fd can piggyback on).
It seems to me that tweaking libc or relying on side-effects of the tools might break in the future and will be a pain to debug. I'd go for the guaranteed-safe mechanism and pay the performance price if at all possible.
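The pipe approach could look roughly like the sketch below. It assumes that dup2(2) is used to point file descriptor 2 at the pipe; the function names are hypothetical, and a plain char buffer stands in for the real ring buffer.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Point fd 2 (stderr) at the write end of a new pipe.
 * Returns the read end of the pipe, or -1 on failure. */
int redirect_stderr_to_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(stderr); /* do not lose anything already buffered */
    if (dup2(fds[1], STDERR_FILENO) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]); /* fd 2 now keeps the write end open */
    return fds[0];
}

/* Stand-in for "copy into the ring buffer": reads whatever is available
 * into buf and NUL-terminates it. Returns the number of bytes read. */
ssize_t drain_stderr_pipe(int read_fd, char *buf, size_t cap)
{
    ssize_t n = read(read_fd, buf, cap - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

In a real program the read end would be registered with the existing poll or select loop and drained only when it becomes readable, so that the read never blocks the main loop.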

<search.h> header file not available

I can't find the search.h header file mentioned in spell.c, and hence the compiler can't find hcreate(), hsearch() and ENTRY.
Ref:
http://marcelotoledo.com/how-to-write-a-spelling-corrector/
https://github.com/marcelotoledo/spelling_corrector/blob/master/spell.c
The <search.h> header is a POSIX-standard header; the library functions it declares include:
hash search (hsearch())
linear search (lsearch())
tree search (tsearch())
Those pages each list the set of relevant functions for a particular search. Note that binary search, aka bsearch(), is defined by the C standard rather than POSIX.
The functions were part of Unix SVR4 (and possibly other System V versions), and made it into the Single Unix Specification and hence POSIX too.
If your system doesn't support the header, then it isn't strictly POSIX compliant. You can certainly find implementations of the functions on the web (BSD, Linux, and probably other places too). You may be able to find a version to download for your system. (Macs have it already; I'd expect AIX, HP-UX and Solaris to include it by default, too.)
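For orientation, here is a small sketch of how the hash-search functions from <search.h> fit together. The word list and the function name count_for are invented for this example and are not taken from spell.c.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* ask for the X/Open (POSIX) declarations */
#endif
#include <search.h>
#include <stddef.h>

/* Illustrative data only. */
static char *words[]  = { "spelling", "corrector", "hash" };
static long  counts[] = { 3, 1, 2 };

/* Builds a small hash table, looks `word` up, and returns its count,
 * or -1 if the word is absent or the table cannot be created. */
long count_for(const char *word)
{
    long result = -1;

    if (hcreate(16) == 0) /* capacity hint; returns 0 on failure */
        return -1;

    for (size_t i = 0; i < sizeof words / sizeof words[0]; i++) {
        ENTRY item;
        item.key = words[i];
        item.data = &counts[i];
        hsearch(item, ENTER); /* insert unless the key already exists */
    }

    ENTRY query;
    query.key = (char *)word;
    query.data = NULL;
    ENTRY *hit = hsearch(query, FIND); /* NULL when the key is missing */
    if (hit != NULL)
        result = *(long *)hit->data;

    hdestroy();
    return result;
}
```

Note that hcreate manages a single process-wide table, which is why the sketch creates and destroys it within one call.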

Macros like _GNU_SOURCE, what do they mean?

Many times, while reading Linux header files or man pages, I see the following macros used.
Example: man mkstemp
This man page mentions the macros below.
_GNU_SOURCE
_BSD_SOURCE
_SVID_SOURCE
_XOPEN_SOURCE
_XOPEN_SOURCE_EXTENDED
What do I need to understand about them in order to write a correct program that uses these APIs/headers?
Read feature_test_macros(7) man page (and the §1.3.4 Feature Test Macros chapter of GNU libc documentation).
You might compile your whole program with some special feature symbols. For instance, I often compile a program with -D_GNU_SOURCE. This means that I want all the extra GNU specific features provided on my system by GNU libc etc. You could instead compile with -D_POSIX_C_SOURCE=200112L if you want strict POSIX 2001 compliance (and nothing more).
Alternatively, if all your .c files #include only your own header, that header could start with #define _GNU_SOURCE 1 followed by the system #include directives.
The point is that a GNU/Linux system conforms to several standards (with GNU providing its own extensions), and you can choose which ones you want.
GNU libc (which is the most common libc on Linux, although you could use some other libc, such as musl-libc) provides a lot of functions, features and headers that are not available on other systems, e.g. <argp.h> (header), fopencookie (function), and the %m format directive in printf.
It is also relevant if you intend to write a program that is portable to other POSIX systems (e.g. Mac OS X or AIX). For example, getopt_long is a GNU extension rather than a POSIX function, so it is not guaranteed to exist everywhere (although some non-GNU systems, including the BSDs and Mac OS X, also provide it).
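As an illustration of the per-file approach, the sketch below defines a feature test macro before any system header and then uses mkstemp, the function from the man page mentioned in the question. The function name make_temp_file and the /tmp path are assumptions made for this example.

```c
/* Must come before the first system #include to have any effect. */
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Creates and immediately removes a temporary file.
 * Returns 0 on success, -1 on failure. */
int make_temp_file(void)
{
    /* mkstemp rewrites the trailing XXXXXX, so the template must be a
     * writable array rather than a string literal. */
    char path[] = "/tmp/featuretest-XXXXXX";

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    unlink(path);
    close(fd);
    return 0;
}
```

Requesting POSIX.1-2008 this way hides GNU-only extensions in that file, which is a cheap way to notice accidental use of non-portable functions.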

What is GLIBC? What is it used for?

I was searching for the source code of the C standard libraries. What I mean with it is, for example, how are cos, abs, printf, scanf, fopen, and all the other standard C functions written, I mean to see their source code.
So while searching for this, I came across with GLIBC, but I don't know what it actually is. It is GNU C Library, and it contains some source codes, but what are they actually, are they the source code of the standard functions or are they something else? And what is it used for?
It's the implementation of the standard C library described in the C standards, plus some extra useful things that are not strictly standard but are frequently used.
Its main contents are:
1) The C library described in the ANSI C, C99 and C11 standards. It includes macros, symbols, function implementations, etc. (printf(), malloc(), etc.)
2) The POSIX standard library: the "userland" glue for system calls (open(), read(), etc.). Actually, glibc does not "implement" system calls; the kernel does. But glibc provides the userland interface to the services provided by the kernel, so that a user application can use a system call just like an ordinary function.
3) Also some nonstandard but useful stuff.
"use the force, read the source "
$ git clone git://sourceware.org/git/glibc.git
(I was recently pretty enlightened when I looked through malloc.c in glibc.)
There are several implementations of the standard. Glibc is the implementation that most Linuxes use, but there are others. Glibc also contains (as Aftnix states) the glue functions which set up the scene for jumps into the kernel (also known as system calls). So many of glibc's 'functions' don't do the actual work but only delegate to the kernel.
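A small Linux-specific sketch of the "glue" idea: the same kernel service reached once through the libc wrapper and once through the generic syscall() entry point. The function names are made up for this illustration, and syscall() with SYS_getpid is assumed to be available, as it is with glibc on Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1 /* for the syscall() declaration on glibc */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The usual way: call the libc wrapper function. */
pid_t pid_via_wrapper(void)
{
    return getpid();
}

/* The same request made directly by system call number. */
pid_t pid_via_raw_syscall(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both functions should report the same process ID; the wrapper exists so that application code does not need to know syscall numbers or the calling convention of a given architecture.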
To read the source of Glibc, just google for it. There are myriad sites which carry it, and also several variations.
Windows uses Microsoft's own implementation, which I believe is called MSVCR.DLL. I doubt that you will find the source code to that library anywhere. Also note that some functions which a Linux hacker might think of as 'standard', simply don't exist on Windows (notably fork). The reverse is also true.
Other systems will have their own libc.
The glibc package contains standard libraries which are used by multiple programs on the system. In order to save disk space and memory, as well as to make upgrading easier, common system code is kept in one place and shared between programs. This particular package contains the most important sets of shared libraries: the standard C library and the standard math library. Without these two libraries, a Linux system will not function. The glibc package also contains national language (locale) support.
Yes, it's the implementation of the standard library functions.
More specifically, it is the implementation used on all GNU systems and on almost all *NIX systems that use the Linux kernel.
Here are a few "hands-on" points of view:
it implements the POSIX C API on top of the Linux kernel: What is the meaning of "POSIX"?
it contains several assembly hand-optimized versions of ANSI C functions for several different architectures, e.g. strlen:
sysdeps/x86_64/strlen.S
sysdeps/aarch64/strlen.S
how to modify its source, recompile it and use it to understand it better: How to compile my own glibc C standard library from source and use it?
how to GDB step debug it with QEMU and Buildroot: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/9693c23fe6b2ae1409010a1a29ff0c1b7bd4b39e#gdbserver-libc

How to use Linux-specific APIs and libraries only on Linux builds with CMake?

I have a project that I run on Linux (primarily), but sometimes on Darwin/Mac OS X. I use CMake to generate Makefiles on Linux and an Xcode project on Mac OS X. So far, this has worked well.
Now I want to use some Linux-specific functions (clock_gettime() and related functions). I get linker errors on Mac OS X when I try to use clock_gettime(), so I assume it is only available on Linux. I am prepared to introduce conditionally-compiled code in the .c files to use clock_gettime() on Linux and plain old clock() on Mac OS. (BTW I was planning to use #include <unistd.h> and #if _POSIX_TIMERS > 0 as the preprocessor expression, unless someone has a better alternative.)
Things get tricky when it comes to the CMakeLists.txt file. What is the preferred way of introducing linkage to Linux-specific APIs only under the Linux build in a cross-platform CMake project?
Note: An earlier revision of this question contained references to glibc, which was overly specific and confusing. The question is really about the right way to use Linux-specific APIs and libraries in a cross-platform CMake project.
Abstracting away from your examples, and answering only this question:
How to use Linux-specific APIs and libraries only on Linux builds with CMake?
CMake provides numerous useful constants that you can check in order to determine which system you are running:
if (WIN32)
    # Windows-specific includes or actions
elseif (APPLE)
    # ... (checked before UNIX, because UNIX is also true on Apple platforms)
elseif (UNIX)
    # other *nix-specific includes or actions
endif ()
(I know you're asking about glibc, but you really want to know whether clock_gettime is present, right? But nothing in your question is Linux-specific...)
If you want to check for clock_gettime, you can use the preprocessor. If clock_gettime is present, then _POSIX_TIMERS will be defined. The clock_gettime function is part of an optional POSIX extension (see spec), so it is not Linux-specific but not universal either. Mac OS X does not have clock_gettime: it is not declared in any header nor defined in any library.
#include <time.h>
#include <unistd.h> /* for _POSIX_TIMERS definition, if present */
#if defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0
/* ...use clock_gettime()... */
#else
/* ...use something else... */
#endif
This doesn't solve the problem that you still have to link with -lrt on Linux (needed with glibc versions before 2.17). This is typically solved with something like AC_CHECK_LIB in Autoconf; I'm sure there's an equivalent in CMake.
From man 2 clock_gettime:
On POSIX systems on which these functions are available, the symbol _POSIX_TIMERS is defined in <unistd.h> to a value greater than 0. The symbols _POSIX_MONOTONIC_CLOCK, _POSIX_CPUTIME, _POSIX_THREAD_CPUTIME indicate that CLOCK_MONOTONIC, CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID are available. (See also sysconf(3).)
On Darwin you can use the mach_absolute_time function if you need a high-resolution monotonic clock. If you don't need the resolution or monotonicity, you should probably be using gettimeofday on both platforms.
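Putting the pieces above together, a conditional wrapper might look like the sketch below. The function name now_seconds is hypothetical, and the fallback assumes that gettimeofday is acceptable when clock_gettime is unavailable, as suggested above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* ask for the POSIX/X-Open declarations */
#endif
#include <stddef.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h> /* defines _POSIX_TIMERS where supported */

/* Returns wall-clock time in seconds as a double. Uses clock_gettime()
 * when the platform advertises it, gettimeofday() otherwise. */
double now_seconds(void)
{
#if defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) == 0)
        return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
#endif
    /* Fallback path, also reached if clock_gettime() fails at run time. */
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}
```

Keeping the #if inside one small function means the rest of the code base never needs to know which clock source was compiled in.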
There is also a built-in CMake module for checking whether a symbol exists: CheckSymbolExists.
