Suppose I have an ELF binary that's dynamic linked, and I want to override/redirect certain library calls. I know I can do this with LD_PRELOAD, but I want a solution that's permanent in the binary, independent of the environment, and that works for setuid/setgid binaries, none of which LD_PRELOAD can achieve.
What I'd like to do is add code from additional object files (possibly in new sections, if necessary) and add the symbols from these object files to the binary's symbol table so that the newly added version of the code gets used in place of the shared library code. I believe this should be possible without actually performing any relocations in the existing code; even though they're in the same file, these should be able to be resolved at runtime in the usual PLT way (for what it's worth I only care about functions, not data).
Please don't give me answers along the line of "You don't want to do this!" or "That's not portable!" What I'm working on is a way of interfacing binaries with slightly-ABI-incompatible alternate shared-library implementations. The platform in question is i386-linux (i.e. 32-bit) if it matters. Unless I'm mistaken about what's possible, I could write some tools to parse the ELF files and perform my hacks, but I suspect there's a fancy way to use the GNU linker and other tools to accomplish this without writing new code.
I suggest the elfsh et al. tools from the ERESI (alternate) project, if you want to instrument the ELF files themselves. Compatibility with i386-linux is not a problem, as I've used it myself for the same purpose.
The relevant how-tos are here.
You could handle some of the dynamic linking in your program itself. Read the man page for dlsym(3) in particular, and dlopen(3), dlerror(3), and dlclose(3) for the rest of the dynamic linking interface.
A simple example -- say I want to override dup2(2) from libc. I could use the following code (let's call it "dltest.c"):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
int (*prev_dup2)(int oldfd, int newfd);
int dup2(int oldfd, int newfd) {
printf("DUP2: %d --> %d\n", oldfd, newfd);
return prev_dup2(oldfd, newfd);
}
int main(void) {
int i;
prev_dup2 = dlsym(RTLD_NEXT, "dup2");
if (!prev_dup2) {
printf("dlsym failed to find 'dup2' function!\n");
return 1;
}
if (prev_dup2 == dup2) {
printf("dlsym found our own 'dup2' function!\n");
return 1;
}
i = dup2(1,3);
if (i == -1) {
perror("dup2() failed");
}
return 0;
}
Compile with:
gcc -o dltest dltest.c -ldl
The statically linked dup2() function overrides the dup2() from the library. This works even if the function is in another .c file (and is compiled as a separate .o).
If your overriding functions are themselves dynamically linked, you may want to use dlopen() rather than trusting the linker to get the libraries in the correct order.
EDIT: I suspect that if a different function within the overridden library calls an overridden function, the original function gets called rather than the override. I don't know what will happen if one dynamic library calls another.
I don't seem to be able to just add comment to this question, so posting it as an "answer". Sorry about it, doing that just to hopefully help other folks who search an answer.
So, I seem to have similar usecase, but I explicitly find any modification to existing binaries unacceptable (for me), so I'm looking for standalone proxy approach: Proxy shared library (sharedlib, shlib, so) for ELF?
Related
We have a software project with real-time constraints largely written in C++ but making use of a number of C libraries, running in a POSIX operating system. To satisfy real-time constraints, we have moved almost all of our text logging off of stderr pipe and into shared memory ring buffers.
The problem we have now is that when old code, or a C library, calls assert, the message ends up in stderr and not in our ring buffers with the rest of the logs. We'd like to find a way to redirect the output of assert.
There are three basic approaches here that I have considered:
1.) Make our own assert macro -- basically, don't use #include <cassert>, give our own definition for assert. This would work but it would be prohibitively difficult to patch all of the libraries that we are using that call assert to include a different header.
2.) Patch libc -- modify the libc implementation of __assert_fail. This would work, but it would be really awkward in practice because this would mean that we can't build libc without building our logging infra. We could make it so that at run-time, we can pass a function pointer to libc that is the "assert handler" -- that's something that we could consider. The question is if there is a simpler / less intrusive solution than this.
3.) Patch libc header so that __assert_fail is marked with __attribute__((weak)). This means that we can override it at link-time with a custom implementation, but if our custom implementation isn't linked in, then we link to the regular libc implementation. Actually I was hoping that this function already would be marked with __attribute__((weak)) and I was surprised to find that it isn't apparently.
My main question is: What are the possible downsides of option (3) -- patching libc so that this line: https://github.com/lattera/glibc/blob/master/assert/assert.h#L67
extern void __assert_fail (const char *__assertion, const char *__file,
unsigned int __line, const char *__function)
__THROW __attribute__ ((__noreturn__));
is marked with __attribute__((weak)) as well ?
Is there a good reason I didn't think of that the maintainers didn't already do this?
How could any existing program that is currently linking and running successfully against libc break after I patch the header in this way? It can't happen, right?
Is there a significant run-time cost to using weak-linking symbols here for some reason? libc is already a shared library for us, and I would think the cost of dynamic linking should swamp any case analysis regarding weak vs. strong resolution that the system has to do at load time?
Is there a simpler / more elegant approach here that I didn't think of?
Some functions in glibc, particularly, strtod and malloc, are marked with a special gcc attribute __attribute__((weak)). This is a linker directive -- it tells gcc that these symbols should be marked as "weak symbols", which means that if two versions of the symbol are found at link time, the "strong" one is chosen over the weak one.
The motivation for this is described on wikipedia:
Use cases
Weak symbols can be used as a mechanism to provide default implementations of functions that can be replaced by more specialized (e.g. optimized) ones at link-time. The default implementation is then declared as weak, and, on certain targets, object files with strongly declared symbols are added to the linker command line.
If a library defines a symbol as weak, a program that links that library is free to provide a strong one for, say, customization purposes.
Another use case for weak symbols is the maintenance of binary backward compatibility.
However, in both glibc and musl libc, it appears to me that the __assert_fail function (to which the assert.h macro forwards) is not marked as a weak symbol.
https://github.com/lattera/glibc/blob/master/assert/assert.h
https://github.com/lattera/glibc/blob/master/assert/assert.c
https://github.com/cloudius-systems/musl/blob/master/include/assert.h
You don't need attribute((weak)) on symbol __assert_fail from glibc. Just write your own implementation of __assert_fail in your program, and the linker should use your implementation, for example:
#include <stdio.h>
#include <assert.h>
void __assert_fail(const char * assertion, const char * file, unsigned int line, const char * function)
{
fprintf(stderr, "My custom message\n");
abort();
}
int main()
{
assert(0);
printf("Hello World");
return 0;
}
That's because when resolving symbols by the linker the __assert_fail symbol will already be defined by your program, so the linker shouldn't pick the symbol defined by libc.
If you really need __assert_fail to be defined as a weak symbol inside libc, why not just objcopy --weaken-symbol=__assert_fail /lib/libc.so /lib/libc_with_weak_assert_fail.so. I don't think you need to rebuild libc from sources for that.
If I were you, I would probably opt for opening a pipe(2) and fdopen(2)'ing stderr to take the write end of that pipe. I'd service the read end of the pipe as part of the main poll(2) loop (or whatever the equivalent is in your system) and write the contents to the ring buffer.
This is obviously slower to handle actual output, but from your write-up, such output is rare, so the impact ought to be negligable (especially if you already have a poll or select this fd can piggyback on).
It seems to me that tweaking libc or relying on side-effects of the tools might break in the future and will be a pain to debug. I'd go for the guaranteed-safe mechanism and pay the performance price if at all possible.
I have tested such a simple program below
/* a shared library */
dispatch_write_hello(void)
{
fprintf(stderr, "hello\n");
}
extern void
print_hello(void)
{
dispatch_write_hello();
}
My main program is like this:
extern void
dispatch_write_hello(void)
{
fprintf(stderr, "overridden\n");
}
int
main(int argc, char **argv)
{
print_hello();
return 0;
}
The result of the program is "overriden". To figure this out why this happens, I used gdb. The call chain is like this:
_dl_runtime_resolve -> _dl_fixup ->_dl_lookup_symbol_x
I found the definition of _dl_lookup_symbol_x in glibc is
Search loaded objects' symbol tables for a definition of the symbol UNDEF_NAME, perhaps with a requested version for the symbol
So I think when trying to find the symbol dispatch_write_hello, it first of all looks up in the main object file, and then in the shared library. This is the cause of this problem. Is my understanding right? Many thanks for your time.
Given that you mention _dl_runtime_resolve, I assume that you're on Linux system (thanks #Olaf for clarifying this).
A short answer to your question - yes, during symbols interposition dynamic linker will first look inside the executable and only then scan shared libraries. So definition of dispatch_write_hello will prevail.
EDIT
In case you wonder why runtime linker needs to resolve the call to dispatch_write_hello in print_hello to anything besides dispatch_write_hello in the same translation unit - this is caused by so called semantic interposition support in GCC. By default compiler treats any call inside library code (i.e. code compiled with -fPIC) as potentially interposable at runtime unless you specifically tell it not to, via -fvisibility-hidden, -Wl,-Bsymbolic, -fno-semantic-interposition or __attribute__((visibility("hidden"))). This has been discussed on the net many times e.g. in the infamous Sorry state of dynamic libraries on Linux.
As a side note, this feature incures significant performance penalty compared to other compilers (Clang, Visual Studio).
I was searching for asked question. i saw this link https://hev.cc/2512.html which is doing exactly the same thing which I want. But there is no explanation of whats going on. I am also confused whether shared library with out main() can be made executable if yes how? I can guess i have to give global main() but know no details. Any further easy reference and guidance is much appreciated
I am working on x86-64 64 bit Ubuntu with kernel 3.13
This is fundamentally not sensible.
A shared library generally has no task it performs that can be used as it's equivalent of a main() function. The primary goal is to allow separate management and implementation of common code operations, and on systems that operate that way to allow a single code file to be loaded and shared, thereby reducing memory overhead for application code that uses it.
An executable file is designed to have a single point of entry from which it performs all the operations related to completing a well defined task. Different OSes have different requirements for that entry point. A shared library normally has no similar underlying function.
So in order to (usefully) convert a shared library to an executable you must also define ( and generate code for ) a task which can be started from a single entry point.
The code you linked to is starting with the source code to the library and explicitly codes a main() which it invokes via the entry point function. If you did not have the source code for a library you could, in theory, hack a new file from a shared library ( in the absence of security features to prevent this in any given OS ), but it would be an odd thing to do.
But in practical terms you would not deploy code in this manner. Instead you would code a shared library as a shared library. If you wanted to perform some task you would code a separate executable that linked to that library and code. Trying to tie the two together defeats the purpose of writing the library and distorts the structure, implementation and maintenance of that library and the application. Keep the application and the library apart.
I don't see how this is useful for anything. You could always achieve the same functionality from having a main in a separate binary that links against that library. Making a single file that works as both is solidly in the realm of "silly computer tricks". There's no benefit I can see to having a main embedded in the library, even if it's a test harness or something.
There might possible be some performance reasons, like not having function calls go through the indirection of the PLT.
In that example, the shared library is also a valid ELF executable, because it has a quick-and-dirty entry-point that grabs the args for main from where the ABI says they go (i.e. copies them from the stack into registers). It also arranges for the ELF interpreter to be set correctly. It will only work on x86-64, because no definition is provided for init_args for other platforms.
I'm surprised it actually works; I thought all the crap the usual CRT (startup) code does was actually needed for stdio to work properly. It looks like it doesn't initialize extern char **environ;, since it only gets argc and argv from the stack, not envp.
Anyway, when run as an executable, it has everything needed to be a valid dynamically-linked executable: an entry-point which runs some code and exits, an interpreter, and a dependency on libc. (ELF shared libraries can depend on (i.e. link against) other ELF shared libraries, in the same way that executables can).
When used as a library, it just works as a normal library containing some function definitions. None of the stuff that lets it work as an executable (entry point and interpreter) is even looked at.
I'm not sure why you don't get an error for multiple definitions of main, since it isn't declared as a "weak" symbol. I guess shared-lib definitions are only looked for when there's a reference to an undefined symbol. So main() from call.c is used instead of main() from libtest.so because main already has a definition before the linker looks at libtest.
To create shared Dynamic Library with Example.
Suppose with there are three files are : sum.o mul.o and print.o
Shared library name " libmno.so "
cc -shared -o libmno.so sum.o mul.o print.o
and compile with
cc main.c ./libmno.so
I am trying to use create a project for LPC1769 on LPCXpresso. I have a C file calling
#include <string.h>
int main()
{
//some stuff
strnlen(SomeString, someInt);
}
to which I get an error:
Undefined reference to 'strnlen'
The weird part is that there is no problem with strcpy, strncpy or other common string functions.
I am building for a Cortex-M3 processor
Compiler used is: arm-none-eabi-gcc
In Eclipse, I have ticked the MCU linker option : No startup or default libs
I am running Eclipse on Ubuntu
While it may be easy enough to bypass this by just using strlen, I am actually facing a problem using a library which uses strnlen, and I don't want to mess with the library source.
The strnlen function was (until fairly recently) a Linux-specific function (some documentation such as the GNU libc manual still says that it is a "GNU extension"). The current manual page says it is part of POSIX.1-2008. Since you are cross-compiling, it is possible that the target machine's runtime library does not have this function. A forum posting from 2011 said just that.
I add the same problem and I found out that using -std=gnu++11 compiler flag solves it.
The following may work for you (since strnlen() is not a part of the runtime lib).
Define your own/local version of the strnlen().
int strnlen(char *param, int maxlen)
{
// Perform appropriate string manipulation ... as needed.
// Return what you need.
};
You want this include instead:
#include <string.h>
The difference between <> and "" is that <> searches for header files in your systems include folder. The "" searches for header files in the current directory and in any other include folders specified by -I directory
So I was looking through the linux glibc source and I don't see where it actually does anything. The following is from io/chdir.c but it is indicative of many of the source files. What's going on here? Obviously I am missing something. What's the secret, where does it make a system call or actually do something?
stub_warning is some legacy craziness. __set_errno seems to be a simple macro that sets errno. And while I find a million usages of weak_alias I don't see it defined anywhere.
Is there a helpful guide to understanding how glibc works somewhere?
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
/* Change the current directory to PATH. */
int
__chdir (path)
const char *path;
{
if (path == NULL)
{
__set_errno (EINVAL);
return -1;
}
__set_errno (ENOSYS);
return -1;
}
stub_warning (chdir)
weak_alias (__chdir, chdir)
#include <stub-tag.h>
What you've found is a stub function for systems it's not implemented on. You need to look under the sysdeps tree for the actual implementation. The following may be of interest:
sysdeps/unix/sysv/linux
sysdeps/posix
sysdeps/i386 (or x86_64 or whatever your cpu arch is)
The actual system call code for chdir() is auto-generated on most systems supported by glibc, by the script make-syscalls.sh. That's why you can't find it in the source tree.
That's a generic stub that is used if another definition doesn't exist; weak_alias is a cpp macro which tells the linker that __chdir should be used when chdir is requested, but only if no other definition is found. (See weak symbols for more details.)
chdir is actually a system call; there will be per-OS system call bindings in the gibc source tree, which will override the stub definition with a real one that calls into the kernel. This allows glibc to present a stable interface across systems which may not have all of the system calls that glibc knows about.
Note that the actual system calls aren't defined anywhere in the source tree - they're generated at build time from syscalls.list (linked is the one in sysdeps/unix, there are additional ones further down), a series of macros in sysdep.h (linked linux/i386), and a script that actually generates the source files.