Override file access functions with statically linked musl (to implement a read-only virtual FS) - c

In a dynamically linked setup where dlsym is available, I can get pointers to the original implementations with dlsym(RTLD_NEXT, ...) and call them from my overrides, for example:
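// paste these in main.c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>
int open(const char *path, int flags, ...)
{
mode_t mode = 0;
if (flags & O_CREAT) { // open() only takes a mode argument when creating a file
va_list ap;
va_start(ap, flags);
mode = va_arg(ap, mode_t);
va_end(ap);
}
fprintf(stderr, "log_file_access_preload: open(\"%s\", %d)\n", path, flags);
typedef int (*orig_open_func_type)(const char *pathname, int flags, ...);
orig_open_func_type orig_func = (orig_open_func_type)dlsym(RTLD_NEXT, "open");
return orig_func(path, flags, mode);
}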
FILE* fopen(const char *path, const char *mode)
{
fprintf(stderr, "log_file_access_preload: fopen(\"%s\", \"%s\")\n", path, mode);
typedef FILE* (*orig_fopen_func_type)(const char *path, const char *mode);
orig_fopen_func_type orig_func = (orig_fopen_func_type)dlsym(RTLD_NEXT, "fopen");
return orig_func(path, mode);
}
Is there a way to link statically without hiding the original libc/POSIX symbols, so that I can still call them from my overrides? Should I create my own copy of the musl *.a files with the original symbols renamed? Would that work? Is there another way?
Use case: redirect the file read/access functions of a custom LaTeX program (I control the compilation process; it is statically built with musl) so that it reads files from an ISO or TAR archive (containing a prepared TeX Directory Structure) without extracting it to disk.
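Independently of how the symbols end up being overridden, the read-only part of such a virtual FS can be sketched with fmemopen, which turns a memory buffer into a FILE*. The following is only an illustration under assumptions: vfs_entry, vfs_table, vfs_find and vfs_fopen are hypothetical names, the table would really be filled from the TAR or ISO index, and the entry point is deliberately not called fopen so that the real fopen stays reachable. It sidesteps the symbol-hiding question rather than answering it.

```c
#define _POSIX_C_SOURCE 200809L   /* for fmemopen */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory archive entry; a real table would be built
   from the TAR or ISO index. */
struct vfs_entry {
    const char *path;
    const char *data;
    size_t size;
};

static const char hello_tex[] = "\\documentclass{article}\n";

static const struct vfs_entry vfs_table[] = {
    { "/texmf/hello.tex", hello_tex, sizeof hello_tex - 1 },
};

/* Return the entry for path, or NULL if the archive does not contain it. */
static const struct vfs_entry *vfs_find(const char *path)
{
    for (size_t i = 0; i < sizeof vfs_table / sizeof vfs_table[0]; i++)
        if (strcmp(vfs_table[i].path, path) == 0)
            return &vfs_table[i];
    return NULL;
}

/* fopen-like entry point: archive files are served read-only from memory,
   everything else falls through to the real fopen. */
FILE *vfs_fopen(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    const struct vfs_entry *e = vfs_find(path);
    if (e == NULL)
        return fopen(path, mode);

    /* Anything other than plain read access is refused. */
    if (mode[0] != 'r' || strchr(mode, '+') != NULL) {
        errno = EROFS;
        return NULL;
    }
    /* fmemopen takes void *, but in "r" mode it never writes to the buffer. */
    return fmemopen((void *)e->data, e->size, "r");
}
```

If the program's own sources can be recompiled, calls to fopen could be routed to vfs_fopen at that level; whether that is acceptable depends on how much of the TeX code base goes through fopen directly.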

Related

Looking for ways to 'mock' posix functions in C/C++ code

I am trying to find somewhat elegant ways to mock and stub function calls to the standard C library functions.
While stubbing-off calls to C files of the project is easy by just linking other C files in the tests, stubbing the standard C functions is harder.
They are just there when linking.
Currently, my approach is to include the code-under-test from my test.cpp file and place defines like this:
#include <stdio.h>
#include <gtest/gtest.h>
#include "mymocks.h"
CMockFile MockFile;
#define open MockFile.open
#define close MockFile.close
#define read MockFile.read
#include "CodeUnderTestClass.cpp"
#undef open
#undef close
#undef read
// test-class here
This is cumbersome, and sometimes I run across code that uses 'open' as member names elsewhere or causes other collisions and issues with it. There are also cases of the code needing different defines and includes than the test-code.
So are there alternatives? Some link-time tricks or runtime tricks to override standard C functions? I thought about run-time hooking the functions but that might go too far as usually binary code is loaded read-only.
My unit-tests run only on Debian-Linux with gcc on amd64. So gcc, x64 or Linux specific tricks are also welcome.
I know that rewriting all the code-under-test to use an abstracted version of the C functions is an option, but that hint is not very useful for me.
Use library preloading to substitute system libraries with your own.
Consider the following test program code, mytest.c:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(void) {
char buf[256];
int fd = open("file", O_RDONLY);
if (fd >= 0) {
printf("fd == %d\n", fd);
int r = read(fd, buf, sizeof(buf));
write(0, buf, r);
close(fd);
} else {
printf("can't open file\n");
}
return 0;
}
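It opens a file called file in the current directory, prints its descriptor number (usually 3), reads its content and writes it to descriptor 0. Strictly speaking, descriptor 0 is standard input and standard output is descriptor 1, but on an interactive terminal both refer to the same device, so the text still appears.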
Now here is your test library code, mock.c:
#include <string.h>
#include <unistd.h>
int open(const char *pathname, int flags) {
return 100;
}
int close(int fd) {
return 0;
}
ssize_t read(int fd, void *buf, size_t count) {
strcpy(buf, "TEST!\n");
return 6; /* bytes in "TEST!\n", not counting the terminating NUL */
}
Compile it to a shared library called mock.so:
$ gcc -shared -fpic -o mock.so mock.c
If you compiled mytest.c to the mytest binary, run it with the following command:
$ LD_PRELOAD=./mock.so ./mytest
You should see the output:
fd == 100
TEST!
Functions defined in mock.c were preloaded and used as a first match during the dynamic linking process, hence executing your code, and not the code from the system libraries.
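The fake read above ignores its count argument. A slightly more careful body for such a mock could look like the sketch below; mock_read_canned is a hypothetical helper name, and the canned string is the one from the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical body for a mocked read(): copies canned data, never writes
   more than count bytes, and does not count the terminating NUL. */
ssize_t mock_read_canned(void *buf, size_t count)
{
    static const char canned[] = "TEST!\n";
    size_t len = sizeof canned - 1;

    assert(buf != NULL || count == 0);
    if (count < len)
        len = count;          /* short read, as the real read() may do */
    if (len > 0)
        memcpy(buf, canned, len);
    return (ssize_t)len;
}
```

A mocked read(int fd, void *buf, size_t count) could then simply return mock_read_canned(buf, count).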
Update:
If you want to use the "original" functions, you have to extract them "by hand" from the proper shared library, using the dlopen, dlsym and dlclose functions. To avoid cluttering the previous example, here is a new one: the same mock.c as before plus the dynamic symbol loading:
#include <stdio.h>
#include <dlfcn.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <gnu/lib-names.h>
// this declares this function to run before main()
static void startup(void) __attribute__ ((constructor));
// this declares this function to run after main()
static void cleanup(void) __attribute__ ((destructor));
static void *sDlHandler = NULL;
ssize_t (*real_write)(int fd, const void *buf, size_t count) = NULL;
void startup(void) {
char *vError;
sDlHandler = dlopen(LIBC_SO, RTLD_LAZY);
if (sDlHandler == NULL) {
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
real_write = (ssize_t (*)(int, const void *, size_t))dlsym(sDlHandler, "write");
vError = dlerror();
if (vError != NULL) {
fprintf(stderr, "%s\n", vError);
exit(EXIT_FAILURE);
}
}
void cleanup(void) {
dlclose(sDlHandler);
}
int open(const char *pathname, int flags) {
return 100;
}
int close(int fd) {
return 0;
}
ssize_t read(int fd, void *buf, size_t count) {
strcpy(buf, "TEST!\n");
return 6; /* bytes in "TEST!\n", not counting the terminating NUL */
}
ssize_t write(int fd, const void *buf, size_t count) {
if (fd == 0) {
real_write(fd, "mock: ", 6);
}
/* pass the real result (or error) back to the caller */
return real_write(fd, buf, count);
}
Compile it with:
$ gcc -shared -fpic -o mock.so mock.c -ldl
Note the -ldl at the end of the command.
The startup function runs before main (so you don't need to put any initialization code in your original program) and sets real_write to the original write function. The cleanup function runs after main, so you don't need to add any cleanup code at the end of main either.
Everything else works exactly as in the previous example, except for the newly implemented write function. For every descriptor other than 0 it behaves like the original; for file descriptor 0 it writes some extra data before the original content. In that case the output of the program will be:
$ LD_PRELOAD=./mock.so ./mytest
fd == 100
mock: TEST!
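The constructor mechanism used above can be tried in isolation. In this sketch the function pointer is filled with the address of write directly, as a stand-in for the dlsym lookup; real_write_ptr, hooks_ready and call_real_write are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Pointer to the "original" function; resolved once at startup. */
static ssize_t (*real_write_ptr)(int, const void *, size_t) = NULL;
static int startup_ran = 0;

/* GCC/Clang run constructor functions before main() is entered. */
__attribute__((constructor))
static void startup(void)
{
    real_write_ptr = write;   /* stand-in for a dlsym() lookup */
    startup_ran = 1;
}

/* Returns 1 once the constructor has filled in the pointer. */
int hooks_ready(void)
{
    return startup_ran && real_write_ptr != NULL;
}

/* Forward to the saved pointer; valid any time after startup. */
ssize_t call_real_write(int fd, const void *buf, size_t count)
{
    assert(real_write_ptr != NULL);
    return real_write_ptr(fd, buf, count);
}
```

By the time main starts, hooks_ready() already reports 1, which is why the original program needs no explicit initialization call.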

How to use statx syscall?

Ubuntu 18.04
I'm trying to use statx syscall introduced in the Linux Kernel 4.11. There is a manual entry:
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h> /* Definition of AT_* constants */
int statx(int dirfd, const char *pathname, int flags,
unsigned int mask, struct statx *statxbuf);
So I tried to write an example by myself:
const char *dir_path = NULL;
const char *file_path = NULL;
//read from command line arguments
int dir_fd = open(dir_path, O_DIRECTORY);
struct statx st; //<--------------------------- compile error
statx(dir_fd, file_path, 0, STATX_BASIC_STATS, &st);
But it simply does not compile. The error is that the size of struct statx is unknown. In fact the struct is not defined in sys/stat.h but in linux/stat.h, which sys/stat.h does not include. After including linux/stat.h, the remaining problem is that there is no declaration for
int statx(int dirfd, const char *pathname, int flags,
unsigned int mask, struct statx *statxbuf);
I expected that since
$ uname -r
4.15.0-39-generic
and 4.15.0-39-generic is newer than 4.11, I could use it.
What's wrong?
At the time of writing, glibc does not provide a wrapper for the statx call, so you have to use your kernel's definitions. Either copy the struct statx definition from your kernel headers or use it directly from the API the Linux kernel provides. struct statx is currently defined in linux/stat.h.
The Linux kernel source tree also provides an example call to statx.
Update: library support was added in glibc 2.28.
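Calling the system call directly with the kernel's definitions, as the answer suggests, might look like the following sketch. It assumes the installed kernel headers provide struct statx in linux/stat.h and SYS_statx in sys/syscall.h; statx_file_size is a hypothetical helper name, not part of any library. sys/stat.h is deliberately not included here, so that only one definition of struct statx is in play.

```c
#define _DEFAULT_SOURCE   /* for syscall() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <linux/stat.h>   /* struct statx, STATX_SIZE */
#include <string.h>
#include <sys/syscall.h>  /* SYS_statx */
#include <unistd.h>       /* syscall() */

/* Hypothetical helper: store the size of path in *size_out.
   Returns 0 on success, -1 with errno set on failure. */
int statx_file_size(const char *path, unsigned long long *size_out)
{
    struct statx stx;

    assert(path != NULL && size_out != NULL);
    memset(&stx, 0, sizeof stx);

    /* Arguments: dirfd, pathname, flags, mask, buffer. */
    if (syscall(SYS_statx, AT_FDCWD, path, 0, STATX_SIZE, &stx) != 0)
        return -1;

    /* The kernel reports which fields it actually filled in. */
    if (!(stx.stx_mask & STATX_SIZE)) {
        errno = ENOTSUP;
        return -1;
    }
    *size_out = stx.stx_size;
    return 0;
}
```

On a kernel older than 4.11, or in a sandbox that filters the call, the helper fails with errno set by the kernel (for example ENOSYS), so callers should be prepared to fall back to stat.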

Overriding getdirentries in C

I would like to override libc syscall wrappers such as getdirentries and lstat.
I can override, for example, lstat and chmod, but I can't override getdirentries (or, among others, fstatfs).
Example code is:
#include <errno.h>
#include <dlfcn.h>
#include <stdio.h>
#include <strings.h>
#include <string.h>
#include <sys/_timespec.h>
#include <sys/stat.h>
#include <sys/mount.h>
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1l)
#endif
int (*getdirentries_orig)(int fd, char *buf, int nbytes, long *basep);
int (*lstat_orig)(const char *path, struct stat *sb);
int (*fstatfs_orig)(int fd, struct statfs *buf);
int (*chmod_orig)(const char *path, mode_t mode);
#define HOOK(func) func##_##orig = dlsym(RTLD_NEXT,#func)
int getdirentries(int fd, char *buf, int nbytes, long *basep) {
HOOK(getdirentries);
printf("getdirentries\n");
return getdirentries_orig(fd, buf, nbytes, basep);
}
int lstat(const char *path, struct stat *sb) {
HOOK(lstat);
printf("lstat\n");
return (lstat_orig(path, sb));
}
int fstatfs(int fd, struct statfs *buf) {
HOOK(fstatfs);
printf("fstatfs\n");
return fstatfs_orig(fd, buf);
}
int chmod(const char *path, mode_t mode) {
HOOK(chmod);
printf("chmod\n");
return chmod_orig(path, mode);
}
I compile this on FreeBSD with:
cc -Wall -g -O2 -fPIC -shared -o preload.so preload.c
(on Linux, adding -ldl may be needed)
and use it with LD_PRELOAD=./preload.so bash.
If I then issue an ls -l, I get "lstat" printed multiple times, that's good.
But ls calls multiple getdirentries too, according to ktrace, and its override function does not get called. fstatfs also doesn't work.
How can I override getdirentries, fstatfs and possibly other syscalls, and why doesn't it work in this case?
Thanks,
As it turns out, readdir() in libc/readdir.c (readdir is what ls calls and that should call getdirentries) calls _getdirentries, not getdirentries. If I override _getdirentries, it works. The same for fstatfs, so this is why my program did not work.

How to detect file activities of exec'ed child process on Linux in C?

I have my program, which exec's another process (not mine, consider it a blackbox). Is there a way to detect operations, like open() and close(), for this child process?
Especially I'm interested in finding all newly created files, or existing files, that are opened with intention to be created (O_CREAT flag for open()).
The working approach is to redefine open() in my own shared library and preload it into the exec()'ed process via the LD_PRELOAD environment variable. Thanks to @alk for the approach.
The code for redefined open() looks like:
#include <fcntl.h>
#include <dlfcn.h>
#include <stdarg.h>
#include <sys/types.h>
extern "C" {
int open(const char *pathname, int flags, ...) {
bool has_mode = false;
mode_t mode = 0;
if (flags & O_CREAT) {
va_list ap;
va_start(ap, flags);
mode = va_arg(ap, mode_t);
has_mode = true;
va_end(ap);
}
using Fn = int (*)(const char * pathname, int flags, ...);
Fn new_open = reinterpret_cast<Fn>(dlsym(RTLD_NEXT, "open"));
// Do something useful.
if (has_mode) {
return new_open(pathname, flags, mode);
} else {
return new_open(pathname, flags);
}
}
} // extern "C"
The only problem is with fcntl.h: it may contain an unusual declaration of open() that conflicts with yours, yet you need this header to get the definition of O_CREAT. Another way is to include the file with the definition directly; in my case that is asm-generic/fcntl.h.

Intercepting stat()

I have successfully intercepted calls to read(), write(), open(), unlink(), rename() and creat(), but with exactly the same approach stat() is not intercepted. I change the execution environment using LD_PRELOAD.
Am I missing something?
The code is quite huge, which part of it will be most helpful to post so you can help?
Thanks.
Edit: I kept the interposed stat() wrapper simple to check if it works.
int stat(const char *path,struct stat *buff)
{
printf("client invoke: stat %s",path);
return 1;
}
Compile a function that calls stat(); see what reference(s) are generated (nm -g stat.o). Then you'll have a better idea of which function(s) to interpose. Hint: it probably isn't called stat().
If you are compiling with 64 bit file offsets, then stat() is either a macro or a redirected function declaration that resolves to stat64(), so you will have to interpose on that function too.
On Linux it turned out not to be that simple, because GNU libc does some tricks. You need to intercept __xstat (and __xstat64), and if you want to call the original, look it up once and save the pointer.
Here is how I got it to work:
gcc -fPIC -shared -o stat.so stat.c -ldl
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
static int (*old_xstat)(int ver, const char *path, struct stat *buf) = NULL;
static int (*old_xstat64)(int ver, const char *path, struct stat64 *buf) = NULL;
int __xstat(int ver, const char *path, struct stat *buf)
{
if ( old_xstat == NULL ) {
old_xstat = dlsym(RTLD_NEXT, "__xstat");
}
printf("xstat %s\n",path);
return old_xstat(ver,path, buf);
}
int __xstat64(int ver, const char *path, struct stat64 *buf)
{
if ( old_xstat64 == NULL ) {
old_xstat64 = dlsym(RTLD_NEXT, "__xstat64");
}
printf("xstat64 %s\n",path);
return old_xstat64(ver,path, buf);
}

Resources