Intercepting stat() - c

I have successfuly intercepted calls to read(),write(),open(),unlink(),rename(), creat() but somehow with exactly the same semantics intercepting stat() is not taking place. I have changed the execution environmnet using LD_PRELOAD.
Am I missing something?
The code is quite huge, which part of it will be most helpful to post so you can help?
Thanks.
Edit: I kept the interposed stat() wrapper simple to check if it works.
int stat(const char *path,struct stat *buff)
{
printf("client invoke: stat %s",path);
return 1;
}

Compile a function that calls stat(); see what reference(s) are generated (nm -g stat.o). Then you'll have a better idea of which function(s) to interpose. Hint: it probably isn't called stat().

If you are compiling with 64 bit file offsets, then stat() is either a macro or a redirected function declaration that resolves to stat64(), so you will have to interpose on that function too.

Well it was not very simple when running in linux. Gnu libc does some tricks. You need to intercept the __xstat and if you want to call the original save the call.
Here is how I got it to work
gcc -fPIC -shared -o stat.so stat.c -ldl
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
static int (*old_xstat)(int ver, const char *path, struct stat *buf) = NULL;
static int (*old_xstat64)(int ver, const char *path, struct stat64 *buf) = NULL;
int __xstat(int ver, const char *path, struct stat *buf)
{
if ( old_xstat == NULL ) {
old_xstat = dlsym(RTLD_NEXT, "__xstat");
}
printf("xstat %s\n",path);
return old_xstat(ver,path, buf);
}
int __xstat64(int ver, const char *path, struct stat64 *buf)
{
if ( old_xstat64 == NULL ) {
old_xstat64 = dlsym(RTLD_NEXT, "__xstat64");
}
printf("xstat64 %s\n",path);
return old_xstat64(ver,path, buf);
}

Related

Override file access functions with statically linked musl (to implement a read-only virtual FS)

If dlsym is available in dynamic linking setup, I can get access to the original impl pointers using dlsym with RTLD_NEXT and use them in my overrides, e.g. as follows:
// paste these in main.c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <dlfcn.h>
int open(const char *path, int flags)
{
fprintf(stderr, "log_file_access_preload: open(\"%s\", %d)\n", path, flags);
typedef int (*orig_open_func_type)(const char *pathname, int flags);
orig_open_func_type orig_func = (orig_open_func_type)dlsym(RTLD_NEXT, "open");
return orig_func(path, flags);
}
FILE* fopen(const char *path, const char *mode)
{
fprintf(stderr, "log_file_access_preload: fopen(\"%s\", \"%s\")\n", path, mode);
typedef FILE* (*orig_fopen_func_type)(const char *path, const char *mode);
orig_fopen_func_type orig_func = (orig_fopen_func_type)dlsym(RTLD_NEXT, "fopen");
return orig_func(path, mode);
}
Is there a way to do static linking in such a way that doesn't hide the original libc/POSIX symbols and so that I can use them in my overrides? Should I create my own copy of musl *.a files with renamed original symbols? Should it work? Is there another way?
Usecase: implement redirection of file read/access functions for a custom LaTeX program (compilation process is controlled by me, statically built with musl) to read files from ISO or TAR archive (that contains a prepared TeX Directory Structure) without extraction to disk

Looking for ways to 'mock' posix functions in C/C++ code

I am trying to find somewhat elegant ways to mock and stub function calls to the standard C library functions.
While stubbing-off calls to C files of the project is easy by just linking other C files in the tests, stubbing the standard C functions is harder.
They are just there when linking.
Currently, my approach is to include the code-under-test from my test.cpp file, and placing defines like this:
#include <stdio.h>
#include <gtest/gtest.h>
#include "mymocks.h"
CMockFile MockFile;
#define open MockFile.open
#define close MockFile.close
#define read MockFile.read
#include "CodeUnderTestClass.cpp"
#undef open
#undef close
#undef read
// test-class here
This is cumbersome, and sometimes I run across code that uses 'open' as member names elsewhere or causes other collisions and issues with it. There are also cases of the code needing different defines and includes than the test-code.
So are there alternatives? Some link-time tricks or runtime tricks to override standard C functions? I thought about run-time hooking the functions but that might go too far as usually binary code is loaded read-only.
My unit-tests run only on Debian-Linux with gcc on amd64. So gcc, x64 or Linux specific tricks are also welcome.
I know that rewriting all the code-under-test to use an abstracted version of the C functions is an option, but that hint is not very useful for me.
Use library preloading to substitute system libraries with your own.
Consider following test program code, mytest.c:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(void) {
char buf[256];
int fd = open("file", O_RDONLY);
if (fd >= 0) {
printf("fd == %d\n", fd);
int r = read(fd, buf, sizeof(buf));
write(0, buf, r);
close(fd);
} else {
printf("can't open file\n");
}
return 0;
}
It will open a file called file from the current directory, print it's descriptor number (usually 3), read its content and then print it on the standard output (descriptor 0).
Now here is your test library code, mock.c:
#include <string.h>
#include <unistd.h>
int open(const char *pathname, int flags) {
return 100;
}
int close(int fd) {
return 0;
}
ssize_t read(int fd, void *buf, size_t count) {
strcpy(buf, "TEST!\n");
return 7;
}
Compile it to a shared library called mock.so:
$ gcc -shared -fpic -o mock.so mock.c
If you compiled mytest.c to the mytest binary, run it with following command:
$ LD_PRELOAD=./mock.so ./mytest
You should see the output:
fd == 100
TEST!
Functions defined in mock.c were preloaded and used as a first match during the dynamic linking process, hence executing your code, and not the code from the system libraries.
Update:
If you want to use "original" functions, you should extract them "by hand" from the proper shared library, using dlopen, dlmap and dlclose functions. Because I don't want to clutter previous example, here's the new one, the same as previous mock.c plus dynamic symbol loading stuff:
#include <stdio.h>
#include <dlfcn.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <gnu/lib-names.h>
// this declares this function to run before main()
static void startup(void) __attribute__ ((constructor));
// this declares this function to run after main()
static void cleanup(void) __attribute__ ((destructor));
static void *sDlHandler = NULL;
ssize_t (*real_write)(int fd, const void *buf, size_t count) = NULL;
void startup(void) {
char *vError;
sDlHandler = dlopen(LIBC_SO, RTLD_LAZY);
if (sDlHandler == NULL) {
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
real_write = (ssize_t (*)(int, const void *, size_t))dlsym(sDlHandler, "write");
vError = dlerror();
if (vError != NULL) {
fprintf(stderr, "%s\n", vError);
exit(EXIT_FAILURE);
}
}
void cleanup(void) {
dlclose(sDlHandler);
}
int open(const char *pathname, int flags) {
return 100;
}
int close(int fd) {
return 0;
}
ssize_t read(int fd, void *buf, size_t count) {
strcpy(buf, "TEST!\n");
return 7;
}
ssize_t write(int fd, const void *buf, size_t count) {
if (fd == 0) {
real_write(fd, "mock: ", 6);
}
real_write(fd, buf, count);
return count;
}
Compile it with:
$ gcc -shared -fpic -o mock.so mock.c -ldl
Note the -ldl at the end of the command.
So: startup function will run before main (so you don't need to put any initialization code in your original program) and initialize real_write to be the original write function. cleanup function will run after main, so you don't need to add any "cleaning" code at the end of main function either.
All the rest works exactly the same as in the previous example, with the exception of newly implemented write function. For almost all the descriptors it will work as the original, and for file descriptor 0 it will write some extra data before the original content. In that case the output of the program will be:
$ LD_PRELOAD=./mock.so ./mytest
fd == 100
mock: TEST!

How to Override A C System Call?

So the problem is the following. The project needs to intercept all file IO
operations, like open() and close(). I am trying to add printf() before calling the corresponding open() or close(). I am not supposed to rewrite the source code by changing open() or close() to myOpen() or myClose() for example. I have been trying to use LD_PRELOAD environment variable. But the indefinite loop problem came up. My problem is like this one.
int open(char * path,int flags,int mode)
{
// print file name
printf("open :%s\n",path);
return __open(path,flags,mode);
}
Yes, you want LD_PRELOAD.
You need to create a shared library (.so) that has code for all functions that you want to intercept. And, you want to set LD_PRELOAD to use that shared library
Here is some sample code for the open function. You'll need to do something similar for each function you want to intercept:
#define _GNU_SOURCE
#include <dlfcn.h>
int
open(const char *file,int flags,int mode)
{
static int (*real_open)(const char *file,int flags,int mode) = NULL;
int fd;
if (real_open == NULL)
real_open = dlsym(RTLD_NEXT,"open");
// do whatever special stuff ...
fd = real_open(file,flags,mode);
// do whatever special stuff ...
return fd;
}
I believe RTLD_NEXT is easiest and may be sufficient. Otherwise, you could add a constructor that does dlopen once on libc
UPDATE:
I am not familiar with C and I got the following problems with gcc. "error: 'NULL' undeclared (first use in this function)",
This is defined by several #include files, so try #include <stdio.h>. You'll need that if you want to call printf.
"error: 'RTLD_NEXT' undeclared (first use in this function)",
That is defined by doing #include <dlfcn.h> [as shown in my example]
and "symbol lookup error: ./hack_stackoverflow.so: undefined symbol: dlsym".
From man dlsym, it says: Link with -ldl So, add -ldl to the line that builds your .so.
Also, you have to be careful to prevent infinite recursion if the "special stuff" does something that loops back on your intercept function.
Notably, you want to call printf. If you intercept the write syscall, bad things may happen.
So, you need to keep track of when you're already in one of your intercept functions and not do anything special if already there. See the in_self variable.
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
ssize_t
write(int fd,const void *buf,size_t len)
{
static ssize_t (*real_write)(int fd,const void *buf,size_t len) = NULL;
static int in_self = 0;
ssize_t err;
if (real_write == NULL)
real_write = dlsym(RTLD_NEXT,"write");
++in_self;
if (in_self == 1)
printf("mywrite: fd=%d buf=%p len=%ld\n",fd,buf,len);
err = real_write(fd,buf,len);
if (in_self == 1)
printf("mywrite: fd=%d buf=%p err=%ld\n",fd,buf,err);
--in_self;
return err;
}
The above works okay for single threaded programs/environments, but if you're intercepting an arbitrary one, it could be multithreaded.
So, we'd have to initialize all the real_* pointers in a constructor. This is a function with a special attribute that tells the dynamic loader to call the function ASAP automatically.
And, we have to put in_self into thread local storage. We do this by adding the __thread attribute.
You may need to link with -lpthread as well as -ldl for the multithreaded version.
Edit: We also have to preserve the correct errno value
Putting it all together:
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
#include <errno.h>
static int (*real_open)(const char *file,int flags,int mode) = NULL;
static ssize_t (*real_write)(int fd,const void *buf,size_t len) = NULL;
__attribute__((constructor))
void
my_lib_init(void)
{
real_open = dlsym(RTLD_NEXT,"open");
real_write = dlsym(RTLD_NEXT,"write");
}
int
open(const char *file,int flags,int mode)
{
int fd;
// do whatever special stuff ...
fd = real_open(file,flags,mode);
// do whatever special stuff ...
return fd;
}
ssize_t
write(int fd,const void *buf,size_t len)
{
static int __thread in_self = 0;
int sverr;
ssize_t ret;
++in_self;
if (in_self == 1)
printf("mywrite: fd=%d buf=%p len=%ld\n",fd,buf,len);
ret = real_write(fd,buf,len);
// preserve errno value for actual syscall -- otherwise, errno may
// be set by the following printf and _caller_ will get the _wrong_
// errno value
sverr = errno;
if (in_self == 1)
printf("mywrite: fd=%d buf=%p ret=%ld\n",fd,buf,ret);
--in_self;
// restore correct errno value for write syscall
errno = sverr;
return ret;
}

How to use statx syscall?

Ubuntu 18.04
I'm trying to use statx syscall introduced in the Linux Kernel 4.11. There is a manual entry:
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h> /* Definition of AT_* constants */
int statx(int dirfd, const char *pathname, int flags,
unsigned int mask, struct statx *statxbuf);
So I tried to write an example by myself:
const char *dir_path = NULL;
const char *file_path = NULL;
//read from command line arguments
int dir_fd = open(dir_path, O_DIRECTORY);
struct statx st; //<--------------------------- compile error
statx(dir_fd, file_path, 0, &statx);
But it simply does not compile. The error is the sizeof(statx) is unknown. And actually it is not defined in sys/stat.h, but in linux/stat.h which is not included by sys/stat.h. But after including linux/stat.h the problem is there is no definition for
int statx(int dirfd, const char *pathname, int flags,
unsigned int mask, struct statx *statxbuf);
I expected that since
$ uname -r
4.15.0-39-generic
and 4.15.0-39-generic newer than 4.11 I can use it.
What's wrong?
Currently as the glibc does not provide a wrapper for the statx call, you have to use your kernels definitions. So either copy the statx structure definition from your kernel or just use it from the API the linux kernel provides. The struct statx is currently defined in linux/stat.h.
linux provides a example call to statx available here.
#update library support was added in glibc 2.28

Overriding getdirentries in C

I would like to override getdirentries (and others, like lstat) libc syscalls.
I can override -for example- lstat and chmod, but I can't override getdirentries (and amongst others fstatfs).
Example code is:
#include <errno.h>
#include <dlfcn.h>
#include <stdio.h>
#include <strings.h>
#include <string.h>
#include <sys/_timespec.h>
#include <sys/stat.h>
#include <sys/mount.h>
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1l)
#endif
int (*getdirentries_orig)(int fd, char *buf, int nbytes, long *basep);
int (*lstat_orig)(const char *path, struct stat *sb);
int (*fstatfs_orig)(int fd, struct statfs *buf);
int (*chmod_orig)(const char *path, mode_t mode);
#define HOOK(func) func##_##orig = dlsym(RTLD_NEXT,#func)
int getdirentries(int fd, char *buf, int nbytes, long *basep) {
HOOK(getdirentries);
printf("getdirentries\n");
return getdirentries_orig(fd, buf, nbytes, basep);
}
int lstat(const char *path, struct stat *sb) {
HOOK(lstat);
printf("lstat\n");
return (lstat_orig(path, sb));
}
int fstatfs(int fd, struct statfs *buf) {
HOOK(fstatfs);
printf("fstatfs\n");
return fstatfs_orig(fd, buf);
}
int chmod(const char *path, mode_t mode) {
HOOK(chmod);
printf("chmod\n");
return chmod_orig(path, mode);
}
I compile this on FreeBSD with:
cc -Wall -g -O2 -fPIC -shared -o preload.so preload.c
(on Linux, adding -ldl may be needed)
and use it with LD_PRELOAD=./preload.so bash.
If I then issue an ls -l, I get "lstat" printed multiple times, that's good.
But ls calls multiple getdirentries too, according to ktrace, and its override function does not get called. fstatfs also doesn't work.
How can I override getdirentries, fstatfs and possibly other syscalls, and why they aren't working in this case?
Thanks,
As it turns out, readdir() in libc/readdir.c (readdir is what ls calls and that should call getdirentries) calls _getdirentries, not getdirentries. If I override _getdirentries, it works. The same for fstatfs, so this is why my program did not work.

Resources