Intercepting file operations on Linux - c

I'm working on a cloud platform for rendering visual effects and animation. We accept various formats of scene descriptions render them, and return the image outputs to the user. The processing backend is Linux. Sometimes we receive scene descriptions generated on Windows so we get paths that look like 'C:\path\to\file.mb'. I've written a Linux shared library to intercept various filesystem calls and alter the paths to something Linux can understand '/C/path/to/file'. I use the LD_PRELOAD mechanism to insert my functions before the "real" ones and it works great... except when it doesn't.
For example this command:
LD_PRELOAD=/home/robert/path_fix.so Render -s 1 -e 1 C:\path\to\test_scene_03_vray.ma
Will not locate test_scene_03_vray.ma. This also doesn't work:
LD_PRELOAD=/home/robert/path_fix.so echo test > C:\path\to\test.txt
I've been using ltrace to figure out which functions are called with references to path names, but these examples don't show up in my traces:
ltrace -f -C -S -o ltrace.out Render C:\path\to\test_scene_03_vray.ma
Is there a tool other than ltrace that will let me see which function calls are invoked?
Here's the list of what I already have overrides for:
fopen
freopen
opendir
open
creat
openat
stat
lstat
fstatat
__lxstat
__xstat
mkdir
mkdirat
unlink
unlinkat
access
faccessat
rename
renameat
renameat2
chmod
fchmodat
chown
lchown
fchownat
link
linkat
name_to_handle_at
readlink
readlinkat
symlink
symlinkat
rmdir
chdir
Are there more functions that I'm missing here? I tried to implement everything that takes
const char *filepath
as an argument.
Side question: This seems like it might already be a solved problem... Is there a library or other approach that converts Windows paths to unix friendly paths? Keep in mind that the rendering applications are 3rd party proprietary binaries so I can't modify them.
Thanks for any suggestions!

Yes, there are other functions, such as the 64-bit functions open64(). There are also functions such as __open().
On a 64-bit Centos 7 server, for just open I get:
nm -D /lib64/libc.so.6 | grep open
0000000000033d70 T catopen
00000000003c0b80 B _dl_open_hook
000000000006b140 T fdopen
00000000000ba4c0 W fdopendir
00000000000755d0 T fmemopen
000000000006ba00 T fopen
000000000006ba00 W fopen64
000000000006bb60 T fopencookie
0000000000073700 T freopen
0000000000074b60 T freopen64
00000000000eb7a0 T fts_open
0000000000022220 T iconv_open
000000000006b140 T _IO_fdopen
0000000000077230 T _IO_file_fopen
0000000000077180 T _IO_file_open
000000000006ba00 T _IO_fopen
000000000006d1d0 T _IO_popen
000000000006cee0 T _IO_proc_open
0000000000131580 T __libc_dlopen_mode
00000000000e82a0 W open
00000000000e82a0 W __open
00000000000ed0f0 T __open_2
00000000000e82a0 W open64
00000000000e82a0 W __open64
00000000000ed110 T __open64_2
00000000000e8330 W openat
00000000000e8410 T __openat_2
00000000000e8330 W openat64
00000000000e8410 W __openat64_2
00000000000f7860 T open_by_handle_at
00000000000340b0 T __open_catalog
00000000000b9f70 W opendir
00000000000f12b0 T openlog
0000000000073ea0 T open_memstream
00000000000731c0 T open_wmemstream
000000000006d1d0 T popen
0000000000130630 W posix_openpt
00000000000e6ec0 T posix_spawn_file_actions_addopen
For example, on Linux this C code is quite likely to work just fine:
int fd = __open( path, O_RDONLY );
You're also not going to catch statically-linked processes, or those that issue a system call such as open directly.

You can technically use kprobes to intercept the calls but if thr numbrr of files arent too many or dynamicaly generated, i suggest creating symlinks. It is much safer and simpler.

Related

Catching Mach system calls using dtruss

I ran dtruss on vmmap that is a process that read the virtual memory of another remote process.
I would expect that some of mach_port system calls would appear in the output of my command, but couldn't trace any (i.e. mach_vm_read, task_for_pid, etc ..)
The exact command i ran (notice that dtruss is a wrapper script of dtrace in OS-X) :
sudo dtruss vmmap <pid_of_sample_process>
The input argument for vmmap is just a pid of any running process, and the OS version i use is 10.10 (in 10.11 there's entitlement issue when running dtruss on apple products such as vmmap).
Perhaps someone can tell me how to identify the system call i'm looking for... Should I look for the explicit name in dtruss output, or just a general call number of my desired syscall (sadly, i haven't found any of them) :
./bsd/kern/trace.codes:0xff004b10 MSG_mach_vm_read
It looks to me like it's not using Mach APIs. It's using the libproc interface. I'm seeing many proc_info() syscalls, which is what's behind library calls like proc_pidinfo().
I used:
sudo dtrace -n 'pid$target::proc_*:entry {}' -c 'vmmap <some PID>'
to trace the various libproc functions being called. I see calls to proc_name(), proc_pidpath(), and proc_pidinfo() to get information about the target process and then calls to proc_regionfilename() to get information about the VM regions.
By the way, vmmap doesn't read the memory of the other process, it just reports information about the VM regions, not their contents. So, I wouldn't expect to see mach_vm_read() or the like.

Performing the equivalent of chattr +i filename.txt from Linux application

Is there any interface in the Linux userspace API that will allow me to perform the action equivalent to
chattr +i myfile
chattr -i myfile
If possible I need to do this from within my application but I cannot find anything online that suggests how one would go about doing this from the Linux API. I would have thought there would be some kind of ioctl call available to do this but I simply cannot find any details about it.
have a look at:
http://www.danlj.org/lad/src/setflags.c.html
and if you do some strace on chattr, you could have found out that it calls stuff that looks like:
ioctl(fd, EXT2_IOC_SETFLAGS, flags)
(have a look at this thread)

Override libc functions called from another libc function with LD_PRELOAD

I've a project aiming to run php-cgi chrooted for mass virtual hosting (more than 10k virtual host), with each virtual host having their own chroot, under Ubuntu Lucid x86_64.
I would like to avoid creating the necessary environment inside each chroot for things like /dev/null, /dev/zero, locales, icons... and whatever which could be needed by php modules thinking that they run outside chroot.
The goal is to make php-cgi run inside a chroot, but allowing him access to files outside the chroot as long as those files are (for most of them) opened in read-only mode, and on an allowed list (/dev/log, /dev/zero, /dev/null, path to the locales...)
The obvious way seems to create (or use if it exists) a kernel module, which could hook and redirect trusted open() paths, outside of the chroot.
But I don't think it's the easiest way:
I've never done a kernel module, so I do not correctly estimate the difficulty.
There seems to be multiple syscall to hook file "open" (open, connect, mmap...), but I guess there is a common kernel function for everything related to file opening.
I do want to minimize the number of patchs to php or it's module, to minimize the amount of work needed each time I will update our platform to the latest stable PHP release (and so update from upstream PHP releases more often and quickly), so I find better to patch the behavior of PHP from the outside (because we have a particular setup, so patching PHP and propose patch to upstream is not relevant).
Instead, I'm currently trying an userland solution : hook libc functions with LD_PRELOAD, which works well in most cases and is really quick to implement, but I've encountered a problem which I'm unable to resolve alone.
(The idea is to talk to a daemon running outside the chroot, and get file descriptor from it using ioctl SENDFD and RECVFD).
When I call syslog() (without openlog() first), syslog() calls connect() to open a file.
Example:
folays#phenix:~/ldpreload$ strace logger test 2>&1 | grep connect
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(1, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
So far so good, I've tried to hook the connect() function of libc, without success.
I've also tried to put some flags to dlopen() inside the _init() function of my preload library to test if some of them could make this work, without success
Here is the relevant code of my preload library:
void __attribute__((constructor)) my_init(void)
{
printf("INIT preloadz %s\n", __progname);
dlopen(getenv("LD_PRELOAD"), RTLD_NOLOAD | RTLD_DEEPBIND | RTLD_GLOBAL |
RTLD_NOW);
}
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
{
printf("HOOKED connect\n");
int (*f)() = dlsym(RTLD_NEXT, "connect");
int ret = f(sockfd, addr, addrlen);
return ret;
}
int __connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
{
printf("HOOKED __connect\n");
int (*f)() = dlsym(RTLD_NEXT, "connect");
int ret = f(sockfd, addr, addrlen);
return ret;
}
But the connect() function of the libc still takes precedence over mine:
folays#phenix:~/ldpreload$ LD_PRELOAD=./lib-preload.so logger test
INIT preloadz logger
[...] no lines with "HOOKED connect..." [...]
folays#phenix:~/ldpreload$
Looking at the code of syslog() (apt-get source libc6 , glibc-2.13/misc/syslog.c), it seems to call openlog_internal, which in turn call __connect(), at misc/syslog.c line 386:
if (LogFile != -1 && !connected)
{
int old_errno = errno;
if (__connect(LogFile, &SyslogAddr, sizeof(SyslogAddr))
== -1)
{
Well, objdump shows me connect and __connect in the dynamic symbol table of libc:
folays#phenix:~/ldpreload$ objdump -T /lib/x86_64-linux-gnu/libc.so.6 |grep -i connec
00000000000e6d00 w DF .text 000000000000005e GLIBC_2.2.5 connect
00000000000e6d00 w DF .text 000000000000005e GLIBC_2.2.5 __connect
But no connect symbol in the dynamic relocation entries, so I guess that it explains why I cannot successfully override the connect() used by openlog_internal(), it probably does not use dynamic symbol relocation, and probably has the address of the __connect() function in hard (a relative -fPIC offset?).
folays#phenix:~/ldpreload$ objdump -R /lib/x86_64-linux-gnu/libc.so.6 |grep -i connec
folays#phenix:~/ldpreload$
connect is a weak alias to __connect:
eglibc-2.13/socket/connect.c:weak_alias (__connect, connect)
gdb is still able to breakpoint on the libc connect symbol of the libc:
folays#phenix:~/ldpreload$ gdb logger
(gdb) b connect
Breakpoint 1 at 0x400dc8
(gdb) r test
Starting program: /usr/bin/logger
Breakpoint 1, connect () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
(gdb) c 2
Will ignore next crossing of breakpoint 1. Continuing.
Breakpoint 1, connect () at ../sysdeps/unix/syscall-template.S:82
82 in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0 connect () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007ffff7b28974 in openlog_internal (ident=<value optimized out>, logstat=<value optimized out>, logfac=<value optimized out>) at ../misc/syslog.c:386
#2 0x00007ffff7b29187 in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x40198e "%s", ap=0x7fffffffdd40) at ../misc/syslog.c:274
#3 0x00007ffff7b293af in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131
Of course, I could completely skip this particular problem by doing an openlog() myself, but I guess that I will encounter the same type of problem with some others functions.
I don't really understand why openlog_internal does not use dynamic symbol relocation to call __connect(), and if it's even possible to hook this __connect() call by using simple LD_PRELOAD mechanism.
The others way I see how it could be done:
Load libc.so from an LD_PRELOAD with dlopen, get the address of the libc's __connect with dlsym() and then patch the function (ASM wise) to get the hook working. It seems really overkill and error prone.
Use a modified custom libc for PHP to fix those problems directly at the source (open / connect / mmap functions...)
Code a LKM, to redirect file access where I want. Pros : no need of ioctl(SENDFD) and no daemon outside the chroot.
I would really appreciate to learn, if it is ever possible, how I could still hook the call to __connect() issued by openlog_internal, suggestions, or links to kernel documentation related to syscall hooking and redirection.
My google searches related to "hook syscalls" found lot of references to LSM, but it seems to only allow ACLs answering "yes" or "no", but no redirection of open() paths.
Thanks for reading.
It's definitely not possible with LD_PRELOAD without building your own heavily-modified libc, in which case you might as well just put the redirection hacks directly inside. There are not necessarily calls to open, connect, etc. whatsoever. Instead there may be calls to a similar hidden function bound at library-creation time (not dynamically rebindable) or even inline syscalls, and this can of course change unpredictably with the version.
Your options are either a kernel module, or perhaps using ptrace on everything inside the "chroot" and modifying the arguments to syscalls whenever the tracing process encounters one that needs patching up. Neither sounds easy...
Or you could just accept that you need a minimal set of critical device nodes and files to exist inside a chroot for it to work. Using a different libc in place of glibc, if possible, would help you minimize the number of additional files needed.

Line Number Info in ltrace and strace tools

Is it possible that I can view the line number and file name (for my program running with ltrace/strace) along with the library call/system call information.
Eg:
code section :: ptr = malloc(sizeof(int)*5); (file:code.c, line:21)
ltrace or any other tool: malloc(20) :: code.c::21
I have tried all the options of ltrace/strace but cannot figure out a way to get this info.
If not possible through ltrace/strace, do we have any parallel tool option for GNU/Linux?
You may be able to use the -i option (to output the instruction pointer at the time of the call) in strace and ltrace, combined with addr2line to resolve the calls to lines of code.
No It's not possible. Why don't you use gdb for this purpose?
When you are compiling application with gcc use -ggdb flags to get debugger info into your program and then run your program with gdb or equivalent frontend (ddd or similar)
Here is quick gdb manual to help you out a bit.
http://www.cs.cmu.edu/~gilpin/tutorial/
You can use strace-plus that can collects stack traces associated with each system call.
http://code.google.com/p/strace-plus/
Pretty old question, but I found a way to accomplish what OP wanted:
First use strace with -k option, which will generate a stack trace like this:
openat(AT_FDCWD, NULL, O_RDONLY) = -1 EFAULT (Bad address)
> /usr/lib/libc-2.33.so(__open64+0x5b) [0xefeab]
> /usr/lib/libc-2.33.so(_IO_file_open+0x26) [0x816f6]
> /usr/lib/libc-2.33.so(_IO_file_fopen+0x10a) [0x818ca]
> /usr/lib/libc-2.33.so(__fopen_internal+0x7d) [0x7527d]
> /mnt/r/build/tests/main(main+0x90) [0x1330]
> /usr/lib/libc-2.33.so(__libc_start_main+0xd5) [0x27b25]
> /mnt/r/build/tests/main(_start+0x2e) [0x114e]
The address of each function call are displayed at the end of each line, and you can paste it to addr2line to retrieve the file and line. For example, we want to locate the call in main() (fifth line of the stack trace).
addr2line -e tests/main 0x1330
It will show something like this:
/mnt/r/main.c:55

Linux programming: which device a file is in

I would like to know which entry under /dev a file is in. For example, if /dev/sdc1 is mounted under /media/disk, and I ask for /media/disk/foo.txt, I would like to get /dev/sdc as response.
Using stat system call on that file I will get its partition major and minor numbers (8 and 33, for sdc1). Now I need to get the "root" device (sdc) or its major/minor from that. Is there any syscall or library function I could use to link a partition to its main device? Or even better, to get that device directly from the file?
brw-rw---- 1 root floppy 8, 32 2011-04-01 20:00 /dev/sdc
brw-rw---- 1 root floppy 8, 33 2011-04-01 20:00 /dev/sdc1
Thanks in advance!
The quick and dirty version: df $file | awk 'NR == 2 {print $1}'.
Programmatically... well, there's a reason I started with the quick and dirty version. There's no portable way to programmatically get the list of mounted filesystems. (getmntent() gets fstab entries, which is not the same thing.) Moreover, you can't even parse the output of mount(8) reliably; on different Unixes, the mountpoint may be the first or the last item. The most portable way to do this ends up being... parsing df output (And even that is iffy, as you noticed with the partition number.). So you're right back to the quick and dirty shell solution anyway, unless you want to traverse /dev and look for block devices with matching major(st_rdev) (major() being from sys/types.h).
If you restrict this to Linux, you can use /proc/mounts to get the list of mounted filesystems. Other specific Unixes can similarly be optimized: for example, on OS X and I think FreeBSD, you can use sysctl() on the vfs tree to get mountpoints. At worst you can find and use the appropriate header file to decipher whatever the mount table file is (and yes, even that varies: on Solaris it's /etc/mnttab, on many other systems it's /etc/mtab, some systems put it in /var/run instead of /etc, and on many Linuxes it's either nonexistent or a symlink to /proc/mounts). And its format is different on pretty much every Unix-like OS.
The information you want exists in sysfs which exposes the linux device tree. This models the relationships between the devices on the system and since you are trying to determine a parent disk device from a partition, this is the place to look. I don't know if there are any hard and fast rules you can rely on to stop your code breaking with future versions of the kernel, but the kernel developers do try to maintain sysfs as a stable interface.
If you look at /sys/dev/block/<major>:<minor>, you'll see it is a symlink with the tail components being block/<disk-device-name>/<partition-device-name>. If you were to perform a readlink(2) system call on that, you could parse the link destination to get the disk device name. In shell (since it's easier to express this way, but doing it in C will be pretty easy):
$ echo $(basename $(dirname $(readlink /sys/dev/block/8:33)))
sdc
Alternatively, you could take advantage of the nesting of partition directories in the disk directories (again in shell, but from C, its an open(2), read(2), and close(2)):
$ cat /sys/dev/block/8:33/../dev
8:32
That assumes your starting major:minor is actually for a partition, not some other sort of non-nested device.
What you looking for is impossible - there is no 1:1 connection between a block device file and the partition it is describing.
Consider:
You can create multiple block device files with different names (but the same major and minor numbers) and they are indistinguishable (N:1)
You can use a block device file as an argument to mount to mount a partition and then delete the block device file leaving the partition mounted. (0:1)
So there is no way to do what you want except in a few specific and narrow cases.
Major number will tell you which device it is: 3 - IDE on 1st controller, 22 - IDE on 2nd controller and 8 for SCSI.
Minor number will tell you partition number and - for IDE devices - if it's primary or secondary drive. This calculation is different for IDE and SCSI.
For IDE it is: x*64 + p, x is drive number on the controller (0 or 1) and p is partition
For SCSI it is: y*16 + p, where y is drive number and p is partition
Not a syscall, but:
df -h /path/to/my/file
From https://unix.stackexchange.com/questions/128471/determine-what-device-a-directory-is-located-on
So you could look at df's source code and see what it does.
I realize this post is old, but this question was the 2nd result in my search and no one has mentioned df -h

Resources