I need to load a C function pointer from a dynamic library for use in my Swift app.
The loading code doesn't have to be Swift, I just need a usable function pointer.
I can do this via dlsym:
let handle = dlopen(theLibrary, RTLD_NOW)
let fun = dlsym(handle, functionName)!
let casted = unsafeBitCast(fun, to: (@convention(c) () -> Void).self)
casted() // call the loaded function
dlclose(handle) // close only after the function pointer is no longer needed
However, I'm concerned about the security of doing this. If the loaded code is pulled from the dynamic library into my app's process, won't it run with the same permissions as my app? So, if I disabled App Sandbox, couldn't the loaded code modify the user's files, make network requests, or do worse, just as my app could?
I'm looking for plugin functionality here, so I may not get to see the loaded dylib's actual source code. Ideally I need a way to restrict the dylib's permissions to prevent possible malware from running with my app's permissions.
How can I either enforce security restrictions on a dylib prior to loading it, or lock-down loaded function pointer code?
Edit: Craig Estey makes a very good point that even dlopen can be dangerous.
This is a good question, and a very difficult one to fully address. Most of my answer consists of first thoughts on the topic, but hopefully it will give you ideas on how to proceed.
With Vanilla Kernels
Generally speaking, once the code is linked into your binary (e.g. through dynamic loading), it will run with the permissions of your app. You would need the operating system kernel to provide some facility to mark that code as "special" in order to restrict its capabilities while you're in the context of the plugin.
Darwin
I am not sure whether the Darwin kernel has the granularity of namespaces that we find under Linux (which may or may not be enough anyway). However, you could use the Endpoint Security API, available on macOS 10.15 or later, to monitor your own process. The obvious attack on that is that the plugin would simply unhook you.
Linux
You would set up a new namespace, say using setns, spawn a thread (or, better, a separate process) in it after limiting its access and downgrading the context, and require the plugin to communicate via an IPC channel rather than through direct function calls.
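A minimal sketch of that shape (assuming Linux with unprivileged user namespaces enabled; it uses unshare() in a forked child rather than setns(), and "plugin.so"/"plugin_entry" are made-up names):

/* Run the plugin in a child with its own namespaces and talk to it over a
   socketpair instead of calling into it directly. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) { perror("socketpair"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                              /* child: the plugin host */
        close(sv[0]);
        /* new user + mount + network namespaces: no network, private mounts */
        if (unshare(CLONE_NEWUSER | CLONE_NEWNS | CLONE_NEWNET) < 0) {
            perror("unshare");
            _exit(1);
        }
        void *h = dlopen("./plugin.so", RTLD_NOW);                  /* hypothetical */
        void (*entry)(int) = h ? (void (*)(int))dlsym(h, "plugin_entry") : NULL;
        if (entry)
            entry(sv[1]);                        /* plugin answers over the socket */
        _exit(0);
    }

    close(sv[1]);
    char buf[128];
    ssize_t n = read(sv[0], buf, sizeof(buf) - 1);   /* IPC instead of a direct call */
    if (n > 0) { buf[n] = '\0'; printf("plugin said: %s\n", buf); }
    waitpid(pid, NULL, 0);
    return 0;
}

The plugin only ever sees the restricted child and one socket, never the host app's full address space or privileges.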
Write a kext
This is a bad and rather impractical idea, but you could write a kext that tracks the process, drops it into a sandbox, and then lifts it back out. The only way that would work would be to have an ioctl on a special nonce (e.g. a file descriptor) passed from the kernel to the privileged code. The code would drop down into a sandbox, do its job, then release the nonce (e.g. close the file descriptor), which would lift it back up. While sandboxed, the code must not be allowed to perform any kind of code injection, loading of other dylibs, spawning of threads, etc. (you would have to do a deep dive on possible attack vectors and close them all), since any of those would let the plugin insert trampoline code over the return address on the stack. Here too I would simply say: spawn a sub-thread or sub-process that uses IPC rather than communicating through a function-call interface.
Once the nonce is released, the app is re-elevated to privileged mode.
Related
The problem is to prohibit access to some files (files from my "blacklist"). This means that nobody besides me (my own kernel module) can read or modify these files.
I've already asked this question here, on StackOverflow, but I haven't gotten an answer. There was only one solution offered: change the file's permissions and owner. However, that isn't enough for my goals, since a file's permissions and owner can easily be modified by someone else.
But I haven't given up, I've carried on studying this problem.
I replaced some entries of the system call table with pointers to my own functions. Thus I managed to prohibit any user-space program from getting access to the files from my blacklist; in addition, this approach doesn't depend on the file's permissions or owner. However, the key phrase is "user space". Any kernel module can still easily get access to the files from my blacklist by calling, for instance, the filp_open() function. I looked through the Linux kernel sources and it turned out that all the system calls I hooked (open, openat, ...) are simple wrappers and no more.
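In rough outline, that kind of hook looks like this heavily simplified sketch (x86-64, an older kernel where kallsyms_lookup_name() is still exported and CR0.WP is not pinned; the blacklisted path is just an example):

/* Heavily simplified sketch of a syscall-table hook as described above.
   Assumes x86-64 and an older kernel; the blacklisted path is made up. */
#include <linux/module.h>
#include <linux/kallsyms.h>
#include <linux/uaccess.h>
#include <linux/string.h>
#include <linux/unistd.h>
#include <linux/ptrace.h>
#include <asm/special_insns.h>
#include <asm/processor-flags.h>

static unsigned long *syscall_table;
static asmlinkage long (*real_openat)(const struct pt_regs *);

static asmlinkage long hooked_openat(const struct pt_regs *regs)
{
    char name[256];
    /* the second openat() argument (the path) arrives in %rsi on x86-64 */
    long n = strncpy_from_user(name, (const char __user *)regs->si, sizeof(name));

    if (n > 0 && strcmp(name, "/etc/blacklisted.conf") == 0)
        return -EACCES;
    return real_openat(regs);
}

static void set_wp(int on)
{
    unsigned long cr0 = read_cr0();
    write_cr0(on ? (cr0 | X86_CR0_WP) : (cr0 & ~X86_CR0_WP));
}

static int __init hook_init(void)
{
    syscall_table = (unsigned long *)kallsyms_lookup_name("sys_call_table");
    if (!syscall_table)
        return -ENOENT;
    real_openat = (void *)syscall_table[__NR_openat];
    set_wp(0);
    syscall_table[__NR_openat] = (unsigned long)hooked_openat;
    set_wp(1);
    return 0;
}

static void __exit hook_exit(void)
{
    set_wp(0);
    syscall_table[__NR_openat] = (unsigned long)real_openat;
    set_wp(1);
}

module_init(hook_init);
module_exit(hook_exit);
MODULE_LICENSE("GPL");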
Could you help me? Is there a way to do something with filp_open that is similar to what I've done with system calls? Any other solutions (without hooking) are welcome.
What you are asking for is impossible. Theoretically, this could be achieved by running the kernel under a custom-made hypervisor or running on custom-made hardware, but it would be extremely complicated (if not impossible) to achieve in reality.
You cannot protect the kernel from itself. In any normal scenario (i.e. no dedicated hardware or hypervisor), the Linux kernel runs at the highest privilege level on the machine, and can therefore revert any changes you make if it wants. If your module needs to deny access to some file to the whole kernel, then there's really something conceptually wrong about what you are doing. Moreover, you seem to be assuming that other kernel modules would be somehow "interested" in messing with your module: why would that be the case?
Furthermore, even changing permissions or overriding syscalls does not solve any problem: unless you correctly configure kernel lockdown (kernel >= v5.4) and/or some other security measure like module signing (and ideally also secure boot), the root user is always able to insert modules and subvert your "security" measures.
If you need to deprive root of access to these files, then as I said there's really something logically wrong. The root user can already do whatever it wants with whichever configuration file it wants; of course, destroying important configuration files is going to break the system, but that's not really something you can avoid. Assuming that root is evil doesn't make much sense as a threat model in any normal scenario.
I understand that the dlopen/dlclose/dlsym/dlerror APIs are used to interface with a dynamically loaded library, but how does this work for a binary already compiled with the -L flag referring to the .so library (a shared library linked at load time)? Edit: I am referring to live updating of a shared library without restarting the app.
Cong Ma provided some great pointers. Here is what I could conclude after some more research:
If an app is linked against an .so and the shared library is replaced with a new copy without stopping the app, the outcome is 'non-deterministic' or 'unpredictable'.
During the 'linking/loading' phase of the app, the library gets mapped to the address space of the app/process.
Unix-like OSes use copy-on-write (COW), so for a given mapping, if a private copy has been made on write, the app would see no impact, though the objective of actually running the new .so code would not be achieved either.
It is all about the memory region the app is accessing: if the accessed region has not changed address, the app won't experience any impact.
The new version of the lib may be an incremental change that does not affect the app; there can be compiler/linker magic around relocatable address generation too.
You may sometimes be lucky enough to fall into the above categories and have no issues while replacing the .so under a running app that references it, but in practice you will encounter SIGSEGV/SIGBUS etc. most of the time, as addresses and cross-references get jumbled by any considerable change in the library.
dlopen() helps because it narrows the window during which the library is in use. If you dlopen() and keep the handle open long enough, you are exposing your app to the same problem. So the idea is to keep that window of access small to make the update scenario safer.
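As an illustration of that narrow-window pattern (the library name "libplugin.so" and the symbol "do_work" are hypothetical):

#include <dlfcn.h>
#include <stdio.h>

/* Resolve, call, and release in one short window, so the on-disk library can
   be swapped between calls with less risk. */
static int call_do_work(int arg)
{
    void *h = dlopen("libplugin.so", RTLD_NOW);
    if (!h) { fprintf(stderr, "%s\n", dlerror()); return -1; }

    int (*do_work)(int) = (int (*)(int))dlsym(h, "do_work");
    int rc = do_work ? do_work(arg) : -1;

    dlclose(h);    /* handle released: the next call may map a brand-new copy */
    return rc;
}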
I'd say it's possible to reload a dynamic library while an application is running. But to do this you'd have to design such "live reloads" into the library from the start.
Conceptually it's not that difficult.
First, you have to provide a mechanism to signal the application to reload the library. This is actually the easy part, at least in the Unix world: rename or unlink the original library and put the new one in its place, then have the application call dlclose() on the old library and dlopen() on the new one. Windows is probably more difficult. On Unix-type systems you can even do the unlink/rename while the application is still running. In fact, on Unix-type systems you do not want to overwrite the existing library at all, as that will cause you serious problems: the existing on-disk library is the backing store for the pages mapped from that library into the process's address space, so physically overwriting the shared library's file contents will again result in undefined behavior - the OS pages in what it expects to be the entry point of foo(), but that address is now halfway through bar(). The unlink or rename is safe because the file contents won't be deleted from disk while they're still mapped by the process.
Second, a library can't be unloaded while it's in use (it's probably possible to unload a library while it's being used on some OSes, but I can't see how that could result in anything other than undefined behavior, so "can't" is de facto correct if not de jure...).
So you have to control access to the library when it's being reloaded. This isn't really all that difficult - have a shim library that presents your API to any application, and use a read/write lock in that library. Each call into the library has to be done with a read lock held on the read/write lock, and the library can only be reloaded when a write lock is held on the read/write lock.
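A sketch of such a shim, assuming POSIX threads; the wrapped function name "plugin_tick" and the reload entry point are made up:

#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

/* Shim that owns the handle. Callers take the read lock; the reloader takes
   the write lock, so no call can be in flight while the handle is swapped. */
static pthread_rwlock_t g_lock = PTHREAD_RWLOCK_INITIALIZER;
static void *g_handle;
static void (*g_tick)(void);

int shim_reload(const char *path)
{
    pthread_rwlock_wrlock(&g_lock);           /* waits until no call is in flight */
    if (g_handle)
        dlclose(g_handle);
    g_handle = dlopen(path, RTLD_NOW);        /* path points at the new copy */
    g_tick = g_handle ? (void (*)(void))dlsym(g_handle, "plugin_tick") : NULL;
    pthread_rwlock_unlock(&g_lock);
    return g_handle ? 0 : -1;
}

void shim_tick(void)
{
    pthread_rwlock_rdlock(&g_lock);           /* many callers may hold this at once */
    if (g_tick)
        g_tick();
    pthread_rwlock_unlock(&g_lock);
}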
Those are the easy parts.
The hard part is designing and coding your library so that nothing is lost or corrupted whenever it's reloaded. How you do that depends on your code. But if you make any mistake, more undefined behavior awaits.
Since you have to halt access to the library to reload it anyway, and you can't carry state over through the reload process, it's going to be a LOT easier to just bounce the application.
Don't assume. Read about it... "Inside stories on shared libraries and dynamic loading". tl;dr: a shared library is loaded (mostly) before the app is executed, rather than "in the background".
Do you mean "live updating of shared library without restarting the app"? I don't think this is possible without hacking your own loader. See answers to this question.
So here is my problem. I have a Linux program running on a VM which uses OpenCL via dlopen to execute some commands. About halfway through the program's execution it will sleep, and upon resume it can make no assumptions about any state on the GPU (in fact, the drivers may have been reloaded and the physical GPU may have changed). Before sleeping, dlclose is called and this does unload the OpenCL memory regions (which is a good thing), but the libraries OpenCL uses (the CUDA and NVIDIA libraries in this case) are NOT unloaded. Thus, when the program resumes and tries to reinitialize everything, things fail.
So what I'm looking for is a method to effectively unlink/unload the shared libraries that OpenCL was using so it can properly "restart" itself. As a note, the process itself can be paused (stopped) during this transition for as long as needed, but it may not be killed.
Also, assume I'm working with root access and have more or less no restrictions on what files I can modify/touch.
If you want something beyond the short-term practical solution, you could probably try forcing a re-initialization sequence on the loaded library, and/or forcing an unload.
libdl is just an ordinary library you can copy, modify, and link against instead of using the system one. One option is disabling RTLD_NODELETE handling. Another is running the DT_FINI/DT_INIT sequence for a given library.
dl_iterate_phdr, if available, can probably be used to do the same without the need to modify libdl.
Poking around in libdl and/or using LD_DEBUG (if supported) may shed some light on why it is not being unloaded in the first place.
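As a starting point for that investigation, dl_iterate_phdr (mentioned above) can at least show whether the libraries are in fact still mapped after dlclose; the "nvidia" substring below is just an example:

#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>
#include <string.h>

/* Print every loaded object whose name contains a given substring, to check
   whether dlclose actually unmapped it. */
static int show_if_match(struct dl_phdr_info *info, size_t size, void *data)
{
    const char *needle = data;
    if (info->dlpi_name && strstr(info->dlpi_name, needle))
        printf("still mapped: %s at %#lx\n",
               info->dlpi_name, (unsigned long)info->dlpi_addr);
    return 0;                      /* keep iterating */
}

void dump_matching_libs(const char *needle)   /* e.g. dump_matching_libs("nvidia") */
{
    dl_iterate_phdr(show_if_match, (void *)needle);
}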
If they don't unload, this is either because something is keeping a reference to them, or they have the nodelete flag set, or because the implementation of dlclose does not support unloading them for some other reason. Note that there is no requirement that dlclose ever unload anything:
An application writer may use dlclose() to make a statement of intent on the part of the process, but this statement does not create any requirement upon the implementation. When the symbol table handle is closed, the implementation may unload the executable object files that were loaded by dlopen() when the symbol table handle was opened and those that were loaded by dlsym() when using the symbol table handle identified by handle.
Source: http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlclose.html
The problem you're experiencing is a symptom of why it's an utterly bad idea to have graphics drivers loaded in the userspace process, but fixing this is both a technical and political challenge (lots of people are opposed to fixing it).
The only short-term practical solution is probably either factoring out the code that uses these libraries into a separate process that you can kill and restart, talking to it via some IPC mechanism, or simply having your whole program serialize its state and re-exec itself on resume.
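A sketch of the re-exec variant, assuming Linux (so /proc/self/exe is available); the state-file path and the --resume-from flag are made up:

#include <stdio.h>
#include <unistd.h>

/* On resume: persist whatever must survive, then replace the process image.
   exec gives the process a fresh address space, so the stuck CUDA/NVIDIA
   mappings are discarded along with everything else. */
static void restart_self(const char *state_path, char *argv0)
{
    /* ... serialize application state to state_path here ... */

    char *const argv[] = { argv0, "--resume-from", (char *)state_path, NULL };
    execv("/proc/self/exe", argv);

    perror("execv");               /* only reached if the re-exec failed */
}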
I am developing a photo booth application that uses 3 modules to provide printing, capturing, and triggering functionality. The idea is that people can develop modules for it that extend this functionality. These modules are implemented as shared libraries that are loaded at runtime when the user clicks "start".
I am trying to implement a printer module that "prints" to a Facebook image gallery. I want to use libcurl for this. My problem is with the initialization function, curl_global_init(). The libcurl API documentation states that this function is absolutely not thread safe. From the docs:
This function is not thread safe. You must not call it when any other thread in the program (i.e. a thread sharing the same memory) is running. This doesn't just mean no other thread that is using libcurl. Because curl_global_init() calls functions of other libraries that are similarly thread unsafe, it could conflict with any other thread that uses these other libraries.
Elsewhere in the documentation it says:
The global constant situation merits special consideration when the code you are writing to use libcurl is not the main program, but rather a modular piece of a program, e.g. another library. As a module, your code doesn't know about other parts of the program -- it doesn't know whether they use libcurl or not. And its code doesn't necessarily run at the start and end of the whole program.
A module like this must have global constant functions of its own, just like curl_global_init() and curl_global_cleanup(). The module thus has control at the beginning and end of the program and has a place to call the libcurl functions.
...which seems to address the issue. However, this seems to imply that my module's init() and finalize() functions would be called at the program's beginning and end. Since the modules are designed to be swappable at runtime, there is no way I can do this. Even if I could, my application uses GLib, and per its documentation it is never safe to assume there are no threads running:
...Since version 2.32, the GLib threading system is automatically initialized at the start of your program, and all thread-creation functions and synchronization primitives are available right away.
Note that it is not safe to assume that your program has no threads even if you don't call g_thread_new() yourself. GLib and GIO can and will create threads for their own purposes...
My question is: is there any way to safely call curl_global_init() in my application? Can I put the calls to curl_global_init() and curl_global_cleanup() in my module's init() and finalize() functions? Do I need to find another HTTP library?
First, you won't really find any other library without these restrictions, since libcurl inherits them from third-party libraries (mostly SSL libraries) that have those restrictions. For example, OpenSSL.
This said, the thread safe situation for global_init is very unfortunate and something we (in the curl project) really strongly dislike but cannot do much about as long as we use those other libraries. This also means that the exact situation for you depends on exactly which dependency libraries your libcurl is built to use.
You will in most situations be perfectly fine with calling curl_global_init() from your module's init() function the way you suggest. I can't guarantee this to be safe with 100% certainty of course, since there are a few unknowns here that I cannot speak to.
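A minimal sketch of what that module boundary could look like (the module_init()/module_finalize() names stand in for your init()/finalize(); the lock only guards against this module being initialized twice, not against unrelated threads that are already running):

#include <curl/curl.h>
#include <pthread.h>

/* Module-level init/cleanup as suggested above. The mutex and refcount only
   protect this module's own bookkeeping; they cannot make curl_global_init()
   safe against other threads in the host that are already running. */
static pthread_mutex_t g_curl_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_curl_refs;

int module_init(void)                 /* called by the host when loading */
{
    int rc = 0;
    pthread_mutex_lock(&g_curl_lock);
    if (g_curl_refs++ == 0)
        rc = (curl_global_init(CURL_GLOBAL_DEFAULT) == CURLE_OK) ? 0 : -1;
    pthread_mutex_unlock(&g_curl_lock);
    return rc;
}

void module_finalize(void)            /* called by the host when unloading */
{
    pthread_mutex_lock(&g_curl_lock);
    if (--g_curl_refs == 0)
        curl_global_cleanup();
    pthread_mutex_unlock(&g_curl_lock);
}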
I'm looking to make a custom filesystem for a project I'm working on. Currently I am looking at writing it in Python combined with fusepy, but it got me wondering how a compiled, non-userspace filesystem is made in Linux. Are there specific libraries that you need to work with or functions you need to implement for the mount command to work properly? Overall, I'm not sure how the entire process works.
Yup, you'd be programming to the kernel interfaces, specifically the VFS layer at a minimum.
'Full' documentation is in the kernel tree: http://www.mjmwired.net/kernel/Documentation/filesystems/vfs.txt. Of course, the FUSE kernel module is programmed to exactly the same interface.
This, however, is not what you'd call a library. It is a kernel component and intrinsically there, so the kernel doesn't have to know how a filesystem is implemented to work with one.
If you'd like to write it in Python, fuse is a good option. There are lots of tutorials for this, such as the one here: http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FUSE_Python_tutorial
In short: Linux is a monolithic kernel with some module-loading capabilities. That means that every kernel feature (filesystems, scheduler, drivers, memory management, etc.) is part of the one big program called Linux. Loadable modules are just a specialized way of run-time linking, which allows the user to pick those features as needed, but they're all still developed mostly as a single program.
So, to create a new filesystem, you just add new C source files to the kernel code, defining the operations your filesystem has to perform. Then you create an initialization function that allocates a new instance of the filesystem-type structure, fills it with the appropriate function pointers, and registers it with the VFS.
Note that FUSE is nothing more than a user-level-accessible API for doing the same, so the FUSE hooks correspond (roughly) to the VFS operations.
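A bare-bones sketch of that registration step; the filesystem name "myfs" is made up, superblock setup is reduced to a placeholder, and it uses the older mount-style API rather than the newer fs_context one:

#include <linux/module.h>
#include <linux/fs.h>

/* Skeleton of the registration described above. */
static int myfs_fill_super(struct super_block *sb, void *data, int silent)
{
    /* a real filesystem sets sb->s_op and builds its root inode/dentry here */
    return -ENOSYS;   /* placeholder so the skeleton compiles */
}

static struct dentry *myfs_mount(struct file_system_type *fs_type, int flags,
                                 const char *dev_name, void *data)
{
    return mount_nodev(fs_type, flags, data, myfs_fill_super);
}

static struct file_system_type myfs_type = {
    .owner   = THIS_MODULE,
    .name    = "myfs",
    .mount   = myfs_mount,
    .kill_sb = kill_anon_super,
};

static int __init myfs_init(void)
{
    return register_filesystem(&myfs_type);   /* after this, `mount -t myfs` reaches us */
}

static void __exit myfs_exit(void)
{
    unregister_filesystem(&myfs_type);
}

module_init(myfs_init);
module_exit(myfs_exit);
MODULE_LICENSE("GPL");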