Execute or skip ld.so.preload shared library code based upon conditions - c

TL;DR: Can I skip loading a shared library listed in ld.so.preload, or skip specific code in it, depending on whether RUID != EUID for the binary being executed?
Hi.
I'm writing a shared library that I load via ld.so.preload in order to hook some functions. I want to know whether I can, based on some condition, skip loading the library or skip parts of it.
Context: I'm working on TOCTTOU (Time Of Check To Time Of Use) vulnerabilities and I'm writing a userland library that will be loaded via the Linux loader's ld.so.preload feature. The main idea is to hook all functions that operate on files, do some checks, and then call the original function, so that it's transparent to users and other programs.
Now, the thing is: since I don't want to overload the system and I want my library to have as little overhead as possible, I'd like to run the hooking code only when RUID != EUID. That's one of the premises of file-based TOCTTOU: it happens when the attacker has fewer privileges than the vulnerable application.
The only way that comes to my mind right now is to wrap the body of every hooked function in:
if (RUID == EUID) {
    call original function;
} else {
    do my checks;
    call original function;
}
EDIT: The above code could be replaced by its shorter equivalent:
if (RUID != EUID) {
    do my checks;
}
call original function;
But that's actually pretty awful, since I'm hooking almost 50 functions and also using __attribute__((constructor)) and __attribute__((destructor)), and it would mean filling my code with if-else blocks in every single function.
Please bear in mind that ld.so.preload loads the listed libraries before any other library.
I'd like to know if there is any way of just not loading the library based upon the RUID != EUID condition or, alternatively, load the library but skip the hooking code.
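One possible way to keep the branching out of each hook is to evaluate the RUID != EUID condition once, in the library constructor, and to route every hook through a single guard macro. The sketch below only illustrates that idea: tocttou_check_path, GUARD_PATH and hook_access are hypothetical names, and the hook is deliberately not called access() so that the example compiles on its own. In a real preload library it would carry the libc name and forward to the address returned by dlsym(RTLD_NEXT, "access").

```c
#include <stdbool.h>
#include <unistd.h>

/* Set once at load time; true only when real and effective UIDs differ. */
static bool g_checks_enabled;
static int g_checks_run; /* counts how often the placeholder check ran */

__attribute__((constructor))
static void hooks_init(void)
{
    g_checks_enabled = (getuid() != geteuid());
}

/* Hypothetical placeholder for the real TOCTTOU checks. */
static void tocttou_check_path(const char *path)
{
    (void)path;
    g_checks_run++;
}

/* The only place where the RUID != EUID decision is applied. */
#define GUARD_PATH(path) \
    do { if (g_checks_enabled) tocttou_check_path(path); } while (0)

/* Assumption: a real hook would be named access() and would call the
 * original through a pointer obtained from dlsym(RTLD_NEXT, "access"). */
int hook_access(const char *path, int mode)
{
    GUARD_PATH(path);
    return access(path, mode);
}
```

Each of the roughly 50 hooks then needs one GUARD_PATH line instead of its own if-else block, and the UID comparison costs a single cached boolean test per call.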

Since my library is system-wide, I cannot create any wrapper for it.
You want to preload your library only into setuid or setgid programs.
Since the number of such programs on any system is quite small and statically known (you can find them all with a cron job every hour), the easiest solution is to create a wrapper for every such program.
The wrapper can itself be a single very small program that has the exact same setuid / setgid permissions as the "target" program, and performs an equivalent of
/bin/env "LD_PRELOAD=..." target-prog args
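A minimal sketch of what such a wrapper could do, written in C instead of going through env. The function name exec_with_preload and any paths passed to it are assumptions made for illustration. One caveat to verify on the target system: according to ld.so(8), glibc's dynamic linker restricts LD_PRELOAD in secure-execution mode (preload names containing slashes are ignored, and libraries are taken only from the standard search directories and only if they have the set-user-ID bit set), so the library name and its installation would have to satisfy those rules.

```c
#include <stdlib.h>
#include <unistd.h>

/* Sets LD_PRELOAD and replaces the current process with `target`.
 * Returns -1 only if setenv() or execv() fails. */
int exec_with_preload(const char *lib, const char *target, char *const argv[])
{
    if (setenv("LD_PRELOAD", lib, 1) != 0)
        return -1;
    execv(target, argv);
    return -1; /* execv() only returns on error */
}
```

The wrapper's main() would simply pass its own argv on, for example exec_with_preload("libhook.so", "/path/to/real/target-prog", argv), where both names are placeholders.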

Related

Multiple calls to a dll from Dymola

I am implementing a model that needs to call a DLL twice in order to receive specific values from it:
the first call is to set up a system inside the library (for example, a powertrain from a catalogue of components and some design parameters)
the second call is to retrieve the performance of a component from such system (let's say the efficiency of a specific electric machine when used in that powertrain)
These two calls to the dll go together, and may be repeated during simulation time.
So far, I've only managed to interface my model to the dll through separate calls from Modelica external functions (one for the first call, one for the second). However, the state of the system is reset between the first and the second call.
Is there a way in Modelica to load a dll, call the same instance of
it multiple times, and eventually close it when the job is done?
Perhaps it is only possible to achieve such a feature by bundling the
whole functionality in an external function?
Or am I attempting something that just doesn't work, because of some
technical aspects that I am not aware of? (I don't know, perhaps the
way it all gets compiled during translation)
If I understand this correctly, it looks like your external DLL has some kind of object pointer that is returned when you instantiate it, and this needs to be passed in at every subsequent function call to other functions in the DLL (to preserve the state).
So to do this in Modelica, you need to create an external object class. These are used to preserve state externally and have constructors and destructors to manage their memory. You can write small wrapper C functions to interface with your DLL functions that you can directly include in Modelica annotations, or write a wrapper lib.
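On the C side, the wrapper functions for such an external object might look roughly like the sketch below. Everything in it is hypothetical: the Powertrain struct, the function names and the efficiency formula are placeholders, and in a real wrapper each function would forward to the corresponding function of the DLL. On the Modelica side, a class extending ExternalObject would declare functions named constructor and destructor that map to powertrain_create and powertrain_destroy.

```c
#include <stdlib.h>

/* Hypothetical state that must survive between calls. */
typedef struct {
    int    component_id;
    double design_power_kw;
} Powertrain;

/* Constructor: called once when the external object is created. */
void *powertrain_create(int component_id, double design_power_kw)
{
    Powertrain *pt = malloc(sizeof *pt);
    if (pt == NULL)
        return NULL;
    pt->component_id = component_id;
    pt->design_power_kw = design_power_kw;
    return pt;
}

/* Query: receives the same handle on every call, so state is preserved.
 * The formula is a made-up placeholder, not a real machine model. */
double powertrain_efficiency(void *handle, double load_kw)
{
    const Powertrain *pt = handle;
    double ratio = load_kw / pt->design_power_kw;
    if (ratio > 1.0)
        ratio = 1.0;
    return 0.80 + 0.15 * ratio;
}

/* Destructor: called once when the simulation releases the object. */
void powertrain_destroy(void *handle)
{
    free(handle);
}
```

Because the handle is created once and passed to every later call, the set-up call and the query call operate on the same instance instead of a freshly reset one.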
Documentation
https://build.openmodelica.org/Documentation/ModelicaReference.Classes.ExternalObject.html
Simple Example
https://www.claytex.com/tech-blog/external-object-example-detecting-initial-rising-edge/
Detailed Example
https://github.com/modelica-3rdparty/ExternData/releases

How can I "dump" a Function to a file?

For example, I have a function func():
int func (int a, int b) {return a + b;}
Now I want to write it to a file, so that I can use the mmap system call to load it with PROT_EXEC and call it from another program. What should I do?
If you know what signature you need and a static library or the location of a shared library at compile time, you probably just want to include the header and link against the output library. If you want to invoke a function dynamically, you probably want dlopen / dlsym (UNIX) or LoadLibrary / GetProcAddress (Windows) for loading the library dynamically and retrieving the address of the function by name.
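For the UNIX case, a minimal sketch of the dlopen / dlsym route could look like this. The helper name load_unary and the unary_fn typedef are made up for the example, and older glibc versions need the program to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef double (*unary_fn)(double);

/* Looks up `symbol` in the shared library `lib`; returns NULL on failure.
 * On success the handle is intentionally left open so that the returned
 * function pointer stays valid. */
unary_fn load_unary(const char *lib, const char *symbol)
{
    void *handle = dlopen(lib, RTLD_NOW);
    if (handle == NULL)
        return NULL;

    /* The POSIX-recommended way to turn dlsym()'s void* into a function pointer. */
    unary_fn fn;
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL)
        dlclose(handle);
    return fn;
}
```

On a glibc system, load_unary("libm.so.6", "cos") should return a pointer that can be called like the ordinary cos function.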
Note that the cases where you actually need to load a library dynamically (at least explicitly) are pretty rare. This is often used for modular architectures (e.g. "plugins" or "extensions") where individual pieces of the application are distributed separately (which can be achieved more securely using IPC rather than dynamic loading... see my note below). Or for cases where your application is not allowed to include dependencies statically and needs to conditionally supply behavior based on the existence of certain library dependencies in the environment in which it happens to be executing. In most cases, though, you'll simply want to include a header that declares the symbols you need and compile for each target platform (possibly using #if...#else macros if there are symbols that vary across OSes or OS versions).
From a stability, security, and code complexity standpoint, I personally recommend that you avoid dynamic library loading. For core system functionality, it's reasonable to link against a dynamic library, but you'll want to do it in a way where the burden of dynamic loading is entirely on your toolchain (i.e. you shouldn't need to call dlopen or LoadLibrary explicitly). For other functionality, it is almost always better to link statically (assuming you distribute updates when there are security fixes for your dependencies), since this keeps you from being broken by incompatible version updates and spares your users dependency hell (you require version A but some other application requires version B). Modular architectures are often better (and more securely) achieved through inter-process communication (IPC). Dynamically loaded libraries live in the process of the program that loads them, which gives them access to the entire process's virtual memory space. With IPC, each component is a separate process and only has access to the information that the calling process gives it explicitly, which makes it harder for a malicious component to steal data from the caller or from other components, or to cause instability.
The sanest thing if you want this to actually be used in the real world is probably to just compile the source as part of your program on each platform, like a regular function.
Next best is probably a separate process that you talk to rather than merge with.
Semi-sane (but still not a great choice, see our discussion in the other answer) would be making the shared library, like Michael Aaron Safyan said.
But if you want to know how it works just because - say, you want to write your own dynamic linker, or are doing some kind of runtime code generation like a JIT compiler, or if you just wanna know - you can make a raw code file.
To use it, what we'd have to do is similar to what the linker does - load the code at a particular address that it is made to work on and run it. There is position independent code that can run at any address, too.
Let's first get our function compiled and linked, then output into a raw image for a certain address. Assume the function is func in the file func.c and we're using gcc on Linux. (A Windows compiler would have similar options - gcc on Windows is exactly the same, I believe, but something like Digital Mars's C compiler does it differently with the linker command being /BINARY for instance)
Anyway, here's what I ran:
gcc -c func.c # makes func.o
ld func.o --oformat=binary -e func -o func.binary
This generates a file called func.binary. You can disassemble it most easily with ndisasm -b 64 func.binary (or -b 32 if you compiled the C in 32 bit mode) to confirm it looks right - I see an add instruction there, so looks good to me.
If you mmap that file and then call into it... it should work.
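The loading step could be sketched as follows. The helper name map_code_file is an assumption, and the sketch only maps the file as readable and executable; it does not check that the bytes are valid code for the current machine.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps the whole file read+execute. Returns NULL on failure; on success
 * stores the mapping size in *len. */
void *map_code_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *code = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
                      MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (code == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return code;
}
```

Calling the function then comes down to a cast such as int (*f)(int, int) = (int (*)(int, int))code; followed by f(2, 3). That is only valid under the assumptions that func starts at offset 0 of func.binary, that the code does not depend on its load address, and that the platform allows converting an object pointer to a function pointer (POSIX systems do, ISO C does not promise it).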
Problems will be quick to come up though:
If there's more than one function in that file, they'll all be squished together.
The addresses they try to use to call each other may be totally wrong.
Global variables and other static data will be messed up.
And there's more. The operating system uses more complex file formats for executables and libraries for a reason!
To go to the next step, you could consider writing an ELF or PE loader which reads that metadata off a standard file. Of course, once you get into much of this, you'll be doing exactly what the OS provides with dlopen and LoadLibrary.... so unless the goal is to just learn about the guts, just call those functions and call it done!

How to intercept C library calls in windows?

I have a devilish-gui.exe, a devilish.dll and a devilish.h from a C codebase that has been lost.
devilish-gui is still used by the customer, and it uses devilish.dll.
devilish.h is poorly documented in a 30-pages pdf: it exposes a few C functions that behave in very different ways according to the values in the structs provided as arguments.
Now, I have to use devilish.dll to write a new devilish-webservice. No, I can't rewrite it.
The documentation is almost useless, but since I have devilish-gui.exe I'd like to write a different implementation of devilish.h that logs each function call and its arguments to a file and then calls the original DLL function. Something similar to what ltrace does on Linux, but specialized for this weird library.
How can I write such "intercepting" dll on windows and inject it between devilish.dll and devilish-gui.exe?
A couple of possibilities:
Use Detours.
If you put your implementation of devilish.dll in the same directory as devilish-gui.exe, and move the real implementation of devilish.dll into a subdirectory, Windows will load your implementation instead of the real one. Your implementation can then forward to the real one. I'm assuming that devilish-gui isn't hardened against search path attacks.
Another approach would be to use IntelliTrace to collect a trace log of all the calls into devilish.dll.

C callbacks and non-Go threads

How does one call Go code in C from threads that weren't created by Go?
What do I assign to a C function pointer such that threads not created by Go can call that pointer and enter into Go code?
Update0
I don't want to use SWIG.
The callbacks will be coming from threads Go hasn't seen before. Neither cgo/life nor anything in pkg/runtime demonstrates this behaviour AFAICT.
You can do this, but the solution is relatively slow (about 22µs per call on my machine).
The answer is for the C code to use C thread primitives to communicate with another goroutine that will actually run the callback.
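The hand-off idea can be illustrated in plain C with pthreads. This is only an analogy and not the implementation of the package mentioned below: a dispatcher thread stands in for the goroutine, and all names (call_on_dispatcher, dispatcher_main, stop_dispatcher, add_one) are invented for the sketch. A foreign thread never runs the callback itself; it posts the request into a one-slot mailbox and waits until the dispatcher has executed it.

```c
#include <pthread.h>
#include <stddef.h>

typedef void (*callback_fn)(void *);

/* One-slot mailbox shared by foreign threads and the dispatcher. */
static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static callback_fn pending_fn;      /* NULL while the slot is empty */
static void *pending_arg;
static unsigned long posted, completed;
static int stop;

/* Called from any thread: hand f(arg) to the dispatcher and wait for it. */
void call_on_dispatcher(callback_fn f, void *arg)
{
    pthread_mutex_lock(&mu);
    while (pending_fn != NULL)          /* wait until the slot is free */
        pthread_cond_wait(&cv, &mu);
    pending_fn = f;
    pending_arg = arg;
    unsigned long ticket = ++posted;
    pthread_cond_broadcast(&cv);        /* wake the dispatcher */
    while (completed < ticket)          /* block until our request has run */
        pthread_cond_wait(&cv, &mu);
    pthread_mutex_unlock(&mu);
}

/* Thread body of the dispatcher (the stand-in for the goroutine). */
void *dispatcher_main(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&mu);
    for (;;) {
        while (pending_fn == NULL && !stop)
            pthread_cond_wait(&cv, &mu);
        if (pending_fn == NULL)         /* stop requested, nothing queued */
            break;
        callback_fn f = pending_fn;
        void *arg = pending_arg;
        pthread_mutex_unlock(&mu);
        f(arg);                         /* runs on the dispatcher thread */
        pthread_mutex_lock(&mu);
        pending_fn = NULL;
        completed++;
        pthread_cond_broadcast(&cv);
    }
    pthread_mutex_unlock(&mu);
    return NULL;
}

void stop_dispatcher(void)
{
    pthread_mutex_lock(&mu);
    stop = 1;
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&mu);
}

/* Example callback: increments the int that arg points to. */
void add_one(void *arg)
{
    ++*(int *)arg;
}
```

Every call pays for two thread wake-ups, which is consistent with this kind of solution being relatively slow per call.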
I have created a Go package that provides this functionality: rog-go.googlecode.com/hg/exp/callback.
There is an example package demonstrating its use here. The example demonstrates a call back to an arbitrary Go closure from a thread created outside of the Go runtime. Another example is here. This demonstrates a typical C callback interface and layers a Go callback on top of it.
To try out the first example:
goinstall rog-go.googlecode.com/hg/exp/example/looper
cd $GOROOT/src/pkg/rog-go.googlecode.com/hg/exp/example/looper
gotest
To try out the second example:
goinstall rog-go.googlecode.com/hg/exp/example/event
cd $GOROOT/src/pkg/rog-go.googlecode.com/hg/exp/example/event
gotest
Both examples assume that pthreads are available. Of course, this is just a stop-gap measure until cgo is fixed, but the technique for calling arbitrary Go closures in a C callback will be applicable even then.
Here is the documentation for the callback package:
PACKAGE
package callback
import "rog-go.googlecode.com/hg/exp/callback"
VARIABLES
var Func = callbackFunc
Func holds a pointer to the C callback function.
When called, it calls the provided function f in
a Go context with the given argument.
It can be used by first converting it to a function pointer
and then calling from C.
Here is an example that sets up the callback function:
//static void (*callback)(void (*f)(void*), void *arg);
//void setCallback(void *c){
// callback = c;
//}
import "C"
import "rog-go.googlecode.com/hg/exp/callback"
func init() {
C.setCallback(callback.Func)
}
I'll assume you mean from C code compiled with gcc?
IIRC, this either can't be done or can't easily be done using 6g+cgo and friends. Go uses a different calling convention (as well as the segmented stacks and such).
However, you can write C code for [685]c (or even [685]a) and call into go easily using package·function() (you can even call methods IIRC). See the Source of the runtime package for examples.
Update:
Coming back to this question after the update, and giving it some more thought. This can't be done in a standard fashion using 6c or cgo. Especially because the threads are not started by the go runtime, the current implementation would fail. The scheduler would suddenly have a thread under its control that it does not know about; additionally, that thread would be missing some thread-local variables the go runtime uses for managing stacks and some other things. Also, if the go function returns a value (or several) the C code can't access it on the currently supported platforms, as go returns values on the stack (you could access them with assembly though). With these things in mind, I do believe you could still do this using channels. It would require your C code to be a little too intimate with the inner workings of the go runtime, but it would work for a given implementation. While using channels may not be the solution you're looking for, it could possibly fit more nicely with the concepts of Go than callbacks. If your C code reimplemented at least the sending methods in the channel implementation (that code is written for 6c, so it would have to be adapted for gcc most likely, and it calls the go runtime, which we've determined can't be done from a non-go thread), you should be able to lock the channel and push a value to it. The go scheduler can continue to manage its own threads, but now it can receive data from other threads started in C.
Admittedly, it's a hack; I haven't looked closely enough, but it would probably take a few other hacks to get it working (I believe the channels themselves maintain a list of the goroutines that are waiting on them [EDIT: confirmed: runtime·ready(gp);], so you'd need something in your go code to wake up the receiving channel or to guarantee the go code won't receive on the channel until you've already pushed a value). However, I can't see any reason this can't work, whereas there are definite reasons that running code generated by 6g on a thread created in C can't.
My original answer still holds though: barring an addition to the language or runtime, this can't yet be done the way you'd like (I'd love to be proven wrong here).
You can find a real-world application of rog's callback package in these bindings for the PortAudio audio I/O library: http://code.google.com/p/portaudio-go/. Might make it easier to understand.
(Thanks for implementing that, rog. It's just what I needed!)

Some general C questions

I am trying to fully understand the process from writing code in some language to its execution by the OS. In my case, the language would be C and the OS would be Windows. So far, I have read many different articles, but I am not sure whether I understand the process correctly, and I would like to ask whether you know some good articles on the subjects I couldn't find.
So, what I think I know about C (and basically other languages):
The C compiler itself handles only data types, basic math operations, pointer operations, and work with functions. By work with functions I mean how to pass arguments to a function and how to get its output. During compilation, a function call is replaced by pushing arguments onto the stack, and then, if the function is not inline, its call is replaced by a symbol for the linker. The linker then finds the function definition and replaces the symbol with the jump address of that function (and, of course, the code later jumps back to the program).
If the above is generally true and I've got it right, where in the final .exe file does the linker actually place the functions? After the main() function? And what creates the .exe header: the compiler or the linker?
Now, the additional capabilities of C, today known as the C standard library, are a set of functions and their declarations that other programmers wrote to extend and simplify the use of the C language. But functions like printf() were (or could be?) written in a different language, or in assembler. And here comes my next question: can the printf() function, for example, be written in pure C without the use of assembler?
I know this is quite a big question, but I mostly just want to know whether I am right or not. And trust me, I have read lots of articles on the web, and I would not ask if I could find this information together in one place, in one article. Instead I have to gather the information piece by piece, so I am not sure if I am right. Thanks.
I think that you're exposed to some information that is less relevant as a beginning C programmer and that might be confusing you - part of the goal of using a higher level language like this is to not have to initially think about how this process works. Over time, however, it is important to understand the process. I think you generally have the right understanding of it.
The C compiler merely takes C code and generates object files that contain machine language. Most of the object file is taken up by the content of the functions. A simple function call in C, for example, would be represented in the compiled form as low-level operations that push things onto the stack, change the instruction pointer, etc.
The C library and any other libraries you would use are already available in this compiled form.
The linker is the thing that combines all the relevant object files, resolves all the dependencies (e.g., one object file calling a function in the standard library), and then creates the executable.
As for the language libraries are written in: Think of every function as a black box. As long as the black box has a standard interface (the C calling convention; that is, it takes arguments in a certain way, returns values in a certain way, etc.), how it is written internally doesn't matter. Most typically, the functions would be written in C or directly in assembly. By the time they make it into an object file (or as a compiled library), it doesn't really matter how they were initially created, what matters is that they are now in the compiled machine form.
The format of an executable depends on the operating system, but much of the body of the executable in Windows is very similar to that of the object files. Imagine that someone merged together all the object files and then added some glue. The glue does loading-related stuff and then invokes main(). When I was a kid, for example, people got a kick out of "changing the glue" to add another function before main() that would display a splash screen with their name.
One thing to note, though is that regardless of the language you use, eventually you have to make use of operating system services. For example, to display stuff on the screen, to manage processes, etc. Most operating systems have an API that is also callable in a similar way, but its contents are not included in your EXE. For example, when you run your browser, it is an executable, but at some point there is a call to the Windows API to create a window or to load a font. If this was part of your EXE, your EXE would be huge. So even in your executable, there are "missing references". Usually, these are addressed at load time or run time, depending on the operating system.
I am a new user and this system does not allow me to post more than one link. To get around that restriction, I have posted some idea at my blog http://zhinkaas.blogspot.com/2010/04/how-does-c-program-work.html. It took me some time to get all links, but in totality, those should get you started.
The compiler is responsible for translating all your functions written in C into machine code, which it saves in object files; the linker then combines these into the final image (an EXE or DLL, for example). So, if you write a .c file that has a main function and a few other functions, the compiler will translate all of them and they end up together in the EXE file. Then, when you run the file, the loader (which is part of the OS) knows where to start running so that the main function gets called. Otherwise, the main function is just like any other function for the compiler.
The linker is responsible for resolving any references between functions and variables in one object file with the references in other files. For example, if you call printf(), since you do not define the function printf() yourself, the linker is responsible for making sure that the call to printf() goes to the right system library where printf() is defined. This is done at link time (or, for functions that live in DLLs, when the program is loaded).
printf() can indeed be written in pure C. What it does is invoke a system call in the OS, which knows how to actually send characters to the standard output (such as a terminal window). When you call printf() in your program, the linker is responsible for linking your call to the printf() function in the standard C library. When the function is called at run time, printf() formats the arguments properly and then invokes the appropriate OS system call to actually display the characters.
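To make the printf() point concrete, here is a heavily reduced sketch of a printf-like function written only in C. It assumes a POSIX system, where write() is the thin C wrapper around the output system call (on Windows the equivalent step would go through the Windows API instead), and it supports only %d, %s and %%. The names mini_format and mini_printf are invented for the example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <unistd.h>

/* Appends c if there is room, always leaving space for the final NUL. */
static size_t put_char(char *buf, size_t size, size_t pos, char c)
{
    if (pos + 1 < size)
        buf[pos++] = c;
    return pos;
}

/* Formats into buf; unknown conversion specifiers are silently dropped. */
static size_t mini_vformat(char *buf, size_t size, const char *fmt, va_list ap)
{
    size_t pos = 0;
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            pos = put_char(buf, size, pos, *fmt);
            continue;
        }
        fmt++;
        if (*fmt == '\0') {
            break; /* lone '%' at the end of the format string */
        } else if (*fmt == 's') {
            for (const char *s = va_arg(ap, const char *); *s != '\0'; s++)
                pos = put_char(buf, size, pos, *s);
        } else if (*fmt == 'd') {
            int v = va_arg(ap, int);
            unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[12];
            int n = 0;
            do {
                digits[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            if (v < 0)
                pos = put_char(buf, size, pos, '-');
            while (n > 0)
                pos = put_char(buf, size, pos, digits[--n]);
        } else if (*fmt == '%') {
            pos = put_char(buf, size, pos, '%');
        }
    }
    if (size > 0)
        buf[pos] = '\0';
    return pos;
}

/* snprintf-like front end: returns the number of characters stored. */
size_t mini_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    size_t len = mini_vformat(buf, size, fmt, ap);
    va_end(ap);
    return len;
}

/* printf-like front end: format in C, then hand the bytes to the OS. */
int mini_printf(const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    size_t len = mini_vformat(buf, sizeof buf, fmt, ap);
    va_end(ap);
    return (int)write(STDOUT_FILENO, buf, len);
}
```

All the formatting is ordinary C; only the final write() call crosses into the operating system, which matches the split between formatting and output described above.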

Resources