Hooking in C and windows - c

I'm looking for a quick guide to basic DLL hooking in Windows with C, but all the guides I can find are either not C or not Windows.
(The DLL is not part of Windows; it belongs to a third-party program.)
I understand the principle, but I don't know how to go about it.
I have pre-existing source code in C++ that shows what I need to hook into, but I don't have any libraries for C, and I don't know how to hook from scratch.

The Detours license terms are quite restrictive.
If you merely want to hook certain functions of a DLL it is often cheaper to use a DLL-placement attack on the application whose DLL you want to hook. In order to do this, provide a DLL with the same set of exports and forward those that you don't care about and intercept the rest. Whether that's C or C++ doesn't really matter. This is often technically feasible even with a large number of exports but has its limitations with exported data and if you don't know or can't discern the calling convention used.
If you must use hooking, there are numerous ways to do it. One is to write a launcher that rewrites the IAT (prepopulated by the loader) to point to your code while the main thread of the launched application is still suspended (see the respective CreateProcess flag, CREATE_SUSPENDED). Otherwise you are likely going to need at least a little assembly knowledge to get the jumps correct. There are plenty of liberally licensed disassembler engines out there that will allow you to calculate the proper offsets for patching (because you don't want to patch the middle of a multi-byte opcode, for example).
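As an illustration of "getting the jumps correct", here is a minimal sketch of how the bytes of a 5-byte x86 near jump (opcode 0xE9 followed by a 32-bit relative offset) could be computed. The function name encode_jmp_rel32 and its parameters are made up for this example. It only fills a buffer; actually writing the bytes into a target (changing page protection, making sure you overwrite whole instructions) is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An x86 near jump is one opcode byte (0xE9) plus a 32-bit relative offset. */
#define JMP_REL32_SIZE 5

/*
 * Encode "jmp target" as it would appear at address `from`.
 * Returns 0 on success, or -1 if the distance does not fit in 32 bits
 * (possible in a 64-bit process). Nothing is written on failure.
 * Hypothetical helper, not part of any library.
 */
int encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE], uint64_t from, uint64_t target)
{
    assert(out != NULL);

    /* The offset is measured from the end of the jump, not its start. */
    int64_t rel = (int64_t)(target - (from + JMP_REL32_SIZE));
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    uint32_t bits = (uint32_t)(int32_t)rel;
    out[0] = 0xE9;
    out[1] = (uint8_t)(bits & 0xFF);          /* little-endian byte order */
    out[2] = (uint8_t)((bits >> 8) & 0xFF);
    out[3] = (uint8_t)((bits >> 16) & 0xFF);
    out[4] = (uint8_t)((bits >> 24) & 0xFF);
    return 0;
}
```

For example, a jump placed at 0x1000 that should land on 0x2000 gets the offset 0x2000 - 0x1005 = 0xFFB, so the buffer holds E9 FB 0F 00 00.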
You may want to edit your question again to include what you wrote in the comments (keyword: "DLL hooking").

loading DLLs by LoadLibrary()
This is well known bad practice.
You might want to look up "witch" or "hctiw", the infamous malware dev. There's a reason he's so infamous: he loaded DLLs with LoadLibrary(). Try to refrain from bad practice like that.

Related

How to manage features that depend on libraries that may or may not be installed?

I'm writing some new functionality for a graphics program (written mostly in C, with small parts in C++). The new functionality will make use of libgmic. Some users of the program will have libgmic installed, quite a lot will not. The program is monolithic: I'm not writing a plugin, this will be part of the main program. Compiling the program with the right headers is easy, but I need to be able to check at runtime whether the library is installed on the user's system or not in order to enable / disable the particular menu item so that users without the library installed can't invoke this piece of functionality and crash the program. What's the best way of going about this?
You need to load the library at runtime with dlopen (or LoadLibrary on Windows) instead of linking to it, get function pointers with dlsym (GetProcAddress on Windows), and use them instead of the function prototypes from the headers. Otherwise your program will simply fail to start up without the library (or crash, in some cases).
Some libraries support such usage well, such as providing types for all the functions you need. With others you’re on your own but that’s still possible.
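A minimal sketch of the runtime check described above, assuming you only want to know whether a library can be loaded and exports a given function (for example, to enable or disable a menu item). The wrapper names lib_open, lib_symbol, lib_close and optional_feature_available are invented for this example; the library file name and function name you pass in depend on the library and platform and are not taken from libgmic.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
typedef HMODULE lib_handle;
static lib_handle lib_open(const char *name) { return LoadLibraryA(name); }
static void *lib_symbol(lib_handle h, const char *sym) { return (void *)GetProcAddress(h, sym); }
static void lib_close(lib_handle h) { FreeLibrary(h); }
#else
#include <dlfcn.h>
typedef void *lib_handle;
static lib_handle lib_open(const char *name) { return dlopen(name, RTLD_NOW | RTLD_LOCAL); }
static void *lib_symbol(lib_handle h, const char *sym) { return dlsym(h, sym); }
static void lib_close(lib_handle h) { dlclose(h); }
#endif

/*
 * Returns 1 if the library can be loaded and exports func_name, else 0.
 * Hypothetical helper: a real program would keep the handle open and
 * call the function through the returned pointer instead of closing it.
 */
int optional_feature_available(const char *lib_name, const char *func_name)
{
    assert(lib_name != NULL && func_name != NULL);

    lib_handle h = lib_open(lib_name);
    if (!h)
        return 0;                       /* library not installed or not loadable */

    int ok = lib_symbol(h, func_name) != NULL;
    lib_close(h);
    return ok;
}
```

On older glibc versions you may need to link with -ldl. The menu item would then be enabled only when optional_feature_available returns 1 for the library and every function you intend to call.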

Is it possible to build a C standard library that is agnostic to both the OS and compiler being used?

First off, I know that any such library would need to have at least some interface shimming to interact with system calls or board support packages or whatever (for example, newlib has a limited and well-defined interface to the OS/BSP that doesn't assume intimate access to the OS). I can also see where there will be a need for something similar for interacting with the compiler (e.g. details of some things left as "implementation defined" by the standard).
However, most libraries I've looked into end up being way more wedded to the OS and compiler than that. The OS assumptions seem to presume access to whatever parts they want from a specific OS environment, and the compiler interactions are practically in collusion with the compiler implementation itself.
So, the basic questions end up being:
Would it be possible (and practical) to implement a full C standard library that only has a very limited and well defined set of prerequisites and/or interface for being built by any compiler and run on any OS?
Does any such implementation exist? (I'm not asking for a recommendation of one to use, though I would be interested in examples that I can make my own evaluations of.)
If either of the above answers is "no"; then why? What fundamentally makes it impossible or impractical? Or why hasn't anyone bothered?
Background:
What I'm working on that has led me to this rabbit hole is an attempt to make a fully versioned, fully hermetic build chain. The goal is that I should be able to build a project without any dependencies on the local environment (beyond access to the needed source control repo and, say, "a valid posix shell" or "language-agnostic-build-tool-of-choice is installed and runs"). Given those dependencies, the build should do the exact same thing, with the same compiler and libraries, regardless of which versions of which compilers and libraries are or are not installed. Universal, repeatable, byte-identical output is the target I want to move towards.
Causing the compiler-of-choice to be run from a repo isn't too hard, but I've yet to find a C standard library that "works right out of the box". They all seem to assume some magic undocumented interface between them and the compiler (e.g. __gnuc_va_list), or at best want to do some sort of config-probing of the hosting environment, which would be something between useless and counterproductive for my goals.
This looks to be a bottomless rabbit hole that I'm reluctant to start down without first trying to locate alternatives.

Does NtDll really export C runtime functions, and can I use these in my application?

I was looking at the NtDll export table on my Windows 10 computer, and I found that it exports standard C runtime functions, like memcpy, sprintf, strlen, etc.
Does that mean that I can call them dynamically at runtime through LoadLibrary and GetProcAddress? Is this guaranteed to be the case for every Windows version?
If so, is it possible to drop the C runtime library altogether (by just using the CRT functions from NtDll), therefore making my program smaller?
There is absolutely no reason to call these undocumented functions exported by NtDll. Windows exports all of the essential C runtime functions as documented wrappers from the standard system libraries, namely Kernel32. If you absolutely cannot link to the C Runtime Library*, then you should be calling these functions. For memory, you have the basic HeapAlloc and HeapFree (or perhaps VirtualAlloc and VirtualFree), ZeroMemory, FillMemory, MoveMemory, CopyMemory, etc. For string manipulation, the important CRT functions are all there, prefixed with an l: lstrlen, lstrcat, lstrcpy, lstrcmp, etc. The odd man out is wsprintf (and its brother wvsprintf), which not only has a different prefix but also doesn't support floating-point values (Windows itself had no floating-point code in the early days when these functions were first exported and documented.) There are a variety of other helper functions, too, that replicate functionality in the CRT, like IsCharLower, CharLower, CharLowerBuff, etc.
Here is an old knowledge base article that documents some of the Win32 Equivalents for C Run-Time Functions. There are likely other relevant Win32 functions that you would probably need if you were re-implementing the functionality of the CRT, but these are the direct, drop-in replacements.
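Since wsprintf has no floating-point conversions, one common workaround is to scale the value and print it with integer conversions only. The following is a minimal sketch of that idea, not something from the knowledge base article. The function name format_fixed3 is made up, and snprintf is used purely as a portable stand-in so the example runs anywhere; in a CRT-free Windows build you would pass the same integer arguments to wsprintf instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format `value` with exactly three decimals using only integer
 * conversions (%s and %lu), so no floating-point support is needed in
 * the formatting routine itself. Hypothetical helper for illustration.
 * Assumes |value| * 1000 fits in an unsigned long.
 * Returns the number of characters snprintf reports.
 */
int format_fixed3(char *buf, size_t size, double value)
{
    assert(buf != NULL && size > 0);

    const char *sign = "";
    if (value < 0) {
        sign = "-";
        value = -value;
    }

    /* Scale to thousandths and round to the nearest integer. */
    unsigned long scaled = (unsigned long)(value * 1000.0 + 0.5);

    return snprintf(buf, size, "%s%lu.%03lu", sign, scaled / 1000, scaled % 1000);
}
```

Note that the scaling itself still uses floating-point arithmetic in your own code; only the formatting call is kept integer-only.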
Some of these are absolutely required by the infrastructure of the operating system, and would be called internally by any CRT implementation. This category includes things like HeapAlloc and HeapFree, which are the responsibility of the operating system. A runtime library only wraps those, providing a nice standard-C interface and some other niceties on top of the nitty-gritty OS-level details. Others, like the string manipulation functions, are just exported wrappers around an internal Windows version of the CRT (except that it's a really old version of the CRT, fixed back at some time in history, save for possibly major security holes that have gotten patched over the years). Still others are almost completely superfluous, or seem so, like ZeroMemory and MoveMemory, but are actually exported so that they can be used from environments where there is no C Runtime Library, like classic Visual Basic (VB 6).
It is also interesting to point out that many of the "simple" C Runtime Library functions are implemented by Microsoft's (and other vendors') compiler as intrinsic functions, with special handling. This means that they can be highly optimized. Basically, the relevant object code is emitted inline, directly in your application's binary, avoiding the need for a potentially expensive function call. Allowing the compiler to generate inlined code for something like strlen, that gets called all the time, will almost undoubtedly lead to better performance than having to pay the cost of a function call to one of the exported Windows APIs. There is no way for the compiler to "inline" lstrlen; it gets called just like any other function. This gets you back to the classic tradeoff between speed and size. Sometimes a smaller binary is faster, but sometimes it's not. Not having to link the CRT will produce a smaller binary, since it uses function calls rather than inline implementations, but probably won't produce faster code in the general case.
* However, you really should be linking to the C Runtime Library bundled with your compiler, for a variety of reasons, not the least of which is security updates that can be distributed to all versions of the operating system via updated versions of the runtime libraries. You have to have a really good reason not to use the CRT, such as if you are trying to build the world's smallest executable. And not having these functions available will only be the first of your hurdles. The CRT handles a lot of stuff for you that you don't normally even have to think about, like getting the process up and running, setting up a standard C or C++ environment, parsing the command line arguments, running static initializers, implementing constructors and destructors (if you're writing C++), supporting structured exception handling (SEH, which is used for C++ exceptions, too) and so on. I have gotten a simple C app to compile without a dependency on the CRT, but it took quite a bit of fiddling, and I certainly wouldn't recommend it for anything remotely serious. Matthew Wilson wrote an article a long time ago about Avoiding the Visual C++ Runtime Library. It is largely out of date, because it focuses on the Visual C++ 6 development environment, but a lot of the big picture stuff is still relevant. Matt Pietrek wrote an article about this in the Microsoft Journal a long while ago, too. The title was "Under the Hood: Reduce EXE and DLL Size with LIBCTINY.LIB". A copy can still be found on MSDN and, in case that becomes inaccessible during one of Microsoft's reorganizations, on the Wayback Machine. (Hat tip to IInspectable and Gertjan Brouwer for digging up the links!)
If your concern is just the need to distribute the C Runtime Library DLL(s) alongside your application, you can consider statically linking to the CRT. This embeds the code into your executable, and eliminates the requirement for the separate DLLs. Again, this bloats your executable, but does make it simpler to deploy without the need for an installer or even a ZIP file. The big caveat of this, naturally, is that you cannot benefit from incremental security updates to the CRT DLLs; you have to recompile and redistribute the application to get those fixes. For toy apps with no other dependencies, I often choose to statically link; otherwise, dynamically linking is still the recommended scenario.
There are some C runtime functions in NtDll. According to Windows Internals these are limited to string manipulation functions. There are other equivalents such as using HeapAlloc instead of malloc, so you may get away with it depending on your requirements.
Although these functions are acknowledged by Microsoft publications and have been used for many years by the kernel programmers, they are not part of the official Windows API, and you should not use them for anything other than toy or demo programs, as their presence and function may change.
You may want to read a discussion of the option for doing this for the Rust language here.
Does that mean that I can call them dynamically at runtime through LoadLibrary and GetProcAddress?
Yes. Even better, why not link against ntdll.lib (or ntdllp.lib) so that the binding to ntdll happens at link time? After that you can call these functions directly, without any GetProcAddress.
Is this guaranteed to be the case for every Windows version?
From NT4 to Windows 10, ntdll has contained many C runtime functions, but the set differs between versions; it usually grows from version to version. Some of them are less capable than their msvcrt.dll counterparts. For example, printf from ntdll does not support floating-point formats. In general, though, the functionality is the same.
is it possible to drop the C runtime library altogether (by just using the CRT functions from NtDll), therefore making my program smaller?
Yes, this is 100% possible.

Decompile a c dll to use pinvoke on

Can you decompile a c dll to use pinvoke on or use reflector?
How do I get the method names and signatures?
Simply put there is no trivial way to do what you want. You can use a disassembler library such as distorm to disassemble the code around the exported entry points, though. There are some heuristics one can use, but many of those will only work with 32bit calling conventions (__stdcall and __cdecl) in particular. Personally I find the Python bindings for it useful, but libdasm can do the same.
Any other tool with disassembler capabilities will be of great value, such as OllyDbg or Immunity Debugger.
Note: if you have a program that already calls the DLL in question, it is most of the time very worthwhile to run that under a debugger (of course only if the code can be trusted, but your question basically implies that) and set breakpoints at the exported functions. From that point on you can infer a lot more from the runtime behavior and the stack contents of the running target. However, this will still be tricky - particularly with __cdecl where a function may take an arbitrary amount of parameters. In such a case you'd have to sift through the calling program for xrefs to the respective function and infer from the stack cleanup following the call how many parameters/bytes it discards. Of course looking at the push instructions before the call will also have some value, though it requires a little experience especially when calls are nested and you have to discern which push belongs to which call.
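The stack-cleanup heuristic for __cdecl can be sketched as follows. The idea is to look at the bytes right after a call instruction for `add esp, N` (encoded as 83 C4 imm8 or 81 C4 imm32 in 32-bit x86) and divide N by 4. The function names are made up for this example, and it is deliberately naive: it assumes every parameter takes exactly 4 bytes, it does not recognise cleanup done with `pop ecx`, and it will miss cases where the compiler merges the cleanup of several calls into one instruction further down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Given the bytes that immediately follow a CALL in 32-bit x86 code,
 * return how many bytes the caller removes from the stack, or -1 if
 * no recognised cleanup instruction starts at that position.
 * Hypothetical helper for illustration only.
 */
long cdecl_cleanup_bytes(const uint8_t *code, size_t len)
{
    assert(code != NULL || len == 0);

    /* add esp, imm8  ->  83 C4 ib  (imm8 is sign-extended) */
    if (len >= 3 && code[0] == 0x83 && code[1] == 0xC4)
        return code[2] < 0x80 ? (long)code[2] : -1;

    /* add esp, imm32 ->  81 C4 id  (little-endian immediate) */
    if (len >= 6 && code[0] == 0x81 && code[1] == 0xC4) {
        uint32_t imm = (uint32_t)code[2]
                     | ((uint32_t)code[3] << 8)
                     | ((uint32_t)code[4] << 16)
                     | ((uint32_t)code[5] << 24);
        return imm <= 0x7FFFFFFFu ? (long)imm : -1;
    }

    return -1;
}

/* Guess the parameter count, assuming each parameter occupies 4 bytes. */
long cdecl_param_count(const uint8_t *code, size_t len)
{
    long bytes = cdecl_cleanup_bytes(code, len);
    if (bytes < 0 || bytes % 4 != 0)
        return -1;
    return bytes / 4;
}
```

For instance, the bytes 83 C4 0C after a call mean `add esp, 12`, which suggests three 4-byte parameters. A result of -1 only means this particular pattern was not found, not that the function takes no parameters.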
Basically you will have to develop a minimal set of heuristics matching your case, unless you have already licensed one of the expensive tools (and know how to wield them) that come with their own heuristics that have usually been fine-tuned for a long time.
If you happen to own an IDA Pro (or Hex-Rays plugin) license already you should use that, of course. Also, the freeware versions of IDA, although lagging behind, can handle 32bit x86 PE files (which includes DLLs, of course), but the license may be an obstacle here depending on the project you're working on ("no commercial use allowed").
You can use dependency walker.
http://www.dependencywalker.com/
You can find the exported function names with dumpbin or Dependency Walker. But to know how to call the functions you really need a header file and some documentation. If you don't have those then you will have to reverse engineer the DLL and that is a very challenging task.

Best way to implement plugin framework - are DLLs the only way (C/C++ project)?

Introduction:
I am currently developing document classifier software in C/C++, and I will be using a naive Bayesian model for classification. But I want users to be able to use any algorithm that they want (or that I want in the future), so I separated the algorithm part of the architecture into a plugin that is attached to the main app at app start-up. That way any user can write his own algorithm as a plugin and use it with my app.
Problem Statement:
The way I am intending to develop this is to have each of the algorithms that user wants to use to be made into a DLL file and put into a specific directory. And at the start, my app will search for all the DLLs in that directory and load them.
My Questions:
(1) What if malicious code is packaged as a DLL (with the same functions mandated by the plugin framework) and put into my plugins directory? In that case, my app will think that it's a plugin, pick it up, and call its functions, so the malicious code can easily bring my entire app down (in the worst case it could turn my app into a malicious code launcher!).
(2) Is using DLLs the only way available to implement the plugin design pattern? (Not only for fear of malicious plugins; it's a generic question out of curiosity :) )
(3) I think a lot of software is written with a plugin model for extensibility. If so, how does it defend against such attacks?
(4) In general what do you think about my decision to use plugin model for extendability (do you think I should look at any other alternatives?)
Thank you
-MicroKernel :)
(1) Do not worry about malicious plugins. If somebody managed to sneak a malicious DLL into that folder, they probably also have the power to execute stuff directly.
(2) As an alternative to DLLs, you could hook up a scripting language like Python or Lua, and allow scripted plugins. But maybe in this case you need the speed of compiled code?
For embedding Python, see here. The process is not very difficult. You can link statically to the interpreter, so users won't need to install Python on their system. However, any non-builtin modules will need to be shipped with your application.
However, if the language does not matter much to you, embedding Lua is probably easier because it was specifically designed for that task. See this section of its manual.
(3) See 1. They don't.
(4) Using a plugin model sounds like a fine solution, provided that a lack of extensibility really is a problem at this point. It might be easier to hard-code your current model, and add the plugin interface later, if it turns out that there is actually a demand for it. It is easy to add, but hard to remove once people have started using it.
Malicious code is not the only problem with DLLs. Even a well-meaning DLL might contain a bug that could crash your whole application or gradually leak memory.
Loading a module in a high-level language somewhat reduces the risk. If you want to learn about embedding Python for example, the documentation is here.
Another approach would be to launch the plugin in a separate process. It does require a bit more effort on your part to implement, but it's much safer. The separate process approach is used by Google's Chrome web browser, and they have a document describing the architecture.
The basic idea is to provide a library for plugin writers that includes all the logic for communicating with the main app. That way, the plugin author has an API that they use, just as if they were writing a DLL. Wikipedia has a good list of ways for inter-process communication (IPC).
1) If there is a malicious dll in your plugin folder, you are probably already compromised.
2) No, you can load assembly code dynamically from a file, but this would just be reinventing the wheel, just use a DLL.
3) Firefox extensions don't, not even with its JavaScript plugins. Everything else I know uses native code from dynamic libraries, so safety is impossible to guarantee. Then again, Chrome has NaCl, which does extensive analysis on the binary code and rejects it if it can't be 100% sure it doesn't violate bounds and whatnot, although I'm sure they will have more and more vulnerabilities as time passes.
4) Plugins are fine, just restrict them to trusted people. Alternatively, you could use a safe language like Lua, Python, Java, etc., and load a file into that language, but restrict it to a subset of the API that won't harm your program or environment.
(1) Can you use OS security facilities to prevent unauthorized access to the folder where the DLL's are searched or loaded from? That should be your first approach.
Otherwise: run a threat analysis - what's the risk, what are known attack vectors, etc.
(2) Not necessarily. It is the most straightforward if you want compiled plugins, which is mostly a question of performance, access to OS functions, etc. As mentioned already, consider scripting languages.
(3) Usually by writing "to prevent malicious code execution, restrict access to the plugin folder".
(4) There's quite some additional cost, even when using a plugin framework you are not yet familiar with. It increases the cost of:
the core application (plugin functionality)
the plugins (much higher isolation)
installation
debugging + diagnostics (bugs that occur only with a certain combination of plugins)
administration (users must know of, and manage plugins)
That pays only if
installing/updating the main software is much more complex than updating the plugins
individual components need to be updated individually (e.g. a user may combine different versions of plugins)
other people develop plugins for your main application
(There are other benefits of moving code into DLL's, but they don't pertain to plugins as such)
What if a malicious code is made as a DLL
Generally, if you do not trust a DLL, you can't load it safely, one way or another.
This would be true for almost any other language, even an interpreted one.
Java and some other languages work very hard to limit what the user can do, and this works only because they run in a virtual machine.
So no. Plug-ins loaded as DLLs can come from trusted sources only.
Is using DLLs the only way available to implement plugin design pattern?
You may also embed an interpreter in your code; for example, GIMP allows writing plugins in Python.
But be aware that this would be much slower because of the nature of any interpreted language.
We have a product very similar in that it uses modules to extend functionality.
We do two things:
We use BPL files, which are DLLs under the covers. This is a specific technology from Borland/Codegear/Embarcadero within C++ Builder. We take advantage of some RTTI-type features to publish a simple API similar to main(argv[]), so any number of parameters can be pushed onto the stack and popped off by the DLL.
We also embed Perl into our application for things that are more business logic in nature.
Our software is an accounting/ERP suite.
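To give an idea of what an argv-style module API like this might look like in plain C, here is a minimal sketch. It is not the BPL/RTTI mechanism described above; all names (module_entry_fn, struct module, call_module and the two example modules) are invented for illustration, and the modules live in an in-process table so the example is self-contained. In a real host the entry pointers would come from GetProcAddress or dlsym on each loaded module.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry-point signature shared by every module. */
typedef int (*module_entry_fn)(int argc, const char *const argv[]);

struct module {
    const char     *name;
    module_entry_fn entry;
};

/* Example module: returns how many arguments it received. */
static int count_args(int argc, const char *const argv[])
{
    (void)argv;
    return argc;
}

/* Example module: returns the combined length of all arguments. */
static int total_length(int argc, const char *const argv[])
{
    int total = 0;
    for (int i = 0; i < argc; i++)
        total += (int)strlen(argv[i]);
    return total;
}

/* Stand-in for the set of modules found in the plugin directory. */
static const struct module example_modules[] = {
    { "count_args",   count_args },
    { "total_length", total_length },
};

/* Look up a module by name and call it; returns -1 if no module matches. */
int call_module(const struct module *mods, size_t n, const char *name,
                int argc, const char *const argv[])
{
    assert(mods != NULL && name != NULL);

    for (size_t i = 0; i < n; i++)
        if (strcmp(mods[i].name, name) == 0)
            return mods[i].entry(argc, argv);
    return -1;
}
```

Because every module shares one signature, the host never needs per-module headers. The price is that arguments travel as strings and the -1 "not found" result cannot be told apart from a module that returns -1 itself, so a real design would probably report errors separately.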
Have a look at existing plugin architectures and see if there is anything that you can reuse. http://git.dronelabs.com/ethos/about/ is one link I came across while googling glib + plugin. glib itself might make it easier to develop a plugin architecture. Gstreamer uses glib and has a very nice plugin architecture that may give you some ideas.
