How to trap read write system calls? - c

Whenever i attempt to write anything on my pendrive, a write system call is generated. What i want to do is, this write call should be trapped and and the user should be requested to input predecided password( which i can define during coding itself).
Please tell me whether this is possible or not? and if yes than how should i do it?

The windows DDK has an example of hooking the file reads/writes/copies in filesys\minifilter, with both pre and post op callbacks, that should have you set for the kernel side of things. For the gui part you'll need something to do a non-blocking spin till the drives signals an event, you'll probably also want a pipe or mapped memory view to pass data around

EasyHook is supposed to give you the ability to hook kernel functions. I have not tried it, so your mileage may vary. Be sure to hook functions cautiously - you may degrade the performance of your machine to a point where it's unusable. What you want is to interact with the user, meaning that you must put the hooked function on hold, and issue a callback into user space. This is probably not an exercise for mere mortals.
At any rate, good luck!

Related

Ensure that UID/GID check in system call is executed in RCU-critical section

Task
I have a small kernel module I wrote for my RaspBerry Pi 2 which implements an additional system call for generating power consumption metrics. I would like to modify the system call so that it only gets invoked if a special user (such as "root" or user "pi") issues it. Otherwise, the call just skips the bulk of its body and returns success.
Background Work
I've read into the issue at length, and I've found a similar question on SO, but there are numerous problems with it, from my perspective (noted below).
Question
The linked question notes that struct task_struct contains a pointer element to struct cred, as defined in linux/sched.h and linux/cred.h. The latter of the two headers doesn't exist on my system(s), and the former doesn't show any declaration of a pointer to a struct cred element. Does this make sense?
Silly mistake. This is present in its entirety in the kernel headers (ie: /usr/src/linux-headers-$(uname -r)/include/linux/cred.h), I was searching in gcc-build headers in /usr/include/linux.
Even if the above worked, it doesn't mention if I would be getting the the real, effective, or saved UID for the process. Is it even possible to get each of these three values from within the system call?
cred.h already contains all of these.
Is there a safe way in the kernel module to quickly determine which groups the user belongs to without parsing /etc/group?
cred.h already contains all of these.
Update
So, the only valid question remaining is the following:
Note, that iterating through processes and reading process's
credentials should be done under RCU-critical section.
... how do I ensure my check is run in this critical section? Are there any working examples of how to accomplish this? I've found some existing kernel documentation that instructs readers to wrap the relevant code with rcu_read_lock() and rcu_read_unlock(). Do I just need to wrap an read operations against the struct cred and/or struct task_struct data structures?
First, adding a new system call is rarely the right way to do things. It's best to do things via the existing mechanisms because you'll benefit from already-existing tools on both sides: existing utility functions in the kernel, existing libc and high-level language support in userland. Files are a central concept in Linux (like other Unix systems) and most data is exchanged via files, either device files or special filesystems such as proc and sysfs.
I would like to modify the system call so that it only gets invoked if a special user (such as "root" or user "pi") issues it.
You can't do this in the kernel. Not only is it wrong from a design point of view, but it isn't even possible. The kernel knows nothing about user names. The only knowledge about users in the kernel in that some privileged actions are reserved to user 0 in the root namespace (don't forget that last part! And if that's new to you it's a sign that you shouldn't be doing advanced things like adding system calls). (Many actions actually look for a capability rather than being root.)
What you want to use is sysfs. Read the kernel documentation and look for non-ancient online tutorials or existing kernel code (code that uses sysfs is typically pretty clean nowadays). With sysfs, you expose information through files under /sys. Access control is up to userland — have a sane default in the kernel and do things like calling chgrp, chmod or setfacl in the boot scripts. That's one of the many wheels that you don't need to reinvent on the user side when using the existing mechanisms.
The sysfs show method automatically takes a lock around the file, so only one kernel thread can be executing it at a time. That's one of the many wheels that you don't need to reinvent on the kernel side when using the existing mechanisms.
The linked question concerns a fundamentally different issue. To quote:
Please note that the uid that I want to get is NOT of the current process.
Clearly, a thread which is not the currently executing thread can in principle exit at any point or change credentials. Measures need to be taken to ensure the stability of whatever we are fiddling with. RCU is often the right answer. The answer provided there is somewhat wrong in the sense that there are other ways as well.
Meanwhile, if you want to operate on the thread executing the very code, you can know it wont exit (because it is executing your code as opposed to an exit path). A question arises what about the stability of credentials -- good news, they are also guaranteed to be there and can be accessed with no preparation whatsoever. This can be easily verified by checking the code doing credential switching.
We are left with the question what primitives can be used to do the access. To that end one can use make_kuid, uid_eq and similar primitives.
The real question is why is this a syscall as opposed to just a /proc file.
See this blogpost for somewhat elaborated description of credential handling: http://codingtragedy.blogspot.com/2015/04/weird-stuff-thread-credentials-in-linux.html

Programmatically Detect Context Switch via Assembly

I am aware that one cannot listen for, detect, and perform some action upon encountering context switches on Windows machines via managed languages such as C#, Java, etc. However, I was wondering if there was a way of doing this using assembly (or some other language, perhaps C)? If so, could you provide a small code snippet that gives an idea of how to do this (as I am relatively new to kernel programming)?
What this code will essentially be designed to do is run in the background on a standard Windows UI and listen for when a particular process is either context switched in or out of the CPU. Upon hearing either of these actions, it will send a signal. To clarify, I am looking to detect only the context switches directly involving a specific process, not any context switches. What I ultimately would like to achieve is to be able to notify another machine (via the internet signal) whenever a specific process begins making use of the CPU, as well as when it ceases doing so.
My first attempt at doing this involved simply calculating the CPU usage percentage of the specific process, but this ultimately proved to be too course-grained to catch the most minute calculations. For example, I wrote a test program that simply performed the operation 2+2 and placed the answer inside of an int. The CPU usage method did not pick up on this. Thus, I am looking for something lower level, hence the origin of this question. If there are potential alternatives, I would be more than happy to field them.
There's Event Tracing for Windows (ETW), which you can configure to receive messages about a variety of events occurring in the system.
You should be able to receive messages about thread scheduling events. The CSwitch class of events is for that.
Sorry, I don't know any good ETW samples that you could easily reuse for your task. Read MSDN and look around.
Simon pointed out a good link explaining why ETW can be useful. Very enlightening: http://randomascii.wordpress.com/2012/05/11/the-lost-xperf-documentationcpu-scheduling/
Please see the edits below. In particular #3, ETW appears to be the way to go.
In theory you could install your own trap handler for the old int 2Eh and the new sysenter. However, in practice this isn't going to be as easy anymore as it used to be because of Patchguard (since Vista) and signing requirements. I'm not aware of any other generic means to detect context switches, meaning you'd have to roll your own. All context switches of the OS go through call gates (the aforementioned trap handlers) and ReactOS allows you to peek behind the scenes if you feel uncomfortable with debugging/disassembling.
However, in either case there shouldn't be a generic way to install something like this without kernel mode privileges (usually referred to as ring 0) - anything else would be a security flaw in Windows. I'm not aware of a Windows-supplied method to achieve what you want either.
The book "Undocumented Windows NT" has a pretty good chapter about the exact topic (although obviously targeted at the old int 2Eh method).
If you can live with hooking only certain functions, you may be able to get away with some filter driver(s) or user-mode API hooking. Depends on your exact requirements.
Update: reading your updated question, I think you need to read up on the internals, in particular on the concept of IRQLs (not to be confused with IRQs from DOS times) and the scheduler. The problem is that there can - and usually will - be literally hundreds of context switches every second. However, your watcher process (the one watching for context switches) will, like any user-mode process be preemptable. This means that there is no way for you to achieve real-time signaling or anything close to it, which puts a big question mark on the method.
What is it actually that you want to achieve? The number of context switches doesn't really give you anything. Every single SEH exception will cause a context switch. What is it that you are interested in? Perhaps performance counters cater your needs better?
Update 2: the sheer amount of context switches even for a single thread will be flabbergasting within a single second. So assuming you'd install your own trap handler, you'd still end up (adversely) affecting all other threads on the system (after all you'd catch every context switch and then see whether it's the process/threads you care about and then do your thing or pass it on).
If you could tell us what you ultimately want to achieve, not with the means already pre-defined, we may be able to suggest alternatives.
Update 3: so apparently I was wrong in one respect here. Windows comes with something on board that signals context switches. And ETW can be harnessed to tap into those. Thanks to Simon for pointing out.

Is this is a good way to intercept system calls?

I am writing a tool. A part of that tool will be its ability to log the parameters of the system calls. Alright I can use ptrace for that purpose, but ptrace is pretty slow. A faster method that came to my mind was to modify the glibc. But this is getting difficult, as gcc magically inserts its own built in functions as system call wrappers than using the code defined in glibc. Using -fno-builtin is also not helping there.
So I came up with this idea of writing a shared library, which includes every system call wrapper, such as mmap and then perform the logging before calling the actual system call wrapper function. For example pseudo code of what my mmap would look like is given below.
int mmap(...)
{
log_parameters(...);
call_original_mmap(...);
...
}
Then I can use LD_PRELOAD to load this library firstup. Do you think this idea will work, or am I missing something?
No method that you can possibly dream up in user-space will work seamlessly with any application. Fortunately for you, there is already support for doing exactly what you want to do in the kernel. Kprobes and Kretprobes allow you to examine the state of the machine just preceeding and following a system call.
Documentation here: https://www.kernel.org/doc/Documentation/kprobes.txt
As others have mentioned, if the binary is statically linked, the dynamic linker will skip over any attempts to intercept functions using libdl. Instead, you should consider launching the process yourself and detouring the entry point to the function you wish to intercept.
This means launching the process yourself, intercepting it's execution, and rewriting it's memory to place a jump instruction at the beginning of a function's definition in memory to a new function that you control.
If you want to intercept the actual system calls and can't use ptrace, you will either have to find the execution site for each system call and rewrite it, or you may need to overwrite the system call table in memory and filtering out everything except the process you want to control.
All system calls from user-space goes through a interrupt handler to switch to kernel mode, if you find this handler you probably can add something there.
EDIT I found this http://cateee.net/lkddb/web-lkddb/AUDITSYSCALL.html. Linux kernels: 2.6.6–2.6.39, 3.0–3.4 have support for system call auditing. This is a kernel module that has to be enabled. Maybe you can look at the source for this module if it's not to confusing.
If the code you are developing is process-related, sometimes you can develop alternative implementations without breaking the existing code. This is helpful if you are rewriting an important system call and would like a fully functional system with which to debug it.
For your case, you are rewriting the mmap() algorithm to take advantage of an exciting new feature(or enhancing with new feature). Unless you get everything right on the first try, it would not be easy to debug the system: A nonfunctioning mmap() system call is certain to result in a nonfunctioning system. As always, there is hope.
Often, it is safe to keep the remaining algorithm in place and construct your replacement on the side. You can achieve this by using the user id (UID) as a conditional with which to decide which algorithm to use:
if (current->uid != 7777) {
/* old algorithm .. */
} else {
/* new algorithm .. */
}
All users except UID 7777 will use the old algorithm. You can create a special user, with UID 7777, for testing the new algorithm. This makes it much easier to test critical process-related code.

C function that doesn't 'wait' for any input, but detects if any?

Is there a C function that doesn't wait for input but if there is one, it detects it?
What I'm trying to do here is continue a loop endlessly until any key is pressed.
I'm a newbie, and all the input functions I've learned so far waits for the user to input something..
I hope I'm clear, although if I'm not I'm happy to post the code..
WIndows kbhit( ) does exactly this non-blocking keyboard char-ready check, and there's a kbhit( ) for Linux over here
Since nobody's stated it clearly....
The important thing to note is that the standard library provided by C does not provide the capability you're looking for. Achieving it, then, requires the use of third party libraries and/or special knowledge about the operating system you're using.
Typically, you'll have some of those third-party libraries available. If you were using Visual Studio, for example, you would be able to use http://msdn.microsoft.com/en-us/library/58w7c94c(v=VS.100).aspx. I'm not sure what's available to you with your setup.
you should use select or poll
You might also want to check signal() if all you need is a way to stop the loop and run your end of the program function.
It depends what you exactly want to do, but in general:
A) You keep your program single-threaded and check input through a non-blocking input read.
B) You spawn a different thread that will handle the input and communicate the results back to the main thread.

Is there a better way than parsing /proc/self/maps to figure out memory protection?

On Linux (or Solaris) is there a better way than hand parsing /proc/self/maps repeatedly to figure out whether or not you can read, write or execute whatever is stored at one or more addresses in memory?
For instance, in Windows you have VirtualQuery.
In Linux, I can mprotect to change those values, but I can't read them back.
Furthermore, is there any way to know when those permissions change (e.g. when someone uses mmap on a file behind my back) other than doing something terribly invasive and using ptrace on all threads in the process and intercepting any attempt to make a syscall that could affect the memory map?
Update:
Unfortunately, I'm using this inside of a JIT that has very little information about the code it is executing to get an approximation of what is constant. Yes, I realize I could have a constant map of mutable data, like the vsyscall page used by Linux. I can safely fall back on an assumption that anything that isn't included in the initial parse is mutable and dangerous, but I'm not entirely happy with that option.
Right now what I do is I read /proc/self/maps and build a structure I can binary search through for a given address's protection. Any time I need to know something about a page that isn't in my structure I reread /proc/self/maps assuming it has been added in the meantime or I'd be about to segfault anyways.
It just seems that parsing text to get at this information and not knowing when it changes is awfully crufty. (/dev/inotify doesn't work on pretty much anything in /proc)
I do not know an equivalent of VirtualQuery on Linux. But some other ways to do it which may or may not work are:
you setup a signal handler trapping SIGBUS/SIGSEGV and go ahead with your read or write. If the memory is protected, your signal trapping code will be called. If not your signal trapping code is not called. Either way you win.
you could track each time you call mprotect and build a corresponding data structure which helps you in knowing if a region is read or write protected. This is good if you have access to all the code which uses mprotect.
you can monitor all the mprotect calls in your process by linking your code with a library redefining the function mprotect. You can then build the necessary data structure for knowing if a region is read or write protected and then call the system mprotect for really setting the protection.
you may try to use /dev/inotify and monitor the file /proc/self/maps for any change. I guess this one does not work, but should be worth the try.
There sorta is/was /proc/[pid|self]/pagemap, documentation in the kernel, caveats here:
https://lkml.org/lkml/2015/7/14/477
So it isn't completely harmless...

Resources