Detect when memory address is being written to - c

I have an array of bytes that is used as an emulated system RAM. I want to make a bullet-proof patch for a given cell, that detects when it's being written to, and overwrites it instantly. Using a loop like
    for (;;) {
        address = x;
        sleep(y);
    }
has a flaw that there's a minimum possible value for sleep, which appears to be nearly identical to the emulated frame length, so it'd only patch the address once per frame. So, if it's written to 100 times per frame by a game, such a patch will make little sense.
I have some hooks on writing, but those only catch writes by reading the game's code being executed, while I want to make such patches work for any memory region, not just RAM, hence I can't rely on interpreting the emulated code too much (it simply doesn't match for all regions I want to patch).
So I need some kind of programmatic watchpoint: given a pointer to the array and the byte I want to watch, detect when that byte changes.

Although C is not an object-oriented language, I would use an object-oriented approach here:
Wrap the emulated memory up in an opaque pointer that can only be read and written to with a specific set of functions (e.g. memory_write_byte and memory_read_byte).
Make the memory object maintain a list of function pointers that point to callback functions for handling write events. Whenever a write happens, make it call all those callbacks.
The part of the code that wants to monitor that spot in memory can register a callback with the memory object, and whenever the callback gets called it can modify the memory if needed.
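A minimal sketch of that design might look as follows. All names here (memory_t, memory_add_write_hook, memory_poke_byte, patch_hook, MAX_WRITE_HOOKS) are made up for illustration, and in a real project the struct definition would live in the .c file so that the type stays opaque to callers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_WRITE_HOOKS 8   /* arbitrary limit chosen for this sketch */

typedef struct memory memory_t;
typedef void (*write_hook_fn)(memory_t *mem, size_t addr, uint8_t value, void *ctx);

struct memory {
    uint8_t *bytes;
    size_t size;
    write_hook_fn hooks[MAX_WRITE_HOOKS];
    void *hook_ctx[MAX_WRITE_HOOKS];
    size_t hook_count;
};

memory_t *memory_create(size_t size) {
    memory_t *mem = calloc(1, sizeof *mem);
    if (!mem) return NULL;
    mem->bytes = calloc(size, 1);
    if (!mem->bytes) { free(mem); return NULL; }
    mem->size = size;
    return mem;
}

void memory_destroy(memory_t *mem) {
    if (mem) { free(mem->bytes); free(mem); }
}

/* Register a callback that runs after every write. Returns 0 on success. */
int memory_add_write_hook(memory_t *mem, write_hook_fn fn, void *ctx) {
    if (mem->hook_count == MAX_WRITE_HOOKS) return -1;
    mem->hooks[mem->hook_count] = fn;
    mem->hook_ctx[mem->hook_count] = ctx;
    mem->hook_count++;
    return 0;
}

uint8_t memory_read_byte(const memory_t *mem, size_t addr) {
    return addr < mem->size ? mem->bytes[addr] : 0;
}

/* Raw store that does not fire hooks, so a hook can patch without recursing. */
void memory_poke_byte(memory_t *mem, size_t addr, uint8_t value) {
    if (addr < mem->size) mem->bytes[addr] = value;
}

/* Normal store: perform the write, then notify every registered hook. */
void memory_write_byte(memory_t *mem, size_t addr, uint8_t value) {
    if (addr >= mem->size) return;
    mem->bytes[addr] = value;
    for (size_t i = 0; i < mem->hook_count; i++)
        mem->hooks[i](mem, addr, value, mem->hook_ctx[i]);
}

/* Example hook: pin one cell to a fixed value whenever it is written. */
typedef struct { size_t addr; uint8_t value; } patch_t;

void patch_hook(memory_t *mem, size_t addr, uint8_t value, void *ctx) {
    const patch_t *patch = ctx;
    (void)value;
    if (addr == patch->addr)
        memory_poke_byte(mem, patch->addr, patch->value);
}
```

With a patch_t of { 3, 0x7f } registered through memory_add_write_hook, any memory_write_byte to address 3 is overwritten immediately, no matter how many times per frame it happens. This only works if every writer goes through memory_write_byte.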

I'd look into shared memory à la mmap. Using mmap, the same page can be shared by two processes, and one of the processes can map it read-only.
When a write to the read-only mapping occurs, a SIGSEGV is generated, which you can catch and then act on. This is UNIX terminology, but you can do the same thing on Windows; it is just slightly more involved.
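As a rough single-process illustration of the same idea, the sketch below uses mprotect instead of a second read-only mapping, which is a substitution on my part and not what the answer literally describes. The names watch_create, watch_arm, write_faults and last_fault_addr are hypothetical. It assumes Linux, where calling mprotect from a signal handler works in practice even though POSIX does not list it as async-signal-safe.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static uint8_t *watched_page;
static size_t watched_len;
static volatile sig_atomic_t write_faults;   /* number of trapped writes */
static void *volatile last_fault_addr;       /* address of the last trapped write */

static void on_segv(int sig, siginfo_t *info, void *uctx) {
    (void)sig; (void)uctx;
    uint8_t *addr = (uint8_t *)info->si_addr;
    if (addr < watched_page || addr >= watched_page + watched_len) {
        /* Not our page: restore the default action so the retried
           instruction faults again and terminates the process. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    write_faults++;
    last_fault_addr = addr;
    /* Assumption: mprotect is usable here on Linux. Once the page is
       writable again, the faulting instruction is retried and succeeds. */
    mprotect(watched_page, watched_len, PROT_READ | PROT_WRITE);
}

/* Allocate one page and install the SIGSEGV handler. NULL on failure. */
uint8_t *watch_create(void) {
    long page_size = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;

    struct sigaction sa;
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        munmap(p, (size_t)page_size);
        return NULL;
    }
    watched_page = p;
    watched_len = (size_t)page_size;
    return watched_page;
}

/* Make the page read-only so that the next write traps. 0 on success. */
int watch_arm(void) {
    return mprotect(watched_page, watched_len, PROT_READ);
}
```

The handler fires before the write lands, so to patch the value afterwards the page has to be re-armed once the write has completed. The granularity is a whole page, which makes this expensive if the watched cell shares a page with frequently written data.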

Related

copy_from_user and segmentation

I was reading "The Linux Kernel Module Programming Guide" and I have a couple of doubts about the following paragraph.
The reason for copy_from_user or get_user is that Linux memory (on
Intel architecture, it may be different under some other processors)
is segmented. This means that a pointer, by itself, does not reference
a unique location in memory, only a location in a memory segment, and
you need to know which memory segment it is to be able to use it.
There is one memory segment for the kernel, and one for each of the
processes.
However it is my understanding that Linux uses paging instead of segmentation and that virtual addresses at and above 0xc0000000 have the kernel mapping in.
1. Do we use copy_from_user in order to accommodate older kernels?
2. Do the current Linux kernels use segmentation in any way at all? If so, how?
3. If (1) is not true, are there any other advantages to using copy_from_user?
Yeah. I don't like that explanation either. The details are essentially correct in a technical sense (see also Why does Linux on x86 use different segments for user processes and the kernel?) but, as you say, Linux typically maps the memory so that kernel code could access it directly, so I don't think it's a good explanation for why copy_from_user, etc. actually exist.
IMO, the primary reason for using copy_from_user / copy_to_user (and friends) is simply that there are a number of things to be checked (dangers to be guarded against), and it makes sense to put all of those checks in one place. You wouldn't want every place that needs to copy data in and out from user-space to have to re-implement all those checks. Especially when the details may vary from one architecture to the next.
For example, it's possible that a user-space page is actually not present when you need to copy to or from that memory and hence it's important that the call be made from a context that can accommodate a page fault (and hence being put to sleep).
Also, user-space data pointers need to be checked carefully to ensure that they actually point to user-space and that they point to data regions, and that the copy length doesn't wrap beyond the end of the valid regions, and so forth.
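The wrap-around concern can be illustrated with a small user-space model. This is not the kernel's actual check; user_range_ok and USER_SPACE_LIMIT are hypothetical names, and the limit simply reuses the 0xc0000000 split mentioned in the question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed user/kernel boundary, as in the classic 32-bit x86 3G/1G split. */
#define USER_SPACE_LIMIT UINT64_C(0xC0000000)

/* True if [addr, addr + len) lies entirely below USER_SPACE_LIMIT.
   Written so that addr + len is never computed and therefore cannot wrap. */
bool user_range_ok(uint64_t addr, uint64_t len) {
    if (len > USER_SPACE_LIMIT)
        return false;
    return addr <= USER_SPACE_LIMIT - len;
}
```

A naive test such as addr + len <= USER_SPACE_LIMIT would accept a huge addr whose sum wraps around to a small number, which is exactly the kind of trick the centralized check is there to catch.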
Finally, it's possible that user-space actually doesn't share the same page mappings with the kernel. There used to be a Linux patch for 32-bit x86 that made the complete 4G of virtual address space available to user-space processes. In that case, kernel code could not assume that a user-space pointer was directly accessible, and those functions might need to map individual user-space pages one at a time in order to access them. (See 4GB/4GB Kernel VM Split)

The implementation of copy_from_user()

I am just wondering why copy_from_user(to, from, bytes) does a real copy. Since it only needs to let the kernel access user-space data, couldn't it map the physical address directly into the kernel's address space without moving the data?
Thanks,
copy_from_user() is usually used when writing certain device drivers. Note that there is no "mapping" of bytes here; the only thing that happens is that bytes are copied from a virtual location mapped in user-space to a location in kernel-space. This is done to enforce separation of kernel and user and to prevent security flaws -- you never want the kernel to start accessing and reading arbitrary user memory locations or vice-versa. That is why arguments and results from syscalls are copied to/from the user before they actually run.
First, it helps to know why copy_from_user() is used.
The kernel must never blindly trust a pointer handed to it by a user-space application: if the memory pointed to is invalid, or a fault occurs while reading it, an ordinary user-space application could cause the kernel to panic.
And that's why!
With copy_from_user, a bad pointer only produces an error for the user and does not affect the kernel's functionality.
Even though it takes extra effort, it ensures the safe and secure operation of the kernel.
copy_from_user() does a few checks before it starts copying data. Directly manipulating data from user-space is never a good idea because it exists in a virtual address space which might get swapped out.
http://www.ibm.com/developerworks/linux/library/l-kernel-memory-access/
One of the major requirements in a system call implementation is to check the validity of any user pointer passed as an argument. The kernel should not blindly follow a user pointer, because the user can play tricks with it in many ways. The major concerns are:
1. It should be a pointer into that process's address space, so that it can't reach into some other process's address space.
2. It should be a pointer into user space; it must not trick the kernel into operating on a kernel-space pointer.
3. It should not bypass memory access restrictions.
That is why copy_from_user() is performed. It can block: the process sleeps until the page fault handler has brought the page from swap into physical memory.

Reading the start address and length (virtual memory map) of a process

As started here
I need to know how to read the start address and length (virtual memory map) of a process.
I would like to map a process memory. I would like to read values of a process memory and write values to them.
I'm curious about how programs like Cheat-O'matic (cheat-o-matic.softonic.com.br) work. First thing I thought was that the process would be loaded in a contiguous memory location. But that seems not right.
Call VirtualQueryEx repeatedly, starting at address zero and advancing each time by the value returned in the RegionSize member of the MEMORY_BASIC_INFORMATION structure you passed to it. To obtain a meaningful map, the process should obviously be paused.
Still, even after you got this memory map, I'm not sure what you can do with it: unless you know (by other means) the internals of the process you are accessing all you get to know is locations where you can read or write without triggering an access violation, not the meaning of their content. You should really clarify what you are trying to achieve, Read/WriteProcessMemory usually aren't a solution for "normal" problems.

read() system call does a copy of data instead of passing the reference

The read() system call causes the kernel to copy the data instead of passing the buffer by reference. I was asked the reason for this in an interview. The best I could come up with were:
To avoid concurrent writes on the same buffer across multiple processes.
If the user-level process tries to access a buffer mapped to kernel virtual memory area it will result in a segfault.
As it turns out the interviewer was not entirely satisfied with either of these answers. I would greatly appreciate if anybody could elaborate on the above.
A zero copy implementation would mean the user level process would have to be given access to the buffers used internally by the kernel/driver for reading. The user would have to make an explicit call to the kernel to free the buffer after they were done with it.
Depending on the type of device being read from, the buffers could be more than just an area of memory. (For example, some devices could require the buffers to be in a specific area of memory. Or they could only support writing to a fixed area of memory that is given to them at startup.) In this case, failure of the user program to "free" those buffers (so that the device could write more data to them) could cause the device and/or its driver to stop functioning properly, something a user program should never be able to do.
The buffer is specified by the caller, so the only way to get the data there is to copy them. And the API is defined the way it is for historical reasons.
Note that your two points above are no problem for the alternative, mmap, which does pass the buffer by reference (and writing to it then writes to the file, so you can't process the data in place, while many users of read do just that).
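The difference can be seen in a small sketch, assuming a POSIX system and a regular file opened read/write. load_by_copy and load_by_reference are hypothetical helper names.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* read(): the kernel copies file data into the caller's own buffer.
   Modifying buf afterwards has no effect on the file. */
ssize_t load_by_copy(int fd, void *buf, size_t len) {
    return read(fd, buf, len);
}

/* mmap() with MAP_SHARED: the caller gets a pointer to pages backed by
   the file itself. Stores through the pointer end up in the file.
   Returns NULL on failure. */
void *load_by_reference(int fd, size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Code that scribbles over its input while parsing is safe with load_by_copy, but with load_by_reference the same code would modify the file. Mapping with MAP_PRIVATE instead gives copy-on-write pages, which avoids that at the cost of a copy for every page that is touched.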
I might have been prepared to dispute the interviewer's assertion. The buffer in a read() call is supplied by the user process and therefore comes from the user address space. It's also not guaranteed to be aligned in any particular way with respect to page frames. That makes it tricky to do what is necessary to perform I/O directly into the buffer, i.e. map the buffer into the device driver's address space or wire it for DMA. However, in limited circumstances, this may be possible.
I seem to remember the BSD subsystem used by Mac OS X used to copy data between address spaces had an optimisation in this respect, although I may be completely mistaken.

Is there a way to pre-emptively avoid a segfault?

Here's the situation:
I'm analysing a program's interaction with a driver by using an LD_PRELOADed module that hooks the ioctl() system call. The system I'm working with (embedded Linux 2.6.18 kernel) luckily has the length of the data encoded into the 'request' parameter, so I can happily dump the ioctl data with the right length.
However quite a lot of this data has pointers to other structures, and I don't know the length of these (this is what I'm investigating, after all). So I'm scanning the data for pointers, and dumping the data at that position. I'm worried this could leave my code open to segfaults if the pointer is close to a segment boundary (and my early testing seems to show this is the case).
So I was wondering what I can do to pre-emptively check whether the current process owns a particular offset before trying to dereference? Is this even possible?
Edit: Just an update as I forgot to mention something that could be very important, the target system is MIPS based, although I'm also testing my module on my x86 machine.
Open a file descriptor to /dev/null and try write(null_fd, ptr, size). If it returns -1 with errno set to EFAULT, the memory is invalid. If it returns size, the memory is safe to read. There may be a more elegant way to query memory validity/permissions with some POSIX invention, but this is the classic simple way.
If your embedded Linux has the /proc/ filesystem mounted, you can parse the /proc/self/maps file and validate the pointers/offsets against that. The maps file contains the memory mappings of the process, see here
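A minimal sketch of such a check, assuming the usual Linux maps line format that begins with "start-end perms". The function name range_is_mapped_readable is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if [ptr, ptr + len) lies inside a single readable mapping
   listed in /proc/self/maps, 0 if it does not, and -1 if the file cannot
   be opened. Simplification: a range that spans two adjacent mappings is
   reported as 0 even if both are readable. */
int range_is_mapped_readable(const void *ptr, size_t len) {
    uintptr_t start = (uintptr_t)ptr;
    uintptr_t end = start + len;
    if (end < start) return 0;              /* address range wrapped around */

    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) return -1;

    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, maps)) {
        unsigned long lo, hi;
        char perms[5];
        /* Each line starts like: 00400000-0040b000 r-xp ... */
        if (sscanf(line, "%lx-%lx %4s", &lo, &hi, perms) != 3)
            continue;
        if (start >= lo && end <= hi) {
            found = (perms[0] == 'r');
            break;
        }
    }
    fclose(maps);
    return found;
}
```

Re-reading the file for every pointer is slow. For a dump tool that checks many pointers, it may be worth parsing the file once per ioctl call and caching the ranges.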
I know of no such possibility. But you may be able to achieve something similar. As man 7 signal mentions, SIGSEGV can be caught. Thus, I think you could
1. Start with dereferencing a byte sequence known to be a pointer
2. Access one byte after the other, at some time triggering SIGSEGV
3. In SIGSEGV's handler, mark a variable that is checked in the loop of step 2
4. Quit the loop, this page is done.
There are several problems with that.
Since several buffers may live in the same page, you might output what you think is one buffer but is, in reality, several. You may be able to help with that by also LD_PRELOADing electric fence, which would, AFAIK, cause the application to allocate a whole page for every dynamically allocated buffer. So you would not output several buffers thinking they are only one, but you still don't know where the buffer ends and would output a lot of garbage at the end. Also, stack-based buffers can't be helped by this method.
You don't know where the buffers end.
Untested.
Can't you just check for the segment boundaries? (I'm guessing by segment boundaries you mean page boundaries?)
If so, page boundaries are well delimited (commonly 4K or 8K, depending on the architecture), so simple masking of the address should deal with it.
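For instance, the masking could look like the sketch below. It queries the page size at run time instead of hard-coding 4K or 8K, and it assumes the page size is a power of two. bytes_to_page_end is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Number of bytes from ptr up to (but not including) the next page
   boundary. Reading at most this many bytes never crosses into the
   following page. */
size_t bytes_to_page_end(const void *ptr) {
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t offset = (uintptr_t)ptr & (page_size - 1);   /* offset within the page */
    return (size_t)(page_size - offset);
}
```

This only tells you how far you can read within the current page. Whether the next page is mapped still has to be checked by some other means.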

Resources