How to read user process addresses from kernel space? - c

I am working on QNX. At some point in kernel space, when one process wants to send a message to another process and both processes are blocked, I can get the values of the stack pointer and frame pointer for each process.
Next, I want to access the stack of each process, but my problem is that these values (sp and fp) are virtual addresses which are valid only within the user processes. How can I read words at these user addresses from kernel space?

Unless you are a kernel developer employed by QNX, your code never runs in "kernel space." Only the kernel and process 1 (which QNX calls "proc" and which pidin displays as procnto or procnto-instr) run in kernel space, and you are not able to modify either of them.
If you want to debug the processes, you can connect to them using gdb and inspect the contents of their memory. You can do this without knowing the physical address of the memory pointed to by the virtual sp.
If you want to read memory from another program then you can do:
int fd = open("/proc/PID/as", O_RDONLY);
if (fd != -1) {
    lseek(fd, virtual_address_to_read, SEEK_SET);
    read(fd, buffer, cnt_bytes_to_read);
    close(fd);
}
QNX documents this at the following location:
http://www.qnx.com/developers/docs/6.5.0_sp1/index.jsp?topic=%2Fcom.qnx.doc.neutrino_prog%2Fprocess.html&cp=13_7_3_4_1&anchor=Address_space

Related

Could a custom syscall access another process' memory?

For educational purposes, I've managed to create a custom syscall that just prints a message in the kernel's log.
What I was thinking about now is creating a "cross-process memcpy" syscall that receives another process's PID, a memory address in that process's memory space, a length, and a pointer into the current process's memory space, and that copies memory from the other process to the current one.
My idea would be to write a program that asks the user for a string and then prints its PID, the address of the variable in which the string is stored, and its length. Then I'd write another program that asks for that PID, address and length, and uses my custom syscall to copy that data from the other process into this one.
In theory, I understand that the kernel should be able to access everything, including another process's memory. But in practice I've found that the copy_from_user and copy_to_user functions, which copy memory between userspace and kernelspace, don't receive a PID or any other process identifier. So the syscall apparently has implicit context information about the calling process, and I don't know whether there is any limitation or API that prevents/allows accessing another process's memory space from a syscall.
Does the Linux kernel have any API to access another process's memory, given its PID and a memory address?
Does the Linux kernel have any API to access another process's memory, given its PID and a memory address?
Yes, get_user_pages.
Note that the other process is not mapped into the address space of the caller. get_user_pages obtains the underlying pages.
We can use get_user_pages to get a reference on a range of pages which covers the requested area to be read or written. Then carefully copy the data into and out of those pages such that we only touch the requested area.
The /proc/<pid>/mem mechanism might be based on get_user_pages; in any case, it's worth taking a look to see how it works.
Also look at the ptrace system call and its PTRACE_PEEKDATA and PTRACE_POKEDATA operations. You may be able to solve your problem using ptrace or else crib something from its implementation.
Introducing a system call to access memory is probably a bad idea. You have to make sure it's securely coded and that it checks the credentials of the caller, otherwise you can open up a huge security hole.

Is there a system call in linux to reserve virtual address space (not memory, just address space)

I have a user space "platform" forking different processes. All these processes start by executing a platform plat_init() function and then run some other application code (which is not mine, i.e. I cannot change it).
At some point in time some of these processes may do a plat_shared_mem_alloc() to allocate shared memory. This function returns a handle H (one handle per shared memory block allocated at each call). Another function, plat_get_shared_memory_address(H) can be called to retrieve the address (in the process virtual space) of H.
H can be sent to other linux processes (using usual IPC).
I would like any call to plat_get_shared_memory_address(H) made by any processes to return the same address (for the same handle H).
In other words, I want to map the shared physical memory to the same virtual address in all processes using it, regardless on when the mapping is done.
I thought the plat_shared_mem_alloc() could call shm_open() to create a "file" in the file system, and I am aware mmap() has the MAP_FIXED flag to force the virtual address of the mapping.
But nothing guarantees that if a process P1 maps a shared memory handle at address A, the same address A is/will be available in another process P2's address space. Maybe P2's application code has already mapped something at address A before calling plat_get_shared_memory_address(H), and mmap() will fail.
So I am thinking of blocking some address space in each process during the plat_init() call (which I know runs first), and using some of that address space when needed for mmap().
In other words, is there a system call to block (reserve) some virtual address space of a process (without allocating any memory at that time), so I could later, if needed, map things at the same address in my different processes?

how to unmap the memory which is mapped using remap_pfn_range()

There are multiple approaches to map kernel memory to user space.
Some say use splice(), mmap(), etc.
I am calling mmap() with a descriptor of our own pseudo char device file like '/dev/mem'.
When mmap() is called on our own pseudo char device file, the mmap handler registered in file_operations is invoked, and from it we call remap_pfn_range() to map the memory.
Now that process might get terminated/killed or make a clean exit().
How are those mappings removed from kernel space? I am working on ARMv7-A.
Can anyone explain what happens to these memory mappings when the process is killed or terminates? Does the kernel remove the mappings by itself, or do we need to unmap explicitly?
You can handle it via the release file operation, which is invoked when the last copy of the file is closed.
struct file_operations {
    ...
    int (*release) (struct inode *, struct file *);
    ...
};
From LDD
This operation is invoked when the file structure is being released.
Like open, release can be missing. Note that release isn't invoked
every time a process calls close. Whenever a file structure is shared
(for example, after a fork or a dup), release won't be invoked until
all copies are closed. If you need to flush pending data when any copy
all copies are closed. If you need to flush pending data when any copy
is closed, you should implement the flush method.
mmap maps an external (to the process) memory space into the virtual address space of the process calling it. The memory may be a shared memory segment, a file... Unlike the physical memory segment it maps to, mmap just creates a "link" to that segment and returns an address that can be seen and used from the calling process.
When the mmap calling process terminates (naturally, killed..) the mappings it created are automatically unmapped.
The physical memory region that was mapped, however, which may be in use by other processes (or not), remains available.
man mmap
You may close the mapping from the program before it exits:
int munmap(void *addr, size_t length);

scan memory of calling process

I have to scan the memory space of a calling process in C. This is for homework. My problem is that I don't fully understand virtual memory addressing.
I'm scanning the memory space by attempting to read from and write to a memory address. I cannot use proc files or any other such method.
So my problem is setting the pointers.
From what I understand the "User Mode Space" begins at address 0x0, however, if I set my starting point to 0x0 for my function, then am I not scanning the address space for my current process? How would you recommend adjusting the pointer -- if at all -- to address the parent process address space?
edit: Sorry for the confusion, and I appreciate the help. We cannot use the proc file system, because the assignment is intended to teach us about signals.
So, basically I'm going to be trying to read and then write to an address in each page of memory to test if it is R, RW or not accessible. To see if I was successful I will be listening for certain signals -- I'm not sure how to go about that part yet. I will be creating a linked list of structure to represent the accessibility of the memory. The program will be compiled as a 32 bit program.
With respect to parent process and child process: the exact text states
When called, the function will scan the entire memory area of the calling process...
Perhaps I am mistaken about the child and parent interaction, due to the fact we've been covering this (fork function etc.) in class, so I assumed that my function would be scanning a parent process. I'm going to be asking for clarification from the prof.
So, judging from this picture I'm just going to start from 0x0.
From a userland process's perspective, its address space starts at address 0x0, but not every address in that space is valid or accessible to the process. In particular, address 0x0 itself is never a valid address. If a process attempts to access memory (in its address space) that is not actually assigned to it, a segmentation fault results.
You could actually use the segmentation fault behavior to help you map out what parts of the address space are in fact assigned to the process. Install a signal handler for SIGSEGV, and skip through the whole space, attempting to read something from somewhere in each page. Each time you trap a SIGSEGV you know that page is not mapped for your process. Go back afterward and scan each accessible page.
Do only read, however. Do not attempt to write to random memory, because much of the memory accessible to your programs is the binary code of the program itself and of the shared libraries it uses. Not only do you not want to crash the program, but also much of that memory is probably marked read-only for the process.
EDIT: Generally speaking, a process can only access its own (virtual) address space. As @cmaster observed, however, there is a syscall (ptrace()) that allows some processes access to some other processes' memory in the context of the observed process's address space. This is how general-purpose debuggers usually work.
You could read (from your program) the /proc/self/maps file. Try first the following two commands in a terminal
cat /proc/self/maps
cat /proc/$$/maps
(at least to get a sense of what the address space contains)
Then read proc(5), mmap(2) and of course wikipages about processes, address space, virtual memory, MMU, shared memory, VDSO.
If you want to share memory between two processes, read first shm_overview(7)
If you can't use /proc/ (which is a pity) consider mincore(2)
You could also non-portably try reading from some address (and perhaps rewriting the same value through a volatile int*), catching the SIGSEGV signal (with sigsetjmp(3) and a siglongjmp from the signal handler), and doing that in a dichotomic loop over multiples of 4 Kbytes, from some sane start and end addresses (certainly not from 0, but probably from (void*)0x10000 up to (void*)0xffffffffff600000).
See signal(7).
You could also use the Linux (Gnu libc) specific dladdr(3). Look also into ptrace(2) (which should be often used from some other process).
Also, you could study elf(5) and read your own executable ELF file. Canonically it is /proc/self/exe (a symlink), but you should be able to get it from the argv[0] of your main (perhaps with the convention that your program should be started with its full path name).
Be aware of ASLR and disable it if your teacher permits that.
PS. I cannot figure out what your teacher is expecting from you.
It is a bit more difficult than it seems at first sight. In Linux every process has its own memory space; any arbitrary memory address refers to the memory space of that process only. However, there are mechanisms which allow one process to access memory regions of another process. Certain Linux functions provide this shared memory feature. For example, take a look at
this link, which gives some examples of using shared memory under Linux with shmget, shmctl and other system calls. Also look at the mmap system call, which is used to map a file into a process's memory but can also be used for the purpose of accessing another process's memory.

Working of Open System Call

I am reading about memory mapped files. The source says they are faster than the traditional ways to open and read a file, i.e. the open and read system calls respectively, without describing how the open or read system call actually works.
So here's my question: how does the open system call work?
As far as I know, it will load the file into memory, whereas with a mapped file only the addresses are saved in memory, and when needed the requested page is brought into memory.
I would appreciate clarification of my understanding so far.
EDIT
My previous understanding written above is almost entirely wrong; for the correct explanation refer to the accepted answer by Pawel.
Since you gave no details I'm assuming you are interested in behavior of Unix-like systems.
Actually, the open() system call only creates a file descriptor, which may then be used by either mmap() or read().
Both memory mapped I/O and standard I/O internally access files on disk through the page cache, a buffer in which files are cached in order to reduce the number of I/O operations.
The standard I/O approach (using write() and read()) involves performing a system call which then copies data from (or to, if you are writing) the page cache into a buffer chosen by the application. In addition, non-sequential access requires another system call, lseek(). System calls are expensive, and so is copying data.
When a file is memory mapped usually a memory region in process address space is mapped directly to page cache, so that all reads and writes of already loaded data can be performed without any additional delay (no system calls, no data copying). Only when an application attempts to access file region that is not already loaded a page fault occurs and the kernel loads required data (whole page) from disk.
EDIT:
I see that I also have to explain memory paging. On most modern architectures there is physical memory which is a real piece of hardware and virtual memory which creates address spaces for processes. Kernel decides how addresses in virtual memory are mapped to addresses in physical memory. The smallest unit is a memory page (usually, but not always 4K). It does not have to be 1:1 mapping, for example all virtual memory pages may be mapped to the same physical address.
In memory mapped I/O, part of the application's address space and the kernel's page cache are mapped to the same physical memory region, hence the program is able to access the page cache directly.
Pawel has beautifully explained how reads and writes are performed. Let me address the original question: how does fopen(3) work?
When a user space process encounters fopen() (defined in libc or another user space library), it translates it into the open(2) system call. First, it places the arguments from fopen into architecture-specific registers, along with the open() syscall number. This number tells the kernel which system call the user space program wants to run. After loading these registers, the user space process traps into the kernel (via a software interrupt, traditionally INT 80H on x86) and blocks.
The kernel verifies the arguments provided, checks access permissions etc., and then either returns an error or invokes the actual system call, which is vfs_open() in this case. vfs_open() checks for an available file descriptor in the fd array and allocates a struct file. The ref count of the accessed file is increased and the fd is returned to the user program. That completes the working of open, and of most system calls in general.
open() together with read()/write(), followed by close(), is undoubtedly a much lengthier process than having the file memory mapped through the page cache.
For a lucid explanation of how open and read work on Linux, you can read this. The code snippets are from an older version of the kernel but the theory still holds.
You would still need the open() system call to get a valid file descriptor, which you would then pass to mmap(). As to why mmapped I/O is faster: there is no copying of data between user space and kernel space buffers, which is what happens with the read and write system calls.
