I have recently been studying the vDSO in Linux. I tried to modify the data in the vvar section, but it failed. Here is what I tried.
According to LWN, there are two addresses for vvar:
The first one is a regular kernel-space address whose value is greater than PAGE_OFFSET. If you look at the System.map file, you'll find that this symbol has an address like ffffffff81c76080.
The second address is in a region called the "vvar page". The base address (VVAR_ADDRESS) of this page is defined in the kernel to be at 0xffffffffff5ff000, close to the end of the 64-bit address space. This page is made available read-only to user-space code.
Modifying via the first address: The first address is easy to find by looking up the address of _vdso_data in /boot/System.map-kernel-version. After getting the address, I modify its member (__unused) from a kernel module (so there is no permission issue). Then I check __unused from a user-space program; it is still 0, which means that I failed to modify it from kernel space.
Modifying via the second address: The second address can be found through the auxiliary vector. After getting the address of the vDSO, we can find the vvar section. I pass this address into the kernel module and modify its member __unused. Then an error occurs, reporting a permission problem. (The reason should be that vvar is read-only memory, as shown by cat /proc/pid/maps.)
I think the first approach is close to a solution, but it seems the vvar data at that address is not what gets mapped into each process's vvar section. Any ideas? Thanks in advance.
[EDIT]: The first approach gets closer to a solution. I modified the cr0 bit, which allows the write. Even though the write succeeds, the value still can't be read back from the Linux kernel (by getting vdso_data through __arch_get_k_vdso_data and accessing the member that I previously modified from user space + kernel module).
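For the second approach, the lookup through the auxiliary vector can be sketched roughly as follows. This is only a minimal user-space sketch: vdso_base and looks_like_elf are hypothetical helper names, and it assumes glibc's getauxval() is available. It locates the vDSO image only; where the vvar pages sit relative to it is not something this code determines.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>

/* Returns the user-space address of the vDSO image mapped into the
 * calling process, or NULL if the kernel did not provide one. */
static const void *vdso_base(void)
{
    /* AT_SYSINFO_EHDR carries the address of the vDSO's ELF header. */
    unsigned long addr = getauxval(AT_SYSINFO_EHDR);
    return (const void *)addr;
}

/* Returns 1 if the memory at base starts with the ELF magic bytes. */
static int looks_like_elf(const void *base)
{
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

Checking the ELF magic is just a sanity test that the returned address really points at the vDSO image before any further parsing.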
I am trying to override the kill system call in my module by replacing the pointer stored in sys_call_table with a pointer to a function I implemented.
I used kallsyms_lookup_name() to get the address of the table, and lookup_address() to get the page table entry for its page.
Now my problem is that the kernel protects the page through the R/W flag.
So, given a virtual page address, is there a way to modify the R/W flag?
Do you need this?
https://elixir.bootlin.com/linux/v4.3/source/arch/arm/mm/pageattr.c#L68
At least you can change the memory attributes for your kernel module. You can also change the attributes of kernel text memory.
As far as I have tested, if we try to write to read-only memory in a kernel module, the MMU generates an exception and you will see an "invalid virtual memory" access.
You have to pass a virtual memory address.
The following is an excerpt from my simple driver code.
int vprobe_ioctl(struct file *filep, unsigned int cmd, void *UserInp)
{
    switch (cmd) {
    case IOCTL_GET_MAX_PORTS:
        /* Direct write through a user-space pointer, with no copy_to_user() */
        *(int *)UserInp = TotalPorts;
#if ENABLED_DEBUG
        printk("Available port :%u \n ", TotalPorts);
#endif
        break;
    }
    return 0;
}
I was not aware of the function copy_to_user, which should be used when writing to user-space memory. The code accesses the user address directly. Still, I am not getting any kernel crash on my development system (x86_64 architecture). It works as expected.
But sometimes I see a kernel crash when I insert the .ko file on some other x86_64 machines. So I replaced the direct access with copy_to_user, and it works.
Could anyone please explain:
i) How does direct access to a user address work?
ii) Why am I seeing a kernel crash on some systems while it works well on others? Is there a kernel configuration mismatch between the systems that lets the kernel access the user process's virtual address directly?
Note: All the systems I have used have the same OS and kernel, from the same image generated through kickstart, so there should be no differences.
Thanks in advance.
It would be interesting to see the crash. What follows is an assumption based on my knowledge of how the memory works.
User-space memory is virtual. That means a specific process address X is currently backed by some physical memory page allocated to your process (or by none at all, if the page is not resident). copy_to_user first checks that the given memory really belongs to the process, along with other security checks. Besides that, there are mapping issues.
The kernel has its own address range that needs to be mapped from virtual to physical addresses. The kernel uses the MMU for this (the details differ per architecture). On x86 the kernel runs on the page tables of the current process, so a user virtual address can be dereferenced directly from kernel code as long as the page is mapped and present (there are other issues here; for example, CPUs with SMAP enabled make such direct accesses fault). On other systems this is not always true.
I have to scan the memory space of a calling process in C. This is for homework. My problem is that I don't fully understand virtual memory addressing.
I'm scanning the memory space by attempting to read and write to a memory address. I can not use proc files or any other method.
So my problem is setting the pointers.
From what I understand the "User Mode Space" begins at address 0x0, however, if I set my starting point to 0x0 for my function, then am I not scanning the address space for my current process? How would you recommend adjusting the pointer -- if at all -- to address the parent process address space?
edit: Ok sorry for the confusion and I appreciate the help. We can not use proc file system, because the assignment is intended for us to learn about signals.
So, basically I'm going to try to read and then write to an address in each page of memory to test whether it is R, RW or not accessible. To see if I was successful I will be listening for certain signals -- I'm not sure how to go about that part yet. I will be creating a linked list of structures to represent the accessibility of the memory. The program will be compiled as a 32-bit program.
With respect to parent process and child process: the exact text states
When called, the function will scan the entire memory area of the calling process...
Perhaps I am mistaken about the child and parent interaction, due to the fact we've been covering this (fork function etc.) in class, so I assumed that my function would be scanning a parent process. I'm going to be asking for clarification from the prof.
So, judging from this picture I'm just going to start from 0x0.
From a userland process's perspective, its address space starts at address 0x0, but not every address in that space is valid or accessible for the process. In particular, address 0x0 itself is normally not a valid address. If a process attempts to access memory (in its address space) that is not actually assigned to that process, then a segmentation fault results.
You could actually use the segmentation fault behavior to help you map out what parts of the address space are in fact assigned to the process. Install a signal handler for SIGSEGV, and skip through the whole space, attempting to read something from somewhere in each page. Each time you trap a SIGSEGV you know that page is not mapped for your process. Go back afterward and scan each accessible page.
Do only read, however. Do not attempt to write to random memory, because much of the memory accessible to your programs is the binary code of the program itself and of the shared libraries it uses. Not only do you not want to crash the program, but also much of that memory is probably marked read-only for the process.
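A minimal sketch of such a read-only probe is below. The names probe_env, on_fault and page_is_readable are made up for illustration, and the use of sigsetjmp/siglongjmp to get out of the handler is one possible way to recover, not necessarily what the assignment expects.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to recover from a faulting probe. */
static sigjmp_buf probe_env;

/* Signal handler: abandon the faulting read and return to the probe. */
static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if one byte at addr can be read, 0 if the read faults.
 * Not thread-safe: it uses a global jump buffer and process-wide handlers. */
static int page_is_readable(const void *addr)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int readable = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* sigsetjmp(..., 1) saves the signal mask, so siglongjmp restores it
     * and SIGSEGV is not left blocked after the first fault. */
    if (sigsetjmp(probe_env, 1) == 0) {
        (void)*(const volatile unsigned char *)addr; /* the probe itself */
        readable = 1;
    }

    /* Put the previous handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return readable;
}
```

Calling page_is_readable() once per page, stepping by the page size, gives the map of accessible pages described above. Installing the handlers once around the whole scan instead of per call would be cheaper; the per-call version just keeps the sketch self-contained.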
EDIT: Generally speaking, a process can only access its own (virtual) address space. As @cmaster observed, however, there is a syscall (ptrace()) that allows some processes access to some other processes' memory in the context of the observed process's address space. This is how general-purpose debuggers usually work.
You could read (from your program) the /proc/self/maps file. Try first the following two commands in a terminal
cat /proc/self/maps
cat /proc/$$/maps
(at least to understand what the address space looks like)
Then read proc(5), mmap(2) and of course wikipages about processes, address space, virtual memory, MMU, shared memory, VDSO.
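Reading /proc/self/maps from inside the program could look roughly like this. It is a sketch only: list_own_mappings is a hypothetical helper name, and it parses just the address range and the permission field of each line.

```c
#include <stdio.h>

/* Walks /proc/self/maps and returns the number of mappings found,
 * or -1 if the file cannot be opened. When print is nonzero, each
 * mapping's address range and permission string is printed. */
static int list_own_mappings(int print)
{
    char line[4096 + 128];   /* room for the fixed fields plus a long path */
    int count = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char perms[5];       /* e.g. "r-xp" plus the terminating NUL */

        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3) {
            if (print)
                printf("%#lx-%#lx %s\n", start, end, perms);
            count++;
        }
    }
    fclose(f);
    return count;
}
```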
If you want to share memory between two processes, read first shm_overview(7)
If you can't use /proc/ (which is a pity) consider mincore(2)
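As a rough sketch of the mincore(2) idea (page_is_mapped is a hypothetical name): mincore() fails with ENOMEM when the range contains unmapped pages, so it can tell mapped from unmapped pages without touching them. Note that it reports whether a page is mapped, not whether it is readable or writable.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page containing addr is mapped in the calling process,
 * 0 if it is not (mincore reports ENOMEM), and -1 on any other error. */
static int page_is_mapped(const void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    unsigned char vec;   /* one status byte per page; only one page here */
    uintptr_t page;

    if (page_size <= 0)
        return -1;

    /* mincore() requires a page-aligned start address. */
    page = (uintptr_t)addr & ~((uintptr_t)page_size - 1);

    if (mincore((void *)page, (size_t)page_size, &vec) == 0)
        return 1;
    return errno == ENOMEM ? 0 : -1;
}
```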
You could also, non-portably, try reading from some address (and perhaps rewriting the same value through a volatile int*) while catching the SIGSEGV signal (with a sigsetjmp(3) before the access and a siglongjmp from the signal handler), and do that in a dichotomic loop, in multiples of 4 Kbytes, between some sane start and end addresses (certainly not from 0, but probably from (void*)0x10000 and up to (void*)0xffffffffff600000)
See signal(7).
You could also use the Linux (GNU libc) specific dladdr(3). Look also into ptrace(2) (which is usually used from some other process).
Also, you could study elf(5) and read your own executable ELF file. Canonically it is /proc/self/exe (a symlink), but you should be able to get its path from the argv[0] of your main (perhaps with the convention that your program should be started with its full path name).
Be aware of ASLR and disable it if your teacher permits that.
PS. I cannot figure out what your teacher is expecting from you.
It is a bit more difficult than it seems at first sight. In Linux every process has its own memory space. Any arbitrary memory address you use points into the memory space of this process only. However, there are mechanisms which allow one process to access memory regions of another process. Certain Linux functions provide this shared memory feature. For example, take a look at examples of using shared memory under Linux with shmget, shmctl and other system calls. You can also look up the mmap system call, which is used to map a file into a process's memory, but can also be used to share memory with another process.
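A small sketch of the mmap route, assuming a parent and a forked child (share_int_with_child is a made-up name): a MAP_SHARED | MAP_ANONYMOUS mapping created before fork() stays shared, so a write by the child is visible to the parent. This uses mmap rather than the shmget family mentioned above.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that stores value into a shared anonymous mapping.
 * Returns what the parent reads back afterwards, or -1 on error
 * (so -1 itself is not a useful value to pass in). */
static int share_int_with_child(int value)
{
    int result;
    pid_t pid;
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;     /* child writes into the shared page */
        _exit(0);
    }

    waitpid(pid, NULL, 0);   /* parent waits until the child has written */
    result = *shared;
    munmap(shared, sizeof *shared);
    return result;
}
```

With MAP_PRIVATE instead of MAP_SHARED the child would write to its own copy-on-write copy and the parent would still read 0.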
I am just wondering why copy_from_user(to, from, bytes) does a real copy. Since it just wants the kernel to access user-space data, can't it directly map the physical address into the kernel's address space without moving the data?
Thanks.
copy_from_user() is usually used when writing certain device drivers. Note that there is no "mapping" of bytes here, the only thing that is happening is the copying of bytes from a certain virtual location mapped in user-space to bytes in a location in kernel-space. This is done to enforce separation of kernel and user and to prevent any security flaws -- you never want the kernel to start accessing and reading arbitrary user memory locations or vice-versa. That is why arguments and results from syscalls are copied to/from the user before they actually run.
"Before this it's better to know why copy_from_user() is used"
Because the kernel never dereferences a pointer supplied by a user-space application directly: if the memory pointed to is invalid, or a fault occurs while reading, a user-space application could make the kernel panic just by passing a bad pointer.
"And that's why!!!!!!"
So with copy_from_user, the worst that happens is that an error is returned to the user, and kernel functionality is not affected.
Even though it takes extra effort, it ensures the safe and secure operation of the kernel.
copy_from_user() does a few checks before it starts copying data. Directly manipulating data from user-space is never a good idea because it exists in a virtual address space which might get swapped out.
http://www.ibm.com/developerworks/linux/library/l-kernel-memory-access/
One of the major requirements in a system call implementation is to check the validity of user pointers passed as arguments. The kernel should not blindly follow a user pointer, as it can play tricks in many ways. The major concerns are:
1. It should be a pointer from that process's address space, so that it can't reach into some other process's address space.
2. It should be a pointer from user space, so that it cannot be used to trick the kernel into touching kernel space.
3. It should not bypass memory access restrictions.
That is why copy_from_user() is used. It can block: the process sleeps until the page fault handler brings the page from the swap file into physical memory.
I was stracing some common commands on Linux, and saw that mprotect() was used many times. I'm just wondering, how does mprotect() decide that the memory address it is setting a protection value for is in its own address space?
On architectures with an MMU [1], the address that mprotect() takes as an argument is a virtual address. Each process has its own independent virtual address space, so there are only two possibilities:
The requested address is within the process's own address range; or
The requested address is within the kernel's address range (which is mapped into every process).
mprotect() works internally by altering the flags attached to a VMA [2]. The first thing it must do is look up the VMA corresponding to the address that was passed. If the passed address is within the kernel's address range, then there is no VMA, and so this search will fail. This is exactly what happens if you try to change the protections on an area of the address space that is not mapped.
You can see a representation of the VMAs in a process's address space by examining /proc/<pid>/smaps or /proc/<pid>/maps.
[1] Memory Management Unit
[2] Virtual Memory Area, a kernel data structure describing a contiguous section of a process's memory.
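The failed lookup can be observed from user space. The sketch below (try_protect_readonly is a hypothetical helper) calls mprotect() on one page and reports the errno; on Linux an address with no mapping behind it yields ENOMEM.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Tries to make the page starting at addr read-only.
 * addr must be page-aligned. Returns 0 on success, otherwise the
 * errno value reported by mprotect() (or by sysconf() failing). */
static int try_protect_readonly(void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0)
        return EINVAL;
    if (mprotect(addr, (size_t)page_size, PROT_READ) == 0)
        return 0;
    return errno;
}
```

For example, calling it on a page obtained from mmap() returns 0, while calling it again on the same address after munmap() returns ENOMEM, since no VMA covers that address any more.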
This is about virtual memory, and about the dynamic linker/loader. Most mprotect(2) syscalls you see in the trace are probably related to bringing in library dependencies, though the malloc(3) implementation might call it too.
Edit:
To answer your question in comments - the MMU and the code inside the kernel protect one process from the other. Each process has an illusion of a full 32-bit or 64-bit address space. The addresses you operate on are virtual and belong to a given process. Kernel, with the help of the hardware, maps those to physical memory pages. These pages could be shared between processes implicitly as code, or explicitly for interprocess communications.
The kernel looks up the address you pass to mprotect in the current process's page table. If it is not in there then it fails. If it is in there the kernel may attempt to mark the page with new access rights. The kernel can still return an error here if there is some special reason that the access cannot be granted, such as trying to make a shared mapping of a file writable when the file was actually opened read-only; mprotect(2) documents EACCES for this case.
Keep in mind that the page table that the processor uses to determine if an area of memory is accessible is not the one that the kernel used to look up that address. The processor's table may have holes in it for things like pages that are swapped out to disk. The tables are related, but not the same.
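The read-only file case can be tried out with a sketch like the following (try_write_upgrade is a made-up name, and the path passed in is whatever readable file is at hand). It maps a file shared through a descriptor opened O_RDONLY and then asks mprotect() for write access.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps the first page of path read-only and shared, through a descriptor
 * opened O_RDONLY, then asks mprotect() for write access.
 * Returns 0 if mprotect() succeeded, its errno value if it failed,
 * or -1 if the file could not be opened or mapped at all. */
static int try_write_upgrade(const char *path)
{
    long page_size = sysconf(_SC_PAGESIZE);
    int fd, rc;
    void *p;

    if (page_size <= 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    p = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);               /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    rc = (mprotect(p, (size_t)page_size, PROT_READ | PROT_WRITE) == 0)
             ? 0 : errno;
    munmap(p, (size_t)page_size);
    return rc;
}
```

The expectation on Linux is EACCES: the address is mapped, so the lookup succeeds, but the mapping was created from a read-only descriptor and cannot be upgraded to writable.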