I am trying to override the kill command In my module by replacing the pointer stored in sys_call_table, to a pointer to function i implemented.
I used the kallsyms_lookup_name() syscall to get the address of the table, and lookup_address() to get the address of the page.
now, my problem is the kernel protecting the page with r/w flag.
So, given a virtual page address, is there a way to modify the r/w flag?
Do you need this?
https://elixir.bootlin.com/linux/v4.3/source/arch/arm/mm/pageattr.c#L68\
At least, you can change your memory attribute for kernel module. Also, you can change your kernel text memory attribute as well.
As far as I tested, if we try to write ro memory in the kernel module, MMU will generate exception and you will see "invalid virtual memory" access.
You have to pass virtual memory address.
Related
Recently I am study vdso in linux. I tried to modify the data in vvar section but it failed. Following is what I've tried.
According to lwn described, there're two address for vvar:
The first one is a regular kernel-space address whose value is greater than PAGE_OFFSET. If you look at the System.map file, you'll find that this symbol has an address like ffffffff81c76080.
The second address is in a region called the "vvar page". The base address (VVAR_ADDRESS) of this page is defined in the kernel to be at 0xffffffffff5ff000, close to the end of the 64-bit address space. This page is made available read-only to user-space code.
Modify by first address: First address is easy to find by checking the address of _vdso_data in /boot/System.map-kernel-version. After getting the address, I modify the member (__unused) of it in kernel module ( thus, there's not permission issue ). Then, I check the __unused from user space program, it still remain 0 which means that I failed to modify it from kernel space.
Modify by second address: Second address can be found by auxiliary vector. After getting the address of vdso, then we can find vvar section. I pass this address into kernel module and modify its member __unused. The the error occur, showing the permission error issue. ( The reason should be vvar is read-only memory, checking by cat /proc/pid/maps )
I think the first way is almost at the solution but it seems vvar at that address is not be mapped to all process' vvar section. Are there any idea? Thanks in advance.
[EDIT]: The first way get more closer to solution. I modify the cr0 bit, thus allow permission of writing. Even I successfully write it, it still can't be read from linux kernel (by getting vdso_data through __arch_get_k_vdso_data and accessing its member which I modify in user space + kernel module previously)
The following is an excerpt from my simple driver code.
int vprobe_ioctl( struct file *filep, unsigned int cmd, void *UserInp)
{
case IOCTL_GET_MAX_PORTS:
*(int*)UserInp = TotalPorts;
#if ENABLED_DEBUG
printk("Available port :%u \n ", TotalPorts);
#endif
break;
}
I was not aware about the function copy_to_user which should be used while writing on user space memory. The code directly accesses the user address. But still I am not getting any kernel crash in my development system(x86_64 architecture). It works as expected.
But sometimes I could see kernel crash when I insert the .ko file in some other x86_64 machines. So, I replaced direct accessing with copy_to_user, and it works.
Could anyone please explain,
i) How direct accessing of user address works?
ii) Why am I seeing kernel crash in some systems whereas it works well in some other systems. Is there any kernel configuration mismatch between the systems because of which the kernel could access the user process's virtual address directly?
Note : All the systems I have used have same OS and kernel.-same image generated thru kickstart. - There is no possibility of any differences.
Thanks in advance.
would be interesting to see the crash. now what I'm saying is an assumption based on my knowledge about how the memory works.
user space memory is virtual. it means that the specific process address X is now located on some physical memory, this physical memory is a memory page that is currently allocated to your process. copy to user first checks that the memory given really belongs to the process and other security checks. beside that there is mapping issues.
the kernel memory has its own address space that need to map virtual to physical address. the kernel use the help of mmu (this is different per architecture). In x86 the mapping between the kernel virtual and user virtual is 1:1 (there are different issues here). In other system this is not always true.
I try to understand the mechanism in Linux of mapping kernel mode space into user mode space using mmap.
First I have a loadable kernel module (LKM) which provides a character device with mmap-functionality. Then a user space application open the device and calls mmap the LKM allocate memory space on the heap of the LKM inside the kernel mode space (virtual high address). On user space side the data pointer points to a virtual low address.
The following picture shows how I imagine the anatomy of memory is. Is this right?
Please let me know if question is not clear, I will try to add more details.
Edit: The picture was edited regarding to Gil Hamilton. The black arrow now points to a physical address.
The drawing is missing out a few important underlying assumptions.
The kernel does not need to mmap() to access user space memory. If a user process has the memory, it's already mapped in the address space by definition. In that sense, the memory is already shared between user and kernel.
mmap() creates a new region in user's virtual address space, so that the address region can be populated by physical memory if later accessed. The actual allocation of memory and modifying the page table entry is done by the kernel.
mmap() only makes sense for managing user-half of the virtual address space. Kernel-half of the address space is managed completely differently.
Also, the kernel-half is shared by all processes in the system. Each process has its dedicated virtual address space, but the page tables are programmed in such a way that the page table entries for the kernel-half are set exactly the same for all processes.
Again, the kernel does not mmap() in order to access user space memory. mmap() is rather a service provided by kernel to user to modify the current mapping in user's virtual address space.
BTW, the kernel actually has a few ways to access user memory if it wants to.
First of all, the kernel has a dedicated region of kernel address space (as part of its kernel space) which maps the entirety of the physical memory present in consecutive fashion. (This is true in all 64-bit system. In 32-bit system the kernel has to 'remap' on-the-fly to achieve this.)
Second, if the kernel is entered via a system call or exception, not by hardware interrupt, you have valid process context, so the kernel can directly "dereference" user space pointer to get the correct value.
Third, if kernel wants to deference a user space pointer of a process while executing in a borrowed context such as in an interrupt handler, kernel can trace process's virtual address by traversing the vm_area_struct tree for permission and walking the page table to find out actual physical page frame.
You can check the memory regions by iterating through vma's "struct vm_area_struct" through current.
If you walk pagetables and derive mapped physical addresses for virtual addresses which is not related to user space then memory layout will be more clear.
Apart from this minor correction in this figure,
BSS is not a segment but section which is embed to Data segment, refer ELF specification for more details, linker script
I am trying to use the mamp() functionality provided in linux-kernel.
As we call mmap() in user-space we try to map virtual memory area of user-space process to the memory in the kernel-space.
the definition of mamp() inside kernel is done in my kernel module which try to allocate some memory in pages & maps it during mmap system call. The memory content of this kernel-space memory could be filled by this module.
The question i want to ask is that after memory mapping the user-space process could access the mapped memory directly with-out any extra kernel overload so there will be no system-call like read() but if the memory(allocated inside kernel-space & mapped in the kernel-space) is containing the pointer to other memory(not mapped) allocated inside the kernel-space then could the user-space process be able to access this unmapped memory with the help of mapped memory's content which are pointer to this unmapped memory.
No, userspace can't chase pointers in mapped memory that point to unmapped kernel memory.
No user-space process can not be able to access the unmapped memory. Kernel wont allow you to access that memory.
You are able to access only that portion of memory which is mapped via mmap.
I think use can use remap_pfn_range function explicitly to remapping the region.
From Linux mmap man page
The effect of changing
the size of the underlying file of a mapping on the pages that correspond to
added or removed regions of the file is unspecified.
No,you can't.
However,If your purpose is to change your mmaped area on the fly,Here are some options:
A. In user space, you can use mremap which expands (or shrinks) an existing memory mapping.
B. In kernel space,in your driver, you need to implement nopage() method or remap_pfn_range,but remap_pfn_range has its limitation which Linux only gives the reserved pages and you even cant remap normal address,such as the one allocated by get_free_page()
I was stracing some of the common commands in the linux kernel, and saw mprotect() was used a lot many times. I'm just wondering, what is the deciding factor that mprotect() uses to find out that the memory address it is setting a protection value for, is in its own address space?
On architectures with an MMU1, the address that mprotect() takes as an argument is a virtual address. Each process has its own independent virtual address space, so there's only two possibilities:
The requested address is within the process's own address range; or
The requested address is within the kernel's address range (which is mapped into every process).
mprotect() works internally by altering the flags attached to a VMA2. The first thing it must do is look up the VMA corresponding to the address that was passed - if the passed address was within the kernel's address range, then there is no VMA, and so this search will fail. This is exactly the same thing happens if you try to change the protections on an area of the address space that is not mapped.
You can see a representation of the VMAs in a process's address space by examining /proc/<pid>/smaps or /proc/<pid>/maps.
1. Memory Management Unit
2. Virtual Memory Area, a kernel data structure describing a contiguous section of a process's memory.
This is about virtual memory. And about dynamic linker/loader. Most mprotect(2) syscalls you see in the trace are probably related to bringing in library dependencies, though malloc(3) implementation might call it too.
Edit:
To answer your question in comments - the MMU and the code inside the kernel protect one process from the other. Each process has an illusion of a full 32-bit or 64-bit address space. The addresses you operate on are virtual and belong to a given process. Kernel, with the help of the hardware, maps those to physical memory pages. These pages could be shared between processes implicitly as code, or explicitly for interprocess communications.
The kernel looks up the address you pass mprotect in the current process's page table. If it is not in there then it fails. If it is in there the kernel may attempt to mark the page with new access rights. I'm not sure, but it may still be possible that the kernel would return an error here if there were some special reason that the access could not be granted (such as trying to change the permissions of a memory mapped shared file area to writable when the file was actually read only).
Keep in mind that the page table that the processor uses to determine if an area of memory is accessible is not the one that the kernel used to look up that address. The processor's table may have holes in it for things like pages that are swapped out to disk. The tables are related, but not the same.