I'm new to kernel modules and currently experimenting with them.
I've read that they have the same level access as the kernel itself.
Does this mean they have access to physical memory and can see/overwrite
values of other processes (including the kernel memory space)?
I have written this simple C code to overwrite every memory address, but it's not doing anything. I expected the system to just crash, and I'm not sure whether this touches physical memory or still virtual memory.
I run it with sudo insmod ./test.ko. The command just hangs there (because of the infinite loop, of course), but the system works fine when I exit manually.
#include <linux/module.h>
#include <linux/kernel.h>
int init_module(void)
{
    unsigned char *p = 0x0;
    while (true) {
        *p = 0;
        p++;
    }
    return 0;
}
void cleanup_module(void)
{
    //
}
Kernel modules run with kernel privileges (including access to kernel memory and all peripherals). The reason your code isn't working is that you don't specify the module's init and exit functions. So you can load the module, but the kernel doesn't call your functions.
Please take a look at this example for a minimal kernel module. Here you will find some explanation about the needed macros.
From http://www.makelinux.net/ldd3/chp-7-sect-1.shtml
Needless to say, both jiffies and jiffies_64 must be considered
read-only
I wrote a program to verify and it successfully updates the jiffies value.
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/jiffies.h>
static int __init test_hello_init(void)
{
    jiffies = 0;
    pr_info("jiffies:%lu\n", jiffies);
    return 0;
}
static void __exit test_hello_exit(void)
{
}
MODULE_LICENSE("GPL");
module_init(test_hello_init);
module_exit(test_hello_exit);
This module successfully sets the jiffies to zero. Am I missing something?
What you are reading is merely a warning. It is an unwritten contract between you (the kernel module developer) and the kernel. You shouldn't modify the value of jiffies, since it is not up to you to do so: the kernel updates it according to a set of complicated rules that you should not worry about. The jiffies value is used internally by the scheduler, so bad things can happen if you modify it. Chances are that the variable you see in your module is only a thread-local copy of the real one, so modifying it could have no effect. In any case, you shouldn't do it. It is only provided to you as additional information that your module might need in order to implement some logic.
Of course, since you are working in C, there is no concept of "permissions" for variables. Anything that is mapped in a readable and writable region of memory can be modified; you could even modify data in read-only memory by changing the permissions first. You can do all sorts of bad stuff if you want. There are a lot of things you're not supposed to alter, even if you have the ability to do so.
I am learning about the vDSO and wrote a simple application that calls gettimeofday():
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    struct timeval current_time;
    if (gettimeofday(&current_time, NULL) == -1)
        printf("gettimeofday");
    getchar();
    exit(EXIT_SUCCESS);
}
ldd on the binary shows 'linux-vdso'
$ ldd ./prog
linux-vdso.so.1 (0x00007ffce147a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6ef9e8e000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6efa481000)
I did a find for the libvdso library and there is no such library present in my file system.
sudo find / -name 'linux-vdso.so*'
Where is the library present?
It's a virtual shared object that doesn't have any physical file on the disk; it's a part of the kernel that's exported into every program's address space when it's loaded.
Its main purpose is to make certain frequently used system calls cheaper to call (they would otherwise incur noticeable overhead). The most prominent one is gettimeofday(2).
You can read more about it here: http://man7.org/linux/man-pages/man7/vdso.7.html
find / -name '*vdso*.so*'
yields
/lib/modules/4.15.0-108-generic/vdso/vdso64.so
/lib/modules/4.15.0-108-generic/vdso/vdso32.so
/lib/modules/4.15.0-108-generic/vdso/vdsox32.so
linux-vdso.so is effectively a virtual link to whichever of these vdso*.so images matches the bitness of the process.
vDSO = virtual dynamic shared object
Note on vdsox32:
x32 is a Linux ABI which is kind of a mix between x86 and x64.
It uses 32-bit address size but runs in full 64-bit mode, including all 64-bit instructions and registers available.
Making system calls can be slow. In x86 32-bit systems, you can
trigger a software interrupt (int $0x80) to tell the kernel you
wish to make a system call. However, this instruction is
expensive: it goes through the full interrupt-handling paths in
the processor's microcode as well as in the kernel. Newer
processors have faster (but backward incompatible) instructions
to initiate system calls. Rather than require the C library to
figure out if this functionality is available at run time, the C
library can use functions provided by the kernel in the vDSO.
Note that the terminology can be confusing. On x86 systems, the
vDSO function used to determine the preferred method of making a
system call is named "__kernel_vsyscall", but on x86-64, the term
"vsyscall" also refers to an obsolete way to ask the kernel what
time it is or what CPU the caller is on.
One frequently used system call is gettimeofday(2). This system
call is called both directly by user-space applications as well
as indirectly by the C library. Think timestamps or timing loops
or polling—all of these frequently need to know what time it is
right now. This information is also not secret—any application
in any privilege mode (root or any unprivileged user) will get
the same answer. Thus the kernel arranges for the information
required to answer this question to be placed in memory the
process can access. Now a call to gettimeofday(2) changes from a
system call to a normal function call and a few memory accesses.
Also
You must not assume the vDSO is mapped at any particular location
in the user's memory map. The base address will usually be
randomized at run time every time a new process image is created
(at execve(2) time). This is done for security reasons, to prevent
"return-to-libc" attacks.
And
Since the vDSO is a fully formed ELF image, you can do symbol lookups
on it.
And also
If you are trying to call the vDSO in your own application rather than
using the C library, you're most likely doing it wrong.
as well as
Why does the vDSO exist at all? There are some system calls the
kernel provides that user-space code ends up using frequently, to
the point that such calls can dominate overall performance. This
is due both to the frequency of the call as well as the context-
switch overhead that results from exiting user space and entering
the kernel.
I want to create a loadable kernel module for Linux.
This is the code
#include <linux/module.h>
#include <linux/init.h>
static int __init mymodule_init(void)
{
    printk("My module worked!\n");
    return 0;
}
static void __exit mymodule_exit(void)
{
    printk("Unloading my module.\n");
    return;
}
module_init(mymodule_init);
module_exit(mymodule_exit);
MODULE_LICENSE("GPL");
Now pay attention to the __init macro. As the documentation says:
The __init macro indicates to compiler that that associated function
is only used during initialization. Compiler places all code marked
with __init into a special memory section that is freed after
initialization
I'm trying to understand why the initialization function could end up leaking memory. Is it due to the FIFO disposition of function calls on the stack?
In very broad strokes:
Executable code (what source code is compiled into) takes up memory. A modern CPU reads the section of memory where the instructions reside and executes them. For most user-space applications, the code segment of a process's memory is loaded once and never changes during program execution. The code is always there, unless programmers play around with it.
This isn't a problem, since the OS manages the process's virtual memory, and cold code pages will eventually be paged out of physical memory. Physical memory is never "wasted" like that in user space.
For the kernel, where code runs in privileged mode, nothing will "unload" unused pages as happens in user mode. If a function is placed into the kernel's regular code segment, it will take up physical memory for as long as the kernel runs, which can be quite a long time. If a function is only called once, that's quite a waste of space.
Now, while loadable kernel modules can be loaded and unloaded in general, so their code may not take up space indefinitely, it's still somewhat wasteful to take up space for a function that is only going to be called once.
Since modern CPUs treat code as a form of executable data, it's possible to place that data into a memory segment that is not retained indefinitely. The function is loaded, then called, and then the segment can be used for something else. This is what the __init macro instructs the compiler to do: emit code that can easily be unloaded after being called.
I intend to develop an application that monitors the traffic on particular ports. For this I need to list the sk_buff data of all the live sk_buffs in the system. How can I do this?
I have written the following code (basically a kernel module.)
#include <linux/module.h> /* Needed by all modules */
#include <linux/kernel.h> /* Needed for KERN_INFO */
#include <linux/skbuff.h>
int init_module(void)
{
    struct sk_buff *skb;
    printk(KERN_INFO "SKB 1.\n");
    return 0;
}
void cleanup_module(void)
{
    printk(KERN_INFO "Done 1.\n");
}
But I don't know how to catch the sk_buffs. I have simply declared an sk_buff variable, that's all.
Please help me actually catch the live sk_buffs in the system.
EDIT
I have tried all the top Google search results. They give a very good description of the sk_buff itself, but none of them actually shows how to do what I am particularly interested in.
There is no standardized way. By default, newly created skbs are not put into any list that you could read (that is, when they come fresh out of alloc_skb); therefore, there is no way to know which skbs are active from an arbitrary point in the kernel, such as your module. You have at least two options, though (both entail modifying core kernel code):
Since all skbuffs are allocated from a kmem_cache pool, you could augment the kmem_cache functionality by some function that tells you about all allocated objects.
Within the __alloc_skb function, add all newly allocated skbs into a data structure of your liking (and don't forget to remove them again when the skb is freed). This is going to be a major bottleneck, but that's what you have to pay.
As usual, the question: why?
Can we access any physical memory from kernel code? I ask because I wrote a device driver that only had init_module and exit_module. The code is as follows.
int init_module(void) {
    unsigned char *p = (unsigned char*)(0x10);
    printk( KERN_INFO "I got %u \n", *p);
    return 0;
}
and a dummy exit_module. The problem is that the computer hangs when I do lsmod.
What happens? Should I get some kind of permission to access the memory location?
Kindly explain. I'm a beginner!
To access real physical memory you should use the phys_to_virt function. If it is I/O memory (e.g. PCI memory), you should have a closer look at ioremap.
This whole topic is very complex; if you are a beginner, I would suggest some kernel/driver development books or documentation.
I suggest reading the chapter about memory in this book:
http://lwn.net/Kernel/LDD3/
It's available online for free. Good stuff!
Inside the kernel, memory is still mapped virtually, just not the same way as in userspace.
The chances are that 0x10 is in a guard page or something, to catch null pointers, so it generates an unhandled page fault in the kernel when you touch it.
Normally this causes an oops, not a hang (but it can be configured to cause a panic). An oops is an unexpected kernel condition that can be recovered from in some cases and does not necessarily bring down the whole system. Normally it kills the task (in this case, insmod).
Did you do this on a desktop Linux system with a GUI loaded? I recommend that you set up a Linux VM (VMware, VirtualBox, etc.) with a simple (i.e. quick to reboot) text-based distribution if you want to hack around with the kernel. You're going to crash it a bit, and you want it to reboot as quickly as possible. Also, with a text-based distribution it is easier to see kernel crash messages (oops or panic).