I need to replace a standard system call (e.g. SYS_mkdir) with my own implementation.
As I have read in several sources, including this question on Stack Overflow, sys_call_table has not been an exported symbol since kernel version 2.6.
I tried the following code:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/unistd.h>
#include <asm/syscall.h>
int (*orig_mkdir)(const char *path);
....
int init_module(void)
{
orig_mkdir=sys_call_table[__NR_mkdir];
sys_call_table[__NR_mkdir]=own_mkdir;
printk("sys_mkdir replaced\n");
return(0);
}
....
Unfortunately, I get a compiler error:
error: assignment of read-only location ‘sys_call_table[83]’
How can I replace the system call?
EDIT: Is there any solution without kernel patching?
This works for me. See "Linux Kernel: System call hooking example" and https://bbs.archlinux.org/viewtopic.php?id=139406
static unsigned long **sys_call_table; /* filled in at load time by aquire_sys_call_table() */
asmlinkage long (*ref_sys_open)(const char __user *filename, int flags, umode_t mode);
asmlinkage long new_sys_open(const char __user *filename, int flags, umode_t mode)
{
return ref_sys_open(filename, flags, mode);
}
static unsigned long **aquire_sys_call_table(void)
{
unsigned long int offset = PAGE_OFFSET;
unsigned long **sct;
while (offset < ULLONG_MAX) {
sct = (unsigned long **)offset;
if (sct[__NR_close] == (unsigned long *) sys_close)
return sct;
offset += sizeof(void *);
}
printk(KERN_ERR "Getting syscall table failed. :(\n");
return NULL;
}
// Crazy copypasted asm stuff. Could use linux function as well...
// but this works and will work in the future they say.
static void disable_page_protection(void)
{
unsigned long value;
asm volatile("mov %%cr0, %0" : "=r" (value));
if(!(value & 0x00010000))
return;
asm volatile("mov %0, %%cr0" : : "r" (value & ~0x00010000));
}
static void enable_page_protection(void)
{
unsigned long value;
asm volatile("mov %%cr0, %0" : "=r" (value));
if((value & 0x00010000))
return;
asm volatile("mov %0, %%cr0" : : "r" (value | 0x00010000));
}
static int __init rootkit_start(void)
{
//Hide me
printk(KERN_INFO "loaded\n");
if(!(sys_call_table = aquire_sys_call_table()))
return -1;
disable_page_protection();
{
ref_sys_open = (void *)sys_call_table[__NR_open];
sys_call_table[__NR_open] = (unsigned long *)new_sys_open;
}
enable_page_protection();
return 0;
}
static void __exit rootkit_end(void)
{
printk(KERN_INFO "exiting\n");
if(!sys_call_table) {
return;
}
disable_page_protection();
{
sys_call_table[__NR_open] = (unsigned long *)ref_sys_open;
}
enable_page_protection();
}
Yes, there is a solution that does not require patching or rebuilding the kernel: use the Kprobes infrastructure (or SystemTap).
This allows you to place "probes" (functions) at any point within the kernel, using a kernel module.
Doing the same thing by modifying sys_call_table is now prevented (the table is read-only) and is considered a dirty hack. Kprobes, Jprobes and related mechanisms are the "clean" way to do it. The documentation and samples provided in the kernel source tree are also excellent (see Documentation/kprobes.txt).
The problem is that sys_call_table is read-only. To avoid the error, you have to make it writable before modifying it. The kernel provides a function for this: set_memory_rw().
Add the following line before modifying sys_call_table:
set_memory_rw((unsigned long)sys_call_table, 1);
In the exit function of the kernel module, do not forget to set sys_call_table back to read-only:
set_memory_ro((unsigned long)sys_call_table, 1);
First, you need to determine the location of sys_call_table. See here.
Before writing to the system call table you have just located, you have to make its memory pages writable. For that, check here, and if that doesn't work, try this.
Use the LSM infrastructure.
Look at the LSM hooks path_mkdir or inode_mkdir for details. One question that needs to be solved is how to register your own LSM module when the system doesn't explicitly allow it. See the answer here for details:
How can I implement my own hook function with LSM?
Related
Is it possible to run M-mode (machine-level) instructions of the RISC-V ISA from userspace? I need to read the misa CSR, which provides CPU information, from a userspace C program, but according to the RISC-V documentation it is only accessible at the machine privilege level. Is it completely impossible, or is there a workaround?
I've already tried
#include <stdio.h>
static unsigned long cpuid()
{
unsigned long res;
__asm__ volatile ("csrr %0, misa" : "=r"(res));
return res;
}
int main() {
unsigned long mcpuid = cpuid();
printf("mcpuid: %lu", mcpuid);
return 0;
}
Running it on RISC-V Fedora, I get "Illegal instruction".
I am developing a linux kernel module, which looks like this:
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Me");
MODULE_DESCRIPTION("Something Something");
int checkSomething(void) {
int someCpuFeature = 0;
__asm__("mov $1, %eax");
__asm__("cpuid");
__asm__("mov %%ecx, %0" : "=r" (someCpuFeature));
if (someCpuFeature & 32) {
return 1;
}
return 0;
}
int __init init_module(void) {
if (!checkSomething()) {
printk(KERN_INFO "Exiting\n");
return 0;
} else {
printk(KERN_INFO "Continuing\n");
}
return 0;
}
static void __exit exit_module(void) {
printk(KERN_INFO "Unloading Module\n");
}
When I loaded it, I checked its output with dmesg.
Instead of printing only Exiting/Continuing, it also printed a call trace
and said BUG: scheduling while atomic: insmod/24641/0x06100800.
I searched for this bug and found that it is related to the scheduler and to sleeping in places where you shouldn't sleep, but this is the only functionality of the code,
so I think it has something to do with the cpuid instruction, though I don't know exactly what.
Any ideas?
Hm, considering the interrupts raised on my machine now, it would look like your assembler is incorrect. The three separate asm statements modify EAX, EBX, ECX and EDX without declaring them as outputs or clobbers, so the compiler is free to keep its own values in those registers, and those values get corrupted.
Instead of writing this yourself, I'd recommend relying on the cpuid kernel module (and thus reading /dev/cpu/NUMBER/cpuid after seeking to your level of interest), as that's a future-proof external API. You can also look at arch/x86/include/asm/processor.h and use kernel functions like cpu_has to detect your feature. Don't reinvent the wheel: querying details of a CPU on an SMP machine is bound to be painful, and the poor kernel developers had to go through the pain to make this work themselves.
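As a rough userspace sketch of the /dev/cpu/NUMBER/cpuid approach: the device takes the file offset as the CPUID leaf and returns 16 bytes (EAX, EBX, ECX, EDX) per read. This assumes the cpuid module is loaded and the device is readable by the caller. The names read_cpuid_leaf, struct cpuid_regs and leaf1_ecx_bit5_set are hypothetical helpers made up for this example, not kernel API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Registers of one CPUID leaf, in the order the cpuid device returns them. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical helper: read one CPUID leaf from a cpuid device node,
 * e.g. "/dev/cpu/0/cpuid". Returns 0 on success, -1 on failure. */
int read_cpuid_leaf(const char *dev_path, uint32_t leaf, struct cpuid_regs *out)
{
    assert(dev_path != NULL && out != NULL);

    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset selects the leaf; one read yields 16 bytes. */
    ssize_t n = pread(fd, out, sizeof(*out), (off_t)leaf);
    close(fd);
    return n == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* Bit 5 of ECX in leaf 1 is the bit the question tests with `& 32`
 * (on Intel CPUs this is the VMX flag). */
int leaf1_ecx_bit5_set(const struct cpuid_regs *regs)
{
    return (regs->ecx & (1u << 5)) != 0;
}
```

Calling read_cpuid_leaf("/dev/cpu/0/cpuid", 1, &regs) and then leaf1_ecx_bit5_set(&regs) performs the same check as checkSomething(), but from userspace and without hand-written asm.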
I would like to read MSR 0x19a (IA32_CLOCK_MODULATION) directly from C code with root privileges. However, I get the following segfault:
a.out[27843] general protection ip:40053b sp:7fffefc38020 error:0 in a.out[400000+1000]
Does anyone know whether this way of calling rdmsr is a viable option?
Thanks in advance!
#include <stdio.h>
#define __init
typedef unsigned uint32_t;
static int __init test3_init(void)
{
uint32_t hi,lo;
hi=0x0; lo=0x0;
asm volatile("mov $0x19a,%ecx");
asm volatile("rdmsr":"=a"(lo),"=d"(hi));
printf("exit_readmsr: hi=%08x lo=%08x\n",hi,lo);
return 0;
}
int main(void)
{
return test3_init();
}
BTW, the code is extracted from this answer.
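This instruction must be executed at privilege level 0. In other words, you must be inside the kernel. Running as root does not help: a root process still executes at privilege level 3, so RDMSR raises a general protection fault.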
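If the goal is only to read the value, one option from userspace is the msr driver, which exposes /dev/cpu/N/msr: the file offset is the MSR number and each read returns 8 bytes. The following is a minimal sketch under the assumption that the msr module is loaded and the process is allowed to open the device. read_msr, msr_hi and msr_lo are hypothetical helper names chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: read one 64-bit MSR through an msr device node,
 * e.g. "/dev/cpu/0/msr". Returns 0 on success, -1 on failure. */
int read_msr(const char *dev_path, uint32_t reg, uint64_t *value)
{
    assert(dev_path != NULL && value != NULL);

    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset selects the MSR; one read yields 8 bytes. */
    ssize_t n = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}

/* Split the value the way RDMSR reports it: EDX is the high half,
 * EAX is the low half. */
uint32_t msr_hi(uint64_t value) { return (uint32_t)(value >> 32); }
uint32_t msr_lo(uint64_t value) { return (uint32_t)(value & 0xFFFFFFFFu); }
```

With this, read_msr("/dev/cpu/0/msr", 0x19a, &v) followed by msr_hi(v) and msr_lo(v) gives the same hi/lo pair the question tries to print.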
I am developing an Android application that uses the NDK. While compiling the files, I got the error: selected processor does not support `qadd16 r1,r1,r0'. Can anyone explain why and where this error occurs and how to deal with it? Here is the code snippet from my basic_op.h file:
static inline Word32 L_add(register Word32 ra, register Word32 rb)
{
Word32 out;
__asm__("qadd %0, %1, %2"
: "=r"(out)
: "r"(ra), "r"(rb));
return (out);
}
Thanks in advance
This happens because the QADD instruction is not supported on your target architecture (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211h/Chddhfig.html). To compile this code, you need to enable ARMv7 support in the NDK.
Add the line
APP_ABI := armeabi-v7a
to your Application.mk and this code will compile perfectly:
static inline unsigned int L_add(register unsigned int ra, register unsigned int rb)
{
unsigned int out;
__asm__("qadd %0, %1, %2"
: "=r"(out)
: "r"(ra), "r"(rb));
return (out);
}
P.S. I am using Android NDK r8.
P.P.S. Why do you need this ugly assembly? The output assembly listing for:
static inline unsigned int L_add(register unsigned int ra, register unsigned int rb)
{
return (ra > 0xFFFFFFFF - rb) ? 0xFFFFFFFF : ra + rb;
}
still looks reasonably efficient, and it is much more portable!
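One caveat, offered as a hedged side note: QADD saturates as a signed 32-bit value, whereas the comparison against 0xFFFFFFFF clamps an unsigned sum. If Word32 is a signed 32-bit type (an assumption, since its definition is not shown here), a portable equivalent could look like the sketch below. sat_add32 and sat_add32_examples are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable signed saturating 32-bit add: the sum is computed in 64 bits
 * and clamped to the int32_t range instead of wrapping around. */
static inline int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}

/* A few cases that document the intended behaviour. */
void sat_add32_examples(void)
{
    assert(sat_add32(1, 2) == 3);
    assert(sat_add32(INT32_MAX, 1) == INT32_MAX);   /* clamps at the top */
    assert(sat_add32(INT32_MIN, -1) == INT32_MIN);  /* clamps at the bottom */
}
```

Widening to 64 bits keeps the overflow check free of undefined behaviour, at the cost of relying on the compiler to recognise the pattern if a single saturating instruction is wanted.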
For some special reasons (please don't ask me why), for some functions, I want to use a separate stack. So for example, say I want the function malloc to use a different stack for its processing, I need to switch to my newly created stack before it is called and get back to the original stack used by the program after it finishes. So the algorithm would be something like this.
switch_to_new_stack
call malloc
switch back to the original stack
What is the easiest and most efficient way of doing this? Any idea?
It probably doesn't fit your definition of easy or efficient, but the following could be one way to do it:
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>
/* utility functions */
static void getctx(ucontext_t* ucp)
{
if (getcontext(ucp) == -1) {
perror("getcontext");
exit(EXIT_FAILURE);
}
}
static void print_sp()
{
#if defined(__x86_64)
unsigned long long x; asm ("mov %%rsp, %0" : "=m" (x));
printf("sp: %p\n",(void*)x);
#elif defined(__i386)
unsigned long x; asm ("mov %%esp, %0" : "=m" (x));
printf("sp: %p\n",(void*)x);
#elif defined(__powerpc__) && defined(__PPC64__)
unsigned long long x; asm ("addi %0, 1, 0" : "=r" (x));
printf("sp: %p\n",(void*)x);
#elif defined(__powerpc__)
unsigned long x; asm ("addi %0, 1, 0" : "=r" (x));
printf("sp: %p\n",(void*)x);
#else
printf("unknown architecture\n");
#endif
}
/* stack for 'my_malloc', size arbitrarily chosen */
static int malloc_stack[1024];
static ucontext_t malloc_context; /* context malloc will run in */
static ucontext_t current_context; /* context to return to */
static void my_malloc(size_t sz)
{
printf("in my_malloc(%zu) ", sz);
print_sp();
}
void call_my_malloc(size_t sz)
{
/* prepare context for malloc */
getctx(&malloc_context);
malloc_context.uc_stack.ss_sp = malloc_stack;
malloc_context.uc_stack.ss_size = sizeof(malloc_stack);
malloc_context.uc_link = &current_context;
makecontext(&malloc_context, (void(*)())my_malloc, 1, sz);
if (swapcontext(&current_context, &malloc_context) == -1) {
perror("swapcontext");
exit(EXIT_FAILURE);
}
}
int main()
{
printf("malloc_stack = %p\n", (void*)malloc_stack);
printf("in main ");
print_sp();
call_my_malloc(42);
printf("in main ");
print_sp();
return 0;
}
This should work on all platforms where makecontext(3) is supported. Quoting from the manpage (where I also got the inspiration for the example code):
The interpretation of ucp->uc_stack is just as in sigaltstack(2), namely, this struct contains the start and length of a memory area to be used as the stack, regardless of the direction of growth of the stack. Thus, it is not necessary for the user program to worry about this direction.
Sample output from PPC64:
$ gcc -o stack stack.c -Wall -Wextra -W -ggdb -std=gnu99 -pedantic -Werror -m64 && ./stack
malloc_stack = 0x10010fe0
in main sp: 0xfffffe44420
in my_malloc(42) sp: 0x10011e20
in main sp: 0xfffffe44420
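A variation on the same idea, as a rough sketch: the alternate stack is allocated with malloc, and the callee records the address of one of its locals so the caller can check that it really ran on that stack, without any architecture-specific asm. run_on_alt_stack, callee and ALT_STACK_SIZE are names made up for this illustration, and it assumes a platform where makecontext(3) is available.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

#define ALT_STACK_SIZE (64 * 1024)

static ucontext_t caller_ctx;  /* context to resume when the callee returns */
static ucontext_t callee_ctx;  /* context that runs on the alternate stack */
static uintptr_t observed_sp;  /* address of a local seen inside the callee */

/* Runs on the alternate stack and records where its locals live. */
static void callee(void)
{
    volatile int local = 0;
    observed_sp = (uintptr_t)&local;
}

/* Runs callee() on a freshly allocated stack.
 * Returns 1 if the callee's local was inside that stack, 0 if not,
 * and -1 if setting up or switching the context failed. */
int run_on_alt_stack(void)
{
    char *stack = malloc(ALT_STACK_SIZE);
    if (stack == NULL)
        return -1;

    if (getcontext(&callee_ctx) == -1) {
        free(stack);
        return -1;
    }
    callee_ctx.uc_stack.ss_sp = stack;
    callee_ctx.uc_stack.ss_size = ALT_STACK_SIZE;
    callee_ctx.uc_link = &caller_ctx;  /* resume here when callee returns */
    makecontext(&callee_ctx, callee, 0);

    observed_sp = 0;
    if (swapcontext(&caller_ctx, &callee_ctx) == -1) {
        free(stack);
        return -1;
    }
    assert(observed_sp != 0);  /* the callee has run */

    uintptr_t lo = (uintptr_t)stack;
    int inside = observed_sp >= lo && observed_sp < lo + ALT_STACK_SIZE;
    free(stack);
    return inside;
}
```

Allocating the stack on the heap makes its size easy to tune per call, but the contexts here are static, so this sketch is not reentrant or thread-safe.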
GCC supports split stacks, which work a bit like what you described.
http://gcc.gnu.org/wiki/SplitStacks
The goal of the project is different, but the implementation will do what you ask.
The goal of split stacks is to permit a discontiguous stack which is grown automatically as needed. This means that you can run multiple threads, each starting with a small stack, and have the stack grow and shrink as required by the program. It is then no longer necessary to think about stack requirements when writing a multi-threaded program. The memory usage of a typical multi-threaded program can decrease significantly, as each thread does not require a worst-case stack size. It becomes possible to run millions of threads (either full NPTL threads or co-routines) in a 32-bit address space.