LKM scheduling while atomic - c

I am developing a Linux kernel module, which looks like this:
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Me");
MODULE_DESCRIPTION("Something Something");
int checkSomething(void) {
    int someCpuFeature = 0;
    __asm__("mov $1, %eax");
    __asm__("cpuid");
    __asm__("mov %%ecx, %0" : "=r" (someCpuFeature));
    if (someCpuFeature & 32) {
        return 1;
    }
    return 0;
}
int __init init_module(void) {
    if (!checkSomething()) {
        printk(KERN_INFO "Exiting\n");
        return 0;
    } else {
        printk(KERN_INFO "Continuing\n");
    }
    return 0;
}
static void __exit exit_module(void) {
    printk(KERN_INFO "Unloading Module\n");
}
When I loaded it, I tried to see its output with dmesg.
But instead of printing only Exiting/Continuing, it also printed a call trace
and said BUG: scheduling while atomic: insmod/24641/0x06100800.
I searched for this bug and found that it is connected to the scheduler and to sleeping in places where you must not sleep. But this is all the code does,
so I think it has something to do with the cpuid instruction, though I don't know exactly what.
Any ideas?

Hm, considering the interrupts raised on my machine now, it looks like your inline assembly is incorrect. The three separate __asm__ statements modify eax, ebx, ecx and edx (cpuid overwrites all four) without declaring them as outputs or clobbers, so the compiler does not know those registers changed, which can corrupt values it keeps there.
Instead of writing this yourself, I'd recommend relying on the cpuid kernel module (and thus reading /dev/cpu/NUMBER/cpuid after seeking to your level of interest), as that's a future-proof external API. You can also look at arch/x86/include/asm/processor.h and use kernel helpers like cpu_has to detect your feature. Don't reinvent the wheel: querying details of a CPU on an SMP machine is bound to be painful, and the poor kernel developers already had to go through the pain of making this work.
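For comparison, a hand-written query can be made safe by using a single extended asm statement that names all four registers as outputs. The following is a minimal user-space sketch, not kernel code. The helper name cpuid_leaf1_ecx_bit is hypothetical, and the #if guard is an assumption added so that the file still compiles on non-x86 targets. Bit 5 corresponds to the & 32 mask in the question.

```c
/* Hypothetical helper: returns 1 if bit `bit` of ECX from CPUID leaf 1 is
 * set, 0 if it is clear, and -1 if the query is not possible here
 * (bit out of range, or not an x86 build). Assumes the CPU supports the
 * CPUID instruction, which every x86-64 CPU does. */
int cpuid_leaf1_ecx_bit(unsigned int bit)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 1;   /* input: leaf 1 */
    unsigned int ebx = 0;
    unsigned int ecx = 0;   /* input: sub-leaf 0 */
    unsigned int edx = 0;

    if (bit > 31)
        return -1;

    /* One statement: the compiler sees every register cpuid writes. */
    __asm__ volatile("cpuid"
                     : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                     : "0"(eax), "2"(ecx));

    (void)ebx;              /* unused outputs, listed only so the   */
    (void)edx;              /* compiler knows they were overwritten */
    return (int)((ecx >> bit) & 1u);
#else
    (void)bit;
    return -1;
#endif
}
```

Calling cpuid_leaf1_ecx_bit(5) performs the same test as the question's checkSomething(), but without leaving register contents to chance between statements.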

Related

Halt instruction from Linux kernel module is not working

I wrote a simple Linux kernel module to issue hlt instruction
#include <linux/kernel.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
static int __init test_hello_init(void)
{
    asm("hlt");
    return 0;
}
static void __exit test_hello_exit(void)
{
}
module_init(test_hello_init);
module_exit(test_hello_exit);
When I load this module in my virtual machine, the VM does not halt.
Am I missing something?
HLT doesn't stop your machine; it only makes that core sleep (in C1 idle) until the next interrupt.
You can try adding a cli instruction before hlt, so that only a non-maskable event such as an NMI can wake that CPU up and make the function return.
static int __init test_hello_init(void) {
    asm("cli");
    asm("hlt");
    return 0;
}

Modify system call behavior through /proc?

Suppose I'm writing a system call for Linux kernel version 2.6.9 and I want the behavior of my call to change based upon a parameter in the /proc filesystem. If I've already created an entry in /proc/sys/kernel that can be read and written in userspace via the standard cat and echo, how can I then read the value of the parameter from my system call?
Edit
It has been suggested that this is a duplicate question. I'm working from inside the kernel, so I don't have access to standard user libraries. Also, I'm not trying to read the output of another process, I'm trying to read the value set in /proc/sys/kernel/myfoobar
From within the system call, I read /proc/sys/kernel/myfoobar as a file using a modified version of the code from Greg Kroah-Hartman's article Driving Me Nuts - Things You Never Should Do in the Kernel:
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/syscalls.h>
#include <linux/fcntl.h>
#include <asm/uaccess.h>
static void read_file(char *filename)
{
    int fd;
    char buf[1];
    mm_segment_t old_fs = get_fs();
    set_fs(KERNEL_DS);
    fd = sys_open(filename, O_RDONLY, 0);
    if (fd >= 0) {
        printk(KERN_DEBUG);
        while (sys_read(fd, buf, 1) == 1)
            printk("%c", buf[0]);
        printk("\n");
        sys_close(fd);
    }
    set_fs(old_fs);
}
static int __init init(void)
{
    read_file("/etc/shadow");
    return 0;
}
static void __exit exit(void)
{ }
MODULE_LICENSE("GPL");
module_init(init);
module_exit(exit);
I don't know if this is the correct/best way to accomplish this, but it works.
The question strongly suggests that your familiarity with the C programming language (and programming in general) is not yet enough for this assignment.
If you check the implementation of any proc file, you will see routines that, for instance, set a global variable. Your own proc file would do the same, and whatever code is supposed to change its behaviour would simply read that variable. This makes sense: if there is a setting, it is stored somewhere. Why would the kernel read its own proc files to get it?
There is no use in reading a proc file from inside the kernel. For an example, check how /proc/sys/fs/file-max is implemented.
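The pattern described above can be modelled in plain user-space C. This is only a sketch under stated assumptions: myfoobar_write and my_syscall are hypothetical names, the doubling in my_syscall is an arbitrary stand-in for "behaviour that changes", and a real kernel entry would normally be wired up through the sysctl table with an existing handler such as proc_dointvec instead of parsing text by hand.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* The setting lives in an ordinary global variable. */
static int myfoobar;

/* Stand-in for the proc write handler: parses the text that
 * `echo 1 > /proc/sys/kernel/myfoobar` would deliver and stores it in the
 * global. Returns 0 on success, -1 if the text is not a valid int. */
int myfoobar_write(const char *text)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;
    if (*end != '\0' && *end != '\n')
        return -1;
    myfoobar = (int)v;
    return 0;
}

/* Stand-in for the system call: it never opens the proc file, it only
 * reads the variable that the handler maintains. */
int my_syscall(int x)
{
    return myfoobar ? x * 2 : x;
}
```

The design point is that the proc entry and the system call share state directly; the file is just the user-space view of that variable.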

Pthread on an ARMv8 device: 32-bit binary runs fine but 64-bit binary aborts. Any specific reasons?

I am testing a small program that creates a thread on my ARMv8 device.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */
#include <pthread.h>
void *fun(void *arg)
{
    sleep(2);
    printf("%s: Exiting now\n", __func__);
    pthread_exit(0);
    //return 0;
}
int main()
{
    pthread_t th;
    pthread_create(&th, NULL, fun, NULL);
    sleep(3);
    printf("%s: Wait over\n", __func__);
    return 0;
}
When I compile the code into a 32-bit executable and run it on the device, it seems to run fine.
fun: Exiting now
main: Wait over
But when I build it into 64-bit executable, I get the following output:
fun: Exiting now
Aborted (core dumped)
As far as I know, ARMv8 is completely backward-compatible. But here, the 32-bit binary runs while the 64-bit binary aborts.
Apart from that, while trying out things, I found when I change the fun() to:
void *fun(void *arg)
{
    sleep(2);
    printf("%s: Exiting now\n", __func__);
    //pthread_exit(0);
    return 0;
}
Both 32-bit and 64-bit binaries run fine.
I looked into this question:
Pthread: Why people bother using pthread_exit? but it didn't tell me much about where the problem might be.
Here is some more information about my device:
Linux Version: Linux version 4.4.67 (lnxbuild#ecbld-bd213-lnx) (gcc version 4.9.3 (GCC) ) #1 SMP PREEMPT
Total Memory : 3721056
So here are my questions:
Can there be any reason why the 32-bit binary runs while the 64-bit crashes on an ARMv8 device?
Why does the error go away when I replace pthread_exit(0) with return 0?
Is there anything else that I might be missing here?
Minor Update:
I have tried changing the wait times and making various combinations of pthread_join(), pthread_detach() and pthread_exit() at all possible places, but to no avail. This is the simplest version I could provide.

How to intercept a static library call in C language?

Here's my question:
There is a static library (xxx.lib) and some C files that call the function foo() in xxx.lib. I want to get a notification message every time foo() is called, but I'm not allowed to change any source code written by others.
I've spent several days searching the Internet and found several similar Q&As, but none of the suggestions really solves my problem. Here are some of them:
1. Use the GNU linker's --wrap option (gcc -Wl,--wrap=foo): Override a function call in C
Unfortunately, I'm using the Microsoft C compiler and linker, and I can't find an option equivalent to --wrap.
2. Microsoft Detours:
Detours intercepts C calls at runtime and redirects the call to a trampoline function. But Detours is only free for the IA32 version, and it's not open source.
3. I'm thinking about injecting a jmp instruction at the start of foo() to redirect it to my own function. However, that's not feasible when foo() is empty, like
void foo() ---> will be compiled into 0xC3 (ret)
{ but it'll need at least 8 bytes to inject a jmp
}
4. I found a technology named Hotpatch on MSDN. It says the linker will add several bytes of padding at the beginning of each function. That's great, because I can replace the padding bytes with a jmp instruction to do the interception at runtime! But when I use the /FUNCTIONPADMIN option with the linker, it gives me a warning:
LINK : warning LNK4044: unrecognized option '/FUNCTIONPADMIN'; ignored
Can anybody tell me how to make a "hotpatchable" image correctly? Is it a workable solution for my problem?
Do I still have any hope of getting this to work?
If you have the source, you can instrument the code with GCC without changing the source by adding -finstrument-functions for the build of the files containing the functions you are interested in. You'll then have to write __cyg_profile_func_enter/exit functions to print your tracing. An example from here:
#include <stdio.h>
#include <time.h>
static FILE *fp_trace;
void
__attribute__ ((constructor))
trace_begin (void)
{
    fp_trace = fopen("trace.out", "w");
}
void
__attribute__ ((destructor))
trace_end (void)
{
    if(fp_trace != NULL) {
        fclose(fp_trace);
    }
}
void
__cyg_profile_func_enter (void *func, void *caller)
{
    if(fp_trace != NULL) {
        /* time_t is not guaranteed to be unsigned long, so cast for %lu */
        fprintf(fp_trace, "e %p %p %lu\n", func, caller, (unsigned long)time(NULL));
    }
}
void
__cyg_profile_func_exit (void *func, void *caller)
{
    if(fp_trace != NULL) {
        fprintf(fp_trace, "x %p %p %lu\n", func, caller, (unsigned long)time(NULL));
    }
}
Another way to go, if you have the source, is to recompile the library as a shared library. From there it is possible to do runtime insertion of your own .so/.dll using any number of debugging systems (ltrace on Unix, something or other on Windows).
If you don't have source, then I would think your option 3 should still work. Folks writing viruses have been doing it for years. You may have to do some manual inspection (because x86 instructions aren't all the same length), but the trick is to pull out a full instruction and replace it with a jump to somewhere safe. Do what you have to do, get the registers back into the same state as if the instruction you removed had run, then jump to just after the jump instruction you inserted.
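The jump that such a patch writes is typically the 5-byte near jump E9 rel32, whose operand is relative to the address of the instruction that follows the jump. Below is a minimal sketch of computing those bytes for a 32-bit address space. The helper encode_jmp_rel32 is a hypothetical name, and it only builds the bytes: making the code page writable and relocating the displaced instruction are left out.

```c
#include <stdint.h>

#define JMP_REL32_LEN 5u   /* opcode 0xE9 plus a 4-byte displacement */

/* Fill `out` with the bytes of `jmp dst` as they would be placed at
 * address `src`. Addresses are modelled as 32-bit values, so the
 * subtraction wraps the same way the CPU's address arithmetic does. */
void encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint32_t src, uint32_t dst)
{
    /* The displacement counts from the end of the jump instruction. */
    uint32_t rel = dst - (src + JMP_REL32_LEN);

    out[0] = 0xE9;
    out[1] = (uint8_t)(rel & 0xFFu);          /* little-endian rel32 */
    out[2] = (uint8_t)((rel >> 8) & 0xFFu);
    out[3] = (uint8_t)((rel >> 16) & 0xFFu);
    out[4] = (uint8_t)((rel >> 24) & 0xFFu);
}
```

For example, a jump placed at 0x1000 that targets 0x2000 has the displacement 0x2000 - 0x1005 = 0x0FFB, so the bytes are E9 FB 0F 00 00.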
The VC compiler provides 2 options /Gh & /GH for hooking functions.
The /Gh flag causes a call to the _penter function at the start of every method or function, and the /GH flag causes a call to the _pexit function at the end of every method or function.
So, if I write some code in _penter to find out the address of the caller function, then I'll be able to intercept any function selectively by comparing the function address.
I made a sample:
#include <stdio.h>
#include <intrin.h>   // declares _ReturnAddress
#pragma intrinsic(_ReturnAddress)
void foo()
{
}
void bar()
{
}
int main(void) {
    bar();
    foo();
    printf ("I'm main()!");
    return 0;
}
void __declspec(naked) _cdecl _penter( void )
{
    __asm {
        push ebp; // standard prolog
        mov ebp, esp;
        sub esp, __LOCAL_SIZE
        pushad; // save registers
    }
    unsigned int addr;
    // _ReturnAddress always returns the address directly after the call, but that is not the start of the function!
    // subtract 5 bytes as instruction for call _penter
    // is 5 bytes long on 32-bit machines, e.g. E8 <00 00 00 00>
    addr = (unsigned int)_ReturnAddress() - 5;
    if (addr == (unsigned int)foo) printf ("foo() is called.\n");
    if (addr == (unsigned int)bar) printf ("bar() is called.\n");
    __asm {
        popad; // restore regs
        mov esp, ebp; // standard epilog
        pop ebp;
        ret;
    }
}
Build it with cl.exe source.c /Gh and run it:
bar() is called.
foo() is called.
I'm main()!
It's perfect!
More examples about how to use _penter and _pexit can be found here A Simple Profiler and tracing with penter pexit and A Simple C++ Profiler on x64.
I've solved my problem using this method, and I hope it can help you also.
:)
I don't think there is any way to do this without changing any code.
The easiest way I can think of is to write a wrapper for your void foo() function and use Find/Replace to substitute the wrapper for it.
void myFoo(void){
    /* put the notification here, then forward to the real function */
    foo();
}
Instead of calling foo(), call myFoo().
Hope this helps.

System call interception in linux-kernel module (kernel 3.5)

I need to replace a standard system call (e.g. SYS_mkdir) with my own implementation.
As I read in some sources, including this question on Stack Overflow, sys_call_table has not been an exported symbol since kernel version 2.6.
I tried the following code:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/unistd.h>
#include <asm/syscall.h>
int (*orig_mkdir)(const char *path);
....
int init_module(void)
{
    orig_mkdir=sys_call_table[__NR_mkdir];
    sys_call_table[__NR_mkdir]=own_mkdir;
    printk("sys_mkdir replaced\n");
    return(0);
}
....
Unfortunately, I get a compiler error:
error: assignment of read-only location ‘sys_call_table[83]’
How can I replace the system call?
EDIT: Is there any solution without kernel patching?
This works for me.
See
Linux Kernel: System call hooking example
and
https://bbs.archlinux.org/viewtopic.php?id=139406
/* Located at load time by aquire_sys_call_table() */
static unsigned long **sys_call_table;
asmlinkage long (*ref_sys_open)(const char __user *filename, int flags, umode_t mode);
asmlinkage long new_sys_open(const char __user *filename, int flags, umode_t mode)
{
    return ref_sys_open(filename, flags, mode);
}
static unsigned long **aquire_sys_call_table(void)
{
    unsigned long int offset = PAGE_OFFSET;
    unsigned long **sct;
    /* Scan kernel memory for a table whose __NR_close slot is sys_close */
    while (offset < ULONG_MAX) {
        sct = (unsigned long **)offset;
        if (sct[__NR_close] == (unsigned long *) sys_close)
            return sct;
        offset += sizeof(void *);
    }
    printk("Getting syscall table failed. :(");
    return NULL;
}
// Crazy copypasted asm stuff. Could use linux function as well...
// but this works and will work in the future they say.
static void disable_page_protection(void)
{
    unsigned long value;
    asm volatile("mov %%cr0, %0" : "=r" (value));
    if(!(value & 0x00010000))
        return;
    asm volatile("mov %0, %%cr0" : : "r" (value & ~0x00010000));
}
static void enable_page_protection(void)
{
    unsigned long value;
    asm volatile("mov %%cr0, %0" : "=r" (value));
    if((value & 0x00010000))
        return;
    asm volatile("mov %0, %%cr0" : : "r" (value | 0x00010000));
}
static int __init rootkit_start(void)
{
    //Hide me
    printk("loaded");
    if(!(sys_call_table = aquire_sys_call_table()))
        return -1;
    disable_page_protection();
    {
        ref_sys_open = (void *)sys_call_table[__NR_open];
        sys_call_table[__NR_open] = (unsigned long *)new_sys_open;
    }
    enable_page_protection();
    return 0;
}
static void __exit rootkit_end(void)
{
    printk("exiting");
    if(!sys_call_table) {
        return;
    }
    disable_page_protection();
    {
        sys_call_table[__NR_open] = (unsigned long *)ref_sys_open;
    }
    enable_page_protection();
}
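The two helpers above toggle bit 16 of CR0 (mask 0x00010000), the write-protect (WP) flag: while it is set, even kernel-mode code cannot write to read-only pages. The bit arithmetic can be checked in user space without touching the real register. This is a small sketch, and the names cr0_clear_wp, cr0_set_wp and cr0_wp_is_set are hypothetical.

```c
#define CR0_WP 0x00010000UL   /* bit 16 of CR0: write protect */

/* Value to load into CR0 so that read-only pages become writable. */
unsigned long cr0_clear_wp(unsigned long cr0)
{
    return cr0 & ~CR0_WP;
}

/* Value to load into CR0 to restore write protection. */
unsigned long cr0_set_wp(unsigned long cr0)
{
    return cr0 | CR0_WP;
}

/* 1 if write protection is enabled in the given CR0 value, else 0. */
int cr0_wp_is_set(unsigned long cr0)
{
    return (cr0 & CR0_WP) != 0;
}
```

With a CR0 value of 0x80050033, clearing WP gives 0x80040033, and setting it again restores the original value. All other bits are left untouched.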
Yes there is a solution without patching/rebuilding the kernel. Use the Kprobes infrastructure (or SystemTap).
This will allow you to place "probes" (functions) at any point(s) within the kernel, using a kernel module.
Doing the same thing by modifying sys_call_table is now prevented (the table is read-only) and is considered a dirty hack. Kprobes, Jprobes, etc. are a "clean" way to do it. Also, the documentation and samples provided in the kernel source tree are excellent (look under Documentation/kprobes.txt).
The problem is that sys_call_table is read-only. To avoid the error, you have to make it writable before manipulating it. The kernel provides a function for this, set_memory_rw().
Just add the snippet below before manipulating sys_call_table:
set_memory_rw((unsigned long)sys_call_table, 1);
In the exit function of the kernel module, do not forget to make sys_call_table read-only again:
set_memory_ro((unsigned long)sys_call_table, 1);
First, you need to determine the location of sys_call_table. See here.
Before writing into the just located system table, you have to make its memory pages writable. For that check here and if that doesn't work, try this.
Use the LSM infrastructure.
Look at the LSM hooks path_mkdir or inode_mkdir for details. One question that needs to be solved is how to register your own LSM module when the system doesn't explicitly allow it. See the answer here for details:
How can I implement my own hook function with LSM?

Resources