I'm working on an OS class project with a variant of the HOCA system.
I'm trying to create the interrupt handler part of the OS where I/O device interrupts are detected and handled.
(If you have never heard of HOCA, that's fine.) My question is really about the internals of C.
The whole system works like this:
The main function of the OS calls init(), which initializes all the parts.
After the OS is initialized, the root process is created and schedule() dispatches the first application. The application processes are then created and schedule()'ed in a tree structure rooted at the root process.
void schedule(){
proc_t *front;
front = headQueue(RQ); //return the first available process in the Ready Queue
if (checkPointer(front)) {
intschedule(); // load a timeslice to the OS
LDST(&(front->p_s)); // load the state to the OS
// so that OS can process the application specified by p_s
// LDST() is system function to load a state to processor
}
else {
intdeadlock(); // unlock a process from the blocked list and put in RQ
}
}
Using gdb, I see that everything is OK until just before if (checkPointer(front)) is evaluated.
int checkPointer(void *p){
return ((p != (void *) ENULL)&&(p != (void *)NULL));
}
gdb responds:
trap: nonexistant memory address: -1 memory size: 131072 ERROR:
address greater than MEMORYSIZE
What's going wrong here?
checkPointer() is located in another file.
Your help is much appreciated.
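For what it's worth, checkPointer() only compares the pointer value and never dereferences it, which can be confirmed in isolation. A minimal sketch, assuming ENULL is -1 (the address in the trap message suggests this, but the HOCA headers are not shown here); checkPointer_selftest is a made-up helper for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ENULL is -1; the real value comes from the HOCA headers. */
#define ENULL (-1)

/* Returns 1 if p is neither NULL nor the ENULL sentinel, 0 otherwise.
 * Only the pointer value is compared; p is never dereferenced. */
static int checkPointer(const void *p)
{
    return p != (const void *)(intptr_t)ENULL && p != NULL;
}

/* Hypothetical self-check that could be run once at start-up. */
static void checkPointer_selftest(void)
{
    int x = 0;

    assert(checkPointer(&x));                           /* ordinary pointer */
    assert(!checkPointer(NULL));                        /* NULL rejected    */
    assert(!checkPointer((void *)(intptr_t)ENULL));     /* ENULL rejected   */
}
```

If the same holds on the target, the access to address -1 would have to come from somewhere other than the comparison itself, for example from a dereference of a pointer that holds ENULL.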
I have a kernel task that creates a kernel thread, and I need to copy data from my kernel thread to the user process that called my kernel task. So I can pass the current task as a parameter to my kernel thread.
But how can I tell the copy_from_user function to copy from another process's address space?
This is my kernel task:
asmlinkage int sys_daniel(struct pt_regs *r )
{
struct task_struct *ts1;
ts1 = kthread_run(kthread_func, current, "thread-1");
return 0;
}
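and this is the kernel thread I am trying to write:
static int kthread_func(void *args)
{
struct task_struct *caller = args; /* the task passed to kthread_run() */
/* hypothetical helper: this is the function I am asking for, it does not exist */
special_copy_to_user(from, to, len, caller->mm);
return 0;
}
Is there any way to edit the kernel thread's current->mm, or to tell copy_from_user which address space to use?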
OK, so you first need to get a page object for the address you want, so I used
struct page *p;
get_user_pages(current, current->mm, (unsigned long)buff, 1, 1, &p, NULL);
This basically gives us a page that we can now map into the kernel,
so we can use
kernlBuff = (char *)kmap(p); // map it into the kernel
kunmap(p); // unmap it from the kernel
I'm trying to understand more about process 0, such as whether it has a memory descriptor (a non-NULL task_struct->mm field) or not, and how it is related to the swap or idle process. It seems to me that a single 'process 0' is created on the boot CPU, and then an idle thread is created for every other CPU by idle_threads_init, but I didn't find where the first one (which I assume is process 0) is created.
Update
In light of the live book that tychen referenced, here is my most up-to-date understanding regarding process 0 (for x86_64), can someone confirm/refute the items below?
An init_task typed task_struct is statically defined, with the task's kernel stack init_task.stack = init_stack, memory descriptor init_task.mm=NULL and init_task.active_mm=&init_mm, where the stack area init_stack and mm_struct init_mm are both statically defined.
The fact that only active_mm is non-NULL means process 0 is a kernel process. Also, init_task.flags=PF_KTHREAD.
Not long after the uncompressed kernel image begins execution, the boot CPU starts to use init_stack as its kernel stack. This makes the current macro meaningful (for the first time since the machine booted), which makes fork() possible. After this point, the kernel literally runs in process 0's context.
start_kernel -> arch_call_rest_init -> rest_init, and inside this function, process 1&2 are forked. Within the kernel_init function which is scheduled for process 1, a new thread (with CLONE_VM) is made and hooked to a CPU's run queue's rq->idle, for every other logical CPU.
Interestingly, all idle threads share the same tid 0 (not only tgid). Usually threads share tgid but have distinct tid, which is really Linux's process id. I guess it doesn't break anything because idle threads are locked to their own CPUs.
kernel_init loads the init executable (typically /sbin/init), and switches both current->mm and active_mm to a non-NULL mm_struct, and clears the PF_KTHREAD flag, which makes process 1 a legitimate user space process. While process 2 does not tweak mm, meaning it remains a kernel process, same as process 0.
At the end of rest_init, do_idle takes over, which means every CPU has an idle process.
Something confused me before, but now becomes clear: the init_* objects/labels such as init_task/init_mm/init_stack are all used by process 0, and not the init process, which is process 1.
The Linux kernel proper starts at start_kernel, and process 0/idle starts here too.
At the beginning of start_kernel, we call set_task_stack_end_magic(&init_task). This function marks the stack boundary of init_task, which is process 0/idle.
void set_task_stack_end_magic(struct task_struct *tsk)
{
unsigned long *stackend;
stackend = end_of_stack(tsk);
*stackend = STACK_END_MAGIC; /* for overflow detection */
}
It's easy to see that this function gets the address of the stack limit and writes STACK_END_MAGIC there as a stack overflow marker. Here is the structure graph.
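The overflow check that this marker enables can be modelled in user space. This is a minimal sketch rather than kernel code: the demo_* names and the 64-word stack are made up for illustration, and only the STACK_END_MAGIC value is taken from Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_END_MAGIC  0x57AC6E9DUL /* value defined in Linux's magic.h */
#define DEMO_STACK_WORDS 64           /* hypothetical, tiny stack for the demo */

/* Stand-in for a kernel stack that grows downward: the "end" of the
 * stack is its lowest address, i.e. element 0. */
static unsigned long demo_stack[DEMO_STACK_WORDS];

/* Returns the address of the last usable word of the stack. */
static unsigned long *demo_end_of_stack(unsigned long *stack)
{
    assert(stack != NULL);
    return stack;
}

/* Write the marker at the stack limit. */
static void demo_set_stack_end_magic(unsigned long *stack)
{
    *demo_end_of_stack(stack) = STACK_END_MAGIC;
}

/* True if something has overwritten the marker, i.e. the stack overflowed. */
static bool demo_stack_end_corrupted(unsigned long *stack)
{
    return *demo_end_of_stack(stack) != STACK_END_MAGIC;
}
```

A caller would set the marker once and later test demo_stack_end_corrupted(demo_stack); any write that runs past the bottom of the stack clobbers the marker and makes the test return true.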
Process 0 is statically defined. It is the only process that is created by neither kernel_thread nor fork.
/*
* Set up the first task table, touch at your own risk!. Base=0,
* limit=0x1fffff (=2MB)
*/
struct task_struct init_task
#ifdef CONFIG_ARCH_TASK_STRUCT_ON_STACK
__init_task_data
#endif
= {
#ifdef CONFIG_THREAD_INFO_IN_TASK
.thread_info = INIT_THREAD_INFO(init_task),
.stack_refcount = REFCOUNT_INIT(1),
#endif
.state = 0,
.stack = init_stack,
.usage = REFCOUNT_INIT(2),
.flags = PF_KTHREAD,
.prio = MAX_PRIO - 20,
.static_prio = MAX_PRIO - 20,
.normal_prio = MAX_PRIO - 20,
.policy = SCHED_NORMAL,
.cpus_ptr = &init_task.cpus_mask,
.cpus_mask = CPU_MASK_ALL,
.nr_cpus_allowed= NR_CPUS,
.mm = NULL,
.active_mm = &init_mm,
......
.thread_pid = &init_struct_pid,
.thread_group = LIST_HEAD_INIT(init_task.thread_group),
.thread_node = LIST_HEAD_INIT(init_signals.thread_head),
......
};
EXPORT_SYMBOL(init_task);
Here are some important things that need to be made clear.
INIT_THREAD_INFO(init_task) sets up thread_info as shown in the graph above.
init_stack is declared as follows:
extern unsigned long init_stack[THREAD_SIZE / sizeof(unsigned long)];
where THREAD_SIZE is defined as
#ifdef CONFIG_KASAN
#define KASAN_STACK_ORDER 1
#else
#define KASAN_STACK_ORDER 0
#endif
#define THREAD_SIZE_ORDER (2 + KASAN_STACK_ORDER)
#define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER)
so the default size is PAGE_SIZE << 2 (four pages), or PAGE_SIZE << 3 when KASAN is enabled.
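To put numbers on the macros above, here is a small user-space sketch of the same arithmetic. It assumes 4 KiB pages, which is not stated in the snippet, and the demo_* names are made up for illustration.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on a typical x86_64 configuration. */
#define DEMO_PAGE_SIZE 4096UL

/* Mirrors THREAD_SIZE = PAGE_SIZE << (2 + KASAN_STACK_ORDER). */
static unsigned long demo_thread_size(int kasan_enabled)
{
    unsigned int kasan_stack_order = kasan_enabled ? 1 : 0;
    unsigned int thread_size_order = 2 + kasan_stack_order;
    unsigned long size = DEMO_PAGE_SIZE << thread_size_order;

    assert(size % DEMO_PAGE_SIZE == 0); /* always a whole number of pages */
    return size;
}

/* Number of unsigned long slots in init_stack for a given THREAD_SIZE. */
static unsigned long demo_init_stack_words(unsigned long thread_size)
{
    return thread_size / sizeof(unsigned long);
}
```

Under that assumption the stack is 16 KiB without KASAN and 32 KiB with it.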
Process 0 only runs in kernel space, but in some circumstances, as I mentioned above, it needs a virtual memory space, so we set the following:
.mm = NULL,
.active_mm = &init_mm,
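The mm/active_mm split can be modelled with a toy structure. This is a hedged user-space sketch, not the kernel's task_struct: every demo_* name is made up, and the flag value is meant to mirror Linux's PF_KTHREAD bit. It also models the transition the question describes for process 1, which gains its own mm and loses the kernel-thread flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PF_KTHREAD 0x00200000u /* mirrors Linux's PF_KTHREAD bit */

struct demo_mm { int unused; };     /* placeholder for mm_struct */

struct demo_task {
    unsigned int flags;
    struct demo_mm *mm;             /* address space owned by the task */
    struct demo_mm *active_mm;      /* address space currently in use  */
};

static struct demo_mm demo_init_mm;

/* Model of process 0: no mm of its own, borrows init_mm. */
static struct demo_task demo_init_task = {
    .flags     = DEMO_PF_KTHREAD,
    .mm        = NULL,
    .active_mm = &demo_init_mm,
};

static bool demo_is_kernel_thread(const struct demo_task *t)
{
    return (t->flags & DEMO_PF_KTHREAD) != 0 && t->mm == NULL;
}

/* Model of a task turning into a user process: it gets an mm of its
 * own and drops the kernel-thread flag. */
static void demo_become_user_process(struct demo_task *t, struct demo_mm *mm)
{
    assert(mm != NULL);
    t->mm = mm;
    t->active_mm = mm;
    t->flags &= ~DEMO_PF_KTHREAD;
}
```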
Let's look back at start_kernel: rest_init starts kernel_init and kthreadd.
noinline void __ref rest_init(void)
{
......
pid = kernel_thread(kernel_init, NULL, CLONE_FS);
......
pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
......
}
kernel_init will run execve and then move to user space, turning into the init process, which is process 1, by trying the following in turn:
if (!try_to_run_init_process("/sbin/init") ||
!try_to_run_init_process("/etc/init") ||
!try_to_run_init_process("/bin/init") ||
!try_to_run_init_process("/bin/sh"))
return 0;
kthreadd becomes the daemon process that manages and schedules other kernel task_structs; it is process 2.
After all this, process 0 becomes the idle process and leaves the rq, which means it will only run when the rq is empty.
noinline void __ref rest_init(void)
{
......
/*
* The boot idle thread must execute schedule()
* at least once to get things moving:
*/
schedule_preempt_disabled();
/* Call into cpu_idle with preempt disabled */
cpu_startup_entry(CPUHP_ONLINE);
}
void cpu_startup_entry(enum cpuhp_state state)
{
arch_cpu_idle_prepare();
cpuhp_online_idle(state);
while (1)
do_idle();
}
Finally, here is a good gitbook for you if you want to get a better understanding of the Linux kernel.
I'm working with a Pixel 2 XL phone running Android 8.1.
I have hooked sys_call_table from a kernel module and replaced some syscalls with my own functions.
I want to make an application unable to quit.
To do that, I'm trying to neutralize the application's sys_exit_group and sys_kill calls.
What should I do in my own functions?
I want to debug the application, but it has anti-debugging enabled, so I want to hook these system calls.
I have tried returning directly, but it didn't work: the system calls sys_kill again, and this time I can't get the application's uid from its pid.
asmlinkage long my_sys_kill(pid_t pid, int sig)
{
char buff[MAX_PATH] = {0};
kuid_t uid = current->cred->uid;
int target_uid = get_uid_from_pid(pid);
if (target_uid == targetuid)
{
printk(KERN_DEBUG "#Tsingxing: kill hooked uid is %d pid is %d, tragetuid is %d, packagename: %s\n",uid.val,pid, target_uid, buff);
return 0;
}
printk(KERN_DEBUG "#Tsingxing:kill called uid is %d,pid is %d, traget_uid is %d\n",uid.val,pid,target_uid);
return origin_sys_kill(pid, sig);
}
asmlinkage long my_sys_exit_group(int error_code)
{
char buff[MAX_PATH] = {0};
kuid_t uid = current->cred->uid;
long tgid = current -> tgid;
long pid = current->pid;
int target_uid = get_uid_from_pid(pid);
if (uid.val == targetuid || target_uid == targetuid)
{
printk(KERN_DEBUG "#Tsingxing:exit group hooked, pid is %ld\n",pid);
return 0;
}
return origin_sys_exit_group(error_code);
}
I have solved this problem. I had mixed up sys_call_table and compat_sys_call_table: the target application uses compat_sys_call_table, but I was using the __NR_xxx numbers. Using the __NR_compat_xxx numbers solved the problem. Just return directly in compat_sys_call_exit_group.
At a very high level, this can't work. When an application calls _Exit (possibly/likely at the end of exit), it has no path to any further code to be run. These functions are normally even marked _Noreturn, meaning that the compiler does not leave the registers/calling stack frame in a meaningful state where resumption of execution could occur. Even if it did, the program itself at the source level is not prepared to continue execution.
If the function somehow returned, the next step would be runaway wrong code execution, likely leading to arbitrary code execution under the control of an attacker if the application had been processing untrusted input of any kind.
In practice, the libc side implementation of the exit and _Exit functions likely hardens against kernel bugs (yes, what you're asking for is a bug) whereby SYS_exit_group fails to exit. I haven't verified other implementations lately but I know mine in musl do this, because it's cheap and the alternative is very dangerous.
There is only one kthread, and I want to control which CPU it runs on.
The main process creates and wakes up the kthread with kthread_create() and wake_up_process().
When the kthread is created, the main process stores its pid in a global variable; call it "thread_pid".
I wrote a function to change the kthread's CPU.
Its signature is "int change_cpu(int cpu_to_change)".
It calls sched_setaffinity(), passing "thread_pid" as the pid,
i.e. "sched_setaffinity(thread_pid, cpu_mask_to_change);".
It also stores the value of "cpu_to_change" in a global variable; call it "thread_cpu".
The kthread contains an assertion such as "ASSERT(smp_processor_id() == thread_cpu)".
The kthread is usually not running; it is waiting for a completion.
I expect that after change_cpu() is called, the kthread runs without the assertion failing.
But the assertion fails, even though sched_setaffinity() returns successfully.
Why doesn't it work as expected?
I want to know why this approach doesn't work.
Here is dummy code for better understanding.
int thread_cpu;
int thread_pid;
int dummy_kthread(void *args)
{
while(1) {
wait_for_completion();
ASSERT( smp_processor_id() == thread_cpu );
/* do something */
complete();
}
}
int change_cpu(int cpu_to_change)
{
struct cpumask * cpu_mask;
thread_cpu = cpu_to_change;
cpu_mask = set_cpumask(cpu_to_change); // hypothetical helper; this function does not actually exist
return sched_setaffinity(thread_pid, cpu_mask);
}
int main(){
struct task_struct *dummy;
dummy = kthread_create(dummy_kthread, NULL, "dummy_kthread");
thread_pid = get_pid(dummy); // hypothetical helper; this function does not actually exist
}
One possible cause for sched_setaffinity() to appear not to be working correctly is linked to the dynamic power management of cores. When the core powers off, all of the threads running on that core will migrate away from that core. As a result, the cpumask will be updated accordingly.
In order to prevent cores from powering off, you need to choose "no" for HOTPLUG_CPU when configuring your kernel, or you can manually set the default value to 'n' in the Kconfig file (arch/[architecture]/Kconfig) before compiling your kernel.
I've written a program that uses SIGALRM and a signal handler.
I'm now trying to add this as a test module within the kernel.
I found that I had to replace many of the functions that libc provides with their underlying syscalls, for example timer_create with sys_timer_create, timer_settime with sys_timer_settime, and so on.
However, I'm having issues with sigaction.
Compiling the kernel throws the following error
arch/arm/mach-vexpress/cpufreq_test.c:157:2: error: implicit declaration of function 'sys_sigaction' [-Werror=implicit-function-declaration]
I've attached the relevant code block below
int estimate_from_cycles() {
timer_t timer;
struct itimerspec old;
struct sigaction sig_action;
struct sigevent sig_event;
sigset_t sig_mask;
memset(&sig_action, 0, sizeof(struct sigaction));
sig_action.sa_handler = alarm_handler;
sigemptyset(&sig_action.sa_mask);
VERBOSE("Blocking signal %d\n", SIGALRM);
sigemptyset(&sig_mask);
sigaddset(&sig_mask, SIGALRM);
if(sys_sigaction(SIGALRM, &sig_action, NULL)) {
ERROR("Could not assign sigaction\n");
return -1;
}
if (sigprocmask(SIG_SETMASK, &sig_mask, NULL) == -1) {
ERROR("sigprocmask failed\n");
return -1;
}
memset (&sig_event, 0, sizeof (struct sigevent));
sig_event.sigev_notify = SIGEV_SIGNAL;
sig_event.sigev_signo = SIGALRM;
sig_event.sigev_value.sival_ptr = &timer;
if (sys_timer_create(CLOCK_PROCESS_CPUTIME_ID, &sig_event, &timer)) {
ERROR("Could not create timer\n");
return -1;
}
if (sigprocmask(SIG_UNBLOCK, &sig_mask, NULL) == -1) {
ERROR("sigprocmask unblock failed\n");
return -1;
}
cycles = 0;
VERBOSE("Entering main loop\n");
if(sys_timer_settime(timer, 0, &time_period, &old)) {
ERROR("Could not set timer\n");
return -1;
}
while(1) {
ADD(CYCLES_REGISTER, 1);
}
return 0;
}
Is such an approach of taking user-space code and changing the calls alone sufficient to run the code in kernel-space?
Is such an approach of taking user-space code and changing the calls
alone sufficient to run the code in kernel-space?
Of course not! What you are doing is calling the implementation of a system call directly from kernel space, but there is no guarantee that the sys_* function has the same definition as the system call. The correct approach is to find the kernel routine that does what you need. Unless you are writing a driver or a kernel feature, you don't need to write kernel code. System calls should only be invoked from user space. Their main purpose is to offer a safe way to access the low-level mechanisms an operating system provides, such as the file system, sockets and so on.
Regarding signals: trying to use the signal system calls from kernel space in order to receive a signal is a TERRIBLE idea. A process sends a signal to another process, and signals are meant to be used in user space, that is, between user-space processes. Typically, what happens when you send a signal to another process is that, if the signal is not masked, the receiving process is stopped and the signal handler is executed. Note that achieving this requires two switches between user space and kernel space.
However, the kernel has its internal tasks, which have exactly the same structure as a user-space task with some differences (e.g. memory mapping, parent process, etc.). Of course, you cannot send a signal from a user process to a kernel thread (imagine what would happen if you sent SIGKILL to a crucial component). Since kernel threads have the same structure as user-space threads, they can receive signals, but their default behaviour is to drop them unless specified otherwise.
I'd recommend changing your code to send a signal from kernel space to user space rather than trying to receive one. (How would you send a signal to kernel space? Which pid would you specify?) This may be a good starting point: http://people.ee.ethz.ch/~arkeller/linux/kernel_user_space_howto.html#toc6
You are having a problem with sys_sigaction because this is the old definition of the system call. The current definition is sys_rt_sigaction.
From the kernel source 3.12 :
#ifdef CONFIG_OLD_SIGACTION
asmlinkage long sys_sigaction(int, const struct old_sigaction __user *,
struct old_sigaction __user *);
#endif
#ifndef CONFIG_ODD_RT_SIGACTION
asmlinkage long sys_rt_sigaction(int,
const struct sigaction __user *,
struct sigaction __user *,
size_t);
#endif
BTW, you should not call any of them, they are meant to be called from user space.
You're working in kernel space so you should start thinking like you're working in kernel space instead of trying to port a userspace hack into the kernel. If you need to call the sys_* family of functions in kernel space, 99.95% of the time, you're already doing something very, very wrong.
Instead of while (1), have the loop break on a volatile variable, and start a thread that simply sleeps and changes the value of the variable when it finishes.
I.e.
void some_function(volatile int *condition) {
sleep(x);
*condition = 0;
}
volatile int condition = 1;
start_thread(some_function, &condition);
while(condition) {
ADD(CYCLES_REGISTER, 1);
}
However, what you're doing (I'm assuming you're trying to get the number of cycles the CPU is operating at) is inherently impossible on a preemptive kernel like Linux without a lot of hacking. If you keep interrupts on, your cycle count will be inaccurate since your kernel thread may be switched out at any time. If you turn interrupts off, other threads won't run and your code will just infinite loop and hang the kernel.
Are you sure you can't simply use the BogoMIPS value from the kernel? It is essentially what you're trying to measure, but the kernel does it very early in the boot process and does it right.