I am trying to implement my own new schedule(). I want to debug my code.
Can I use printk function in sched.c?
I used printk but it doesn't work. What did I miss?
Do you know how often schedule() is called? It's probably called faster than your computer can flush the print buffer to the log. I would suggest using another method of debugging. For instance running your kernel in QEMU and using remote GDB by loading the kernel.syms file as a symbol table and setting a breakpoint. Other virtualization software offers similar features. Or do it the manual way and walk through your code. Using printk in interrupt handlers is typically a bad idea (unless you're about to panic or stall).
If the error you are seeing doesn't happen often think of using BUG() or BUG_ON(cond) instead. These do conditional error messages and shouldn't happen as often as a non-conditional printk
Editing the schedule() function itself is typically a bad idea (unless you want to support multiple run queue's etc...). It's much better and easier to instead modify a scheduler class. Look at the code of the CFS scheduler to do this. If you want to accomplish something else I can give better advice.
It's not safe to call printk while holding the runqueue lock. A special function printk_sched was introduced in order to have a mechanism to use printk when holding the runqueue lock (https://lkml.org/lkml/2012/3/13/13). Unfortunatly it can just print one message within a tick (and there cannot be more than one tick when holding the run queue lock because interrupts are disabled). This is because an internal buffer is used to save the message.
You can either use lttng2 for logging to user space or patch the implementation of printk_sched to use a statically allocated pool of buffers that can be used within a tick.
Try trace_printk().
printk() has too much of an overhead and schedule() gets called again before previous printk() calls finish. This creates a live lock.
Here is a good article about it: https://lwn.net/Articles/365835/
It depends, basically it should be work fine.
try to use dmesg in shell to trace your printk if it is not there you apparently didn't invoked it.
2396 if (p->mm && printk_ratelimit()) {
2397 printk(KERN_INFO "process %d (%s) no longer affine to cpu%d\n",
2398 task_pid_nr(p), p->comm, cpu);
2399 }
2400
2401 return dest_cpu;
2402 }
there is a sections in sched.c that printk doesn't work e.g.
1660 static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
1661 {
1662 if (unlikely(!irqs_disabled())) {
1663 /* printk() doesn't work good under rq->lock */
1664 raw_spin_unlock(&this_rq->lock);
1665 BUG_ON(1);
1666 }
1667
1668 return _double_lock_balance(this_rq, busiest);
1669 }
EDIT
you may try to printk once in 1000 times instead of each time.
Related
While referring to memory module of Linux kernel some functions are not clear to me. One of the functions is shown below:
static inline int __kprobes notify_page_fault(struct pt_regs *regs)
{
int ret = 0;
/* kprobe_running() needs smp_processor_id() */
if (kprobes_built_in() && !user_mode_vm(regs)) {
preempt_disable();
if (kprobe_running() && kprobe_fault_handler(regs, 14))
ret = 1;
preempt_enable();
}
return ret;
}
I am confused with the "__kprobes" between return type and function name. When I looked at the initialization of "__kprobes" in compiler.h, I found below:
/*Ignore/forbid kprobes attach on very low level functions marked by
this attribute:*/
#ifdef CONFIG_KPROBES
# define __kprobes __attribute__((__section__(".kprobes.text")))
#else
# define __kprobes
#endif
Well, I know that at compile time __kprobe is going to be replaced with its defined part.
Questions:
1.) What is the significance of __attribute__((__section__(".kprobes.text")))?
and
2.) What does it do at compile time and at run time when it is used before "function_name"?
I read about kprobe and found that it has to do something about breakpoints and back trace. What I understand about kprobe is it will help debugger in creating back traces and breakpoints. Could someone please explain me in simple words how does it really works and please correct me if I am wrong.
TL;DR
__attribute__((__section__(".kprobes.text"))) will place that function in separate section which is not findable by kprobes thus preventing infinite breakpoints.
You must use it before "function_name" to place whole "function_name" symbol in separate section.
Real answer
kprobes (kernel probes) is Linux kernel mechanism for dynamic tracing. It allows you to insert breakpoint at almost any kernel function, invoke your handler and then continue executing. It works by runtime patching kernel image with so-called kernel probe/kprobe - see struct kprobe. This probe will allow you to pass control to your handler, and that handler is usually do some tracing.
So, what's going under the hood:
You create your struct kprobe by defining address at which to break and handler to pass reference.
You register your probe with register_kprobe
Kernel kprobe subsystem finds address from your probe
Then kprobe:
inserts breakpoint CPU instruction (int 3 for x86) at given address
adds some wrapper code to save context(registers, etc.)
adds even more code to help you get access to function arguments or return values.
Now when kernel execution hits that probed address:
it will fall into CPU trap
it will save context
it will pass control to your handler via notifier_call_chain
...
after all it will restore context
That's how it works. As you can see it's a really neat and dirty hack, but some kernel function is so terribly low-level that it's just pointless to probe them. notify_page_fault is one of those functions - as a part of notifier_call_chain it's used in passing control to your handler.
So if you probe at notify_page_fault you'll get infinite loop of breakpoints, which is not what you want. What you really want is to protect that kind of functions and kprobes do this by placing it in separate section .kprobes.text. This will prevent to probe at that functions because kprobe will not lookup for address in that section. And that's a job for __attribute__((__section__(".kprobes.text"))).
I would like to be able to 'capture' an hrtimer interrupt with a linux kernel module and replay the interrupt at a later period in time. Any thoughts on how to go about doing this?
Use case: A program that calls sleep(1). My module will grab the hrtimer interrupt when it fires after 1 second, wait for 'x' amount of time, then re-fire the interrupt, waking the process.
Note: I do not want to hook the sleep system call.
Thanks!
Quite honestly, writing a Linux kernel module just to modify the behavior of sleep() for a single application sounds like overkill.
For most cases you should be able to use a preloadable shared object to intercept/override the sleep() function family with your own implementations. The application will call your implementation and your code may then call the real function with a modified parameter list.
This method is much simpler and less intrusive than anything involving kernel programming, although it will not work if your application is statically linked or if it uses direct system calls instead of library functions.
I want to write a LKM (Linux Kernel Module) that hijacks the realtime clock (interrupt 8). So I want the interrupt to be set to my function and at some point send it back to the old function.
I have tried to use the request_irq function without any success, probably because the kernel function that is there is not willing to share the interrupt (which is a good decision I guess).
I also tried to edit the IDT (Interrupt Descriptor Table), according to some pages I found. Non of them worked, most didn't even compile since they where written for kernel 2.6, and I'm working with 3.10.
This is a simplified code that I have just to give you the idea of what I'm doing.
kpage =__get_free_page( GFP_KERNEL);
asm("sidt %0": : "m"(*idtr) : );
memcpy(kpage, idtr, 256*sizeof(kpage));
newidt = (unsigned long long *)(*(unsigned long*)(idtr+1));
newidt[8] = &my_function;
asm("lidt %0": "=m"(newidt):);
All my attempts ended in good times with a segmentation fault, and in bad times with the kernel crashing forcing me to reboot (luckily I work with a virtual machine and snapshots).
So how can I hijack the realtime interrupt so it does my stuff? (And then send it back to the original function to get executed.)
Here is some nice code to change the pagefault function on the IDT. I couldn't make it work, since it's also written for kernel 2.6. This question is also worth looking into.
To get the bounty please publish working code, or at least sufficient info to make it run.
This can help you : http://cormander.com/2011/12/how-to-hook-into-hijack-linux-kernel-functions-via-lkm/
Why not you simply hook a function that is call every x steps like you want and execute what ever you need ?
How would be the correct way to prevent a soft lockup/unresponsiveness in a long running while loop in a C program?
(dmesg is reporting a soft lockup)
Pseudo code is like this:
while( worktodo ) {
worktodo = doWork();
}
My code is of course way more complex, and also includes a printf statement which gets executed once a second to report progress, but the problem is, the program ceases to respond to ctrl+c at this point.
Things I've tried which do work (but I want an alternative):
doing printf every loop iteration (don't know why, but the program becomes responsive again that way (???)) - wastes a lot of performance due to unneeded printf calls (each doWork() call does not take very long)
using sleep/usleep/... - also seems like a waste of (processing-)time to me, as the whole program will already be running several hours at full speed
What I'm thinking about is some kind of process_waiting_events() function or the like, and normal signals seem to be working fine as I can use kill on a different shell to stop the program.
Additional background info: I'm using GWAN and my code is running inside the main.c "maintenance script", which seems to be running in the main thread as far as I can tell.
Thank you very much.
P.S.: Yes I did check all other threads I found regarding soft lockups, but they all seem to ask about why soft lockups occur, while I know the why and want to have a way of preventing them.
P.P.S.: Optimizing the program (making it run shorter) is not really a solution, as I'm processing a 29GB bz2 file which extracts to about 400GB xml, at the speed of about 10-40MB per second on a single thread, so even at max speed I would be bound by I/O and still have it running for several hours.
While the posed answer using threads might possibly be an option, it would in reality just shift the problem to a different thread. My solution after all was using
sleep(0)
Also tested sched_yield / pthread_yield, both of which didn't really help. Unfortunately I've been unable to find a good resource which documents sleep(0) in linux, but for windows the documentation states that using a value of 0 lets the thread yield it's remaining part of the current cpu slice.
It turns out that sleep(0) is most probably relying on what is called timer slack in linux - an article about this can be found here: http://lwn.net/Articles/463357/
Another possibility is using nanosleep(&(struct timespec){0}, NULL) which seems to not necessarily rely on timer slack - linux man pages for nanosleep state that if the requested interval is below clock granularity, it will be rounded up to clock granularity, which on linux depends on CLOCK_MONOTONIC according to the man pages. Thus, a value of 0 nanoseconds is perfectly valid and should always work, as clock granularity can never be 0.
Hope this helps someone else as well ;)
Your scenario is not really a soft lock up, it is a process is busy doing something.
How about this pseudo code:
void workerThread()
{
while(workToDo)
{
if(threadSignalled)
break;
workToDo = DoWork()
}
}
void sighandler()
{
signal worker thread to finish
waitForWorkerThreadFinished;
}
void main()
{
InstallSignalHandler;
CreateSemaphore
StartThread;
waitForWorkerThreadFinished;
}
Clearly a timing issue. Using a signalling mechanism should remove the problem.
The use of printf solves the problem because printf accesses the console which is an expensive and time consuming process which in your case gives enough time for the worker to complete its work.
I found tsc2007 driver and modified according to our needs. Our firm is producing its own TI DM365 board. In this board we used TSC2007 and connected PENIRQ pin to GPIO0 of DM365. It is being seen OK on driver. when i touch to touchscreen cursor is moving but at the same time i am getting
BUG: scheduling while atomic: swapper /0x00000103/0, CPU#0
warning and embedded Linux is being crashed. there are 2 files that i modified and uploaded to http://www.muhendislikhizmeti.com/touchscreen.zip one is with timer the other is not. it is giving this error in any case.
I found a solution on web that i need to use work queue and call with using schedule_work() API. but they are blur for me now. Is anybody have any idea how to solve this problem and can give me some advice where to start to use work queue.
"Scheduling while atomic" indicates that you've tried to sleep somewhere that you shouldn't - like within a spinlock-protected critical section or an interrupt handler.
Common examples of things that can sleep are mutex_lock(), kmalloc(..., GFP_KERNEL), get_user() and put_user().
Exactly as said in 1st answer, scheduling while atomic happens when the scheduler gets confused and therefore unable to work properly and this because the scheduler tried to perform a "schedule()" in a section that contains a schedulable code inside of a non schedulable one.
For example using sleeps inside of a section protected by a spinlock. Trying to use another lock(semaphores,mutexes..) inside of a spinlock-proteced code may also disturb the scheduler. In addition using spinlocks in user space can drive the scheduler to behave as such. Hope this helps
For anyone else with a similar error - I had this problem because I had a function, called from an atomic context, that used kzalloc(..., GFP_KERN) when it should have used GFP_NOWAIT or GFP_ATOMIC.
This is just one example of a function sleeping when you don't want to, which is something you have to be careful of in kernel programming.
Hope this is useful to somebody else!
Thanks for the former two answers, in my case it was enough to disable the preemption:
preempt_disable();
// Your code with locks and schedule()
preempt_enable();