I have a low-latency server/client audio application running on separate cores (via cpuset).
No xruns are detected, but I suspect the scheduler is interrupting my critical routine. Since disabling interrupts is not possible in user space, my idea was to create a kernel module and write wrapper functions for local_irq_disable()/local_irq_enable(). Security is not an issue. An RT Linux with a fully preemptible kernel is in use.
I assume the nanosleep function won't work without interrupts either?
What would be a more elegant way to disable the scheduler but keep the timers running?
How do I call these wrapper functions from user space?
Edit: SMP affinity is the keyword here: SMP IRQ Affinity
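In practice, setting the SMP IRQ affinity boils down to writing a CPU bitmask to /proc/irq/<irq>/smp_affinity so that the interrupt is only delivered to the cores in the mask. A minimal sketch, assuming IRQ 42 and mask 0x1 (CPU 0 only) as placeholders for whatever your hardware actually uses; root is required:

    /* Sketch: steer IRQ 42 away from the audio cores by allowing it only
     * on CPU 0. The IRQ number and the mask are placeholders. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/irq/42/smp_affinity", "w");
        if (!f) {
            perror("fopen");
            return 1;
        }
        fprintf(f, "1\n");   /* hex CPU bitmask: CPU 0 only */
        fclose(f);
        return 0;
    }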
I would recommend first tracing your kernel's scheduling with LTTng and Trace Compass to find out what is really happening on your system and which other process, resource lock or interrupt is forcing your process off the CPU.
Do you really need this process to run without gaps in CPU time? If not: do you need response times in microseconds or milliseconds?
Context:
64-bit Linux.
Intel Core 2 Duo.
Question:
Where does the Linux kernel "communicate" with the CPU?
I read the scheduler source code but could not understand how they communicate and how the kernel tells the CPU that something needs to be processed.
I understand that there are run queues, but isn't there something that enables the kernel to interrupt the CPU via the bus?
Update
This expands my initial question a bit: how can we tell the CPU where the task queues are?
The CPU has to poll something, and I guess we tell it at some point. I missed that point in the kernel code.
I will try to write a simplified explanation of how it works; tell me if anything is unclear.
A CPU only does one thing: execute instructions. It starts at a predefined address and executes. That's all. Sometimes an interrupt occurs, which temporarily makes the CPU jump to another instruction.
A kernel is a program (a sequence of instructions) that makes it easy to execute other programs. The kernel does whatever it needs to set itself up, which often includes building a list of processes to run. The definition of "process" is entirely up to the kernel because, as you know, the CPU only does one thing.
Now, when the kernel runs (being executed by the CPU), it might decide that a certain process needs to be executed. To do so, the kernel simply jumps to that process's program. How this is done doesn't matter, but in most OSes the kernel will map a periodic interrupt (the CPU will periodically jump) to a function that decides which process to execute and jumps to it. It isn't required, but it is convenient, because programs are forcefully "interrupted" periodically so others can also be executed.
To sum up, the CPU doesn't "know" anything. The kernel runs and jumps into other processes' code to make them run. Only the kernel "knows".
The Linux kernel is a program. It doesn't "talk" to the CPU as such; the CPU has a special register, the program counter (PC), which points to the instruction the CPU is currently executing.
The kernel itself contains many services. One of them manages the task queues. Each entry in a task queue contains information about a task, including the CPU core on which the task runs. When the kernel decides that this service should do some work, it calls its functions. The functions are made up of instructions which the CPU interprets. Most of them change the state of the CPU (like advancing the PC, changing register values, setting flags, enabling/disabling CPU cores, ...).
This means the CPU isn't polling anything. Depending on the scheduler, different strategies are used to process the task queue. The simplest one is timer based: the kernel installs a timer interrupt (i.e. it writes the address of an interrupt handler somewhere and configures the timer to cause an interrupt every few milliseconds).
The handler then looks at the task queue and decides what to do, depending on its strategy.
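To make this concrete, here is a deliberately toy sketch (plain user-space C, not real kernel code): a periodic "tick" handler round-robins through a run queue and "jumps" to the chosen task. In a real kernel the tick is a hardware timer interrupt and running a task means performing a context switch, but the control flow is analogous. All names are made up for illustration.

    /* Toy illustration of a timer-driven round-robin scheduling strategy. */
    #include <stdio.h>

    #define NTASKS 3

    typedef void (*task_fn)(void);

    static void task_a(void) { puts("task A runs"); }
    static void task_b(void) { puts("task B runs"); }
    static void task_c(void) { puts("task C runs"); }

    static task_fn run_queue[NTASKS] = { task_a, task_b, task_c };
    static int current = 0;

    /* Imagine this being invoked by the timer interrupt every few ms. */
    static void tick_handler(void)
    {
        run_queue[current]();              /* "jump" to the chosen task   */
        current = (current + 1) % NTASKS;  /* round-robin to the next one */
    }

    int main(void)
    {
        for (int tick = 0; tick < 6; tick++)
            tick_handler();
        return 0;
    }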
I would like to run a long-term task on a dedicated core and would like that task to be minimally interrupted/preempted. I can see two solutions. Which one is better, or is there another solution?
1) Set affinity and isolate the core using isolcpus
2) Make the thread real-time using SCHED_FIFO and set the priority high
- if this is the better choice, how high should the priority be? Can I set it to 99?
What I am concerned about is being preempted by kernel threads, IPIs ...
Regarding the first solution you mentioned: adding the parameter isolcpus=[CPU no.] at boot instructs the Linux scheduler not to run any task on that CPU unless requested by the user via CPU affinity. But this CPU may still receive interrupts; that can also be avoided by setting IRQ affinity, so that the isolated CPU does not receive any interrupts. Finally, in your task's code you set the affinity to the isolated CPU and you are good to go (see the sketch below).
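For illustration, pinning the task from its own code can be done with sched_setaffinity(); a minimal sketch, assuming CPU 3 is the isolated core (the CPU number is a placeholder):

    /* Sketch: pin the calling process to CPU 3 (your isolated core). */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(3, &set);                        /* the isolated core */

        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* ... the long-running work now executes on CPU 3 only ... */
        return 0;
    }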
But even if you follow these steps, kernel tasks are still executed on the isolated CPU core if you are not using a real-time kernel with RT_PREEMPT, hence it might not be possible to completely isolate a CPU core unless you are using an RT kernel.
Refer to http://elinux.org/CPU_Shielding_capability
The second solution, using the SCHED_FIFO scheduling policy with a high priority value, will still not prevent kernel threads, timer tick interrupts, IPIs etc. from preempting your task, because the scheduling policies and priorities are what the kernel uses to schedule all other user-space processes and threads and do not apply to kernel threads or processes.
So setting a high priority on your task does not mean you will get 100% of a CPU dedicated to it. Also, the alternative of manually setting your task's CPU mask to a cpuset in the system can cause problems and suboptimal load-balancer performance. Your task will still get interrupted from time to time by Linux code, including other tasks such as the timer tick interrupt and the scheduler code, IPIs from other CPUs, and things like workqueue kernel threads, although the interruption should be quite minimal if you don't have much activity going on on your other cores.
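For reference, a minimal sketch of requesting SCHED_FIFO from the task itself via sched_setscheduler(); the priority 80 is an arbitrary example (sched_get_priority_max(SCHED_FIFO) reports the maximum, 99 on Linux), and CAP_SYS_NICE/root is needed:

    /* Sketch: switch the calling process to SCHED_FIFO, priority 80. */
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        struct sched_param sp = { .sched_priority = 80 };

        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
            perror("sched_setscheduler");
            return 1;
        }

        printf("max SCHED_FIFO priority: %d\n",
               sched_get_priority_max(SCHED_FIFO));
        /* ... time-critical work ... */
        return 0;
    }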
But the cleanest way to achieve this should come from a kernel tweak which I found at this link: http://www.linuxjournal.com/article/6799?page=0,2. Though I haven't tried it personally, I think it's worth a look at that article as well before you decide on the method you will use.
What do we mean by a thread running in user mode versus running in kernel mode? Is this related to a thread executing instructions in user mode versus executing instructions in kernel mode? Kindly elaborate.
Also, is it possible that a thread which is executing in user mode is put into a suspended state and then starts executing in kernel mode? If yes, how is that possible? Until now I was only aware that a suspended thread is SUSPENDED completely, i.e. the CPU performs a context switch to schedule another thread.
What do we mean by a thread running in user mode versus running in kernel mode?
There is no way to know what a person means by a phrase without context. If I had to guess, I'd say they are talking about whether the thread is scheduled by a user-space scheduler or a kernel scheduler. But it's also possible they are actually asking whether the thread is running user code or kernel code.
Is this related to a thread executing instructions in user mode versus executing instructions in kernel mode? Kindly elaborate.
It could be. It also might not be. There's no way to know what a person means by a phrase without context.
Also, is it possible that a thread which is executing in user mode is put into a suspended state and then starts executing in kernel mode? If yes, how is that possible?
For implementations where the kernel schedules threads, the scheduler runs in kernel space. The code that actually suspends the thread typically runs in kernel space too, because it has to add the thread to the various kernel scheduler data structures. So the code that resumes the thread will run in kernel space as well. At a higher level, the same thread of execution can "become" the kernel scheduler, choose a user-space thread to execute, and then "become" that thread.
Until now I was only aware that a suspended thread is SUSPENDED completely, i.e. the CPU performs a context switch to schedule another thread.
Right, and that's kernel code. So the same core is running user space code, then it's running kernel code, then it's running the user space code of another thread.
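As a concrete illustration, a blocking system call is the typical place where this mode switch happens: the read() below is user code, the trap into the kernel executes kernel code on behalf of the same thread, and if no input is available the kernel suspends the thread inside that kernel code and runs something else until data arrives. A minimal sketch:

    /* Sketch: one thread alternating between user mode and kernel mode. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[16];
        ssize_t n;

        /* user mode up to here */
        n = read(STDIN_FILENO, buf, sizeof(buf));  /* trap into kernel mode;
                                                      may sleep in the kernel */
        /* back in user mode once read() returns */

        printf("read returned %zd bytes\n", n);
        return 0;
    }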
Modern operating systems have hardware support for separating user code from kernel code. On the x86 architecture you can set up memory pages that are not accessible to normal user code and that will trigger a page fault when touched, so that the OS can survive faulty programs.
Code running in kernel mode has higher privileges, but also more responsibilities, as not everything is as easily accessible as it is from user space. If user code gets stuck, the OS can clean it up. If kernel-mode code hangs, it might not be that easy, depending on how high the privilege level is.
Is there any way in Linux to assign one CPU core to a particular process, such that no other processes or interrupt handlers can be scheduled on this core?
I have read about process affinity in Linux (binding processes to CPUs using the taskset utility), but that does not solve my problem: it only pins the given process to that core, while other processes may still be scheduled on the same core, and that is what I want to avoid.
Should we change the kernel code for scheduling?
Yes there is. In fact, there are two separate ways to do it :-)
Right now, the best way to accomplish what you want is to do the following:
Add the parameter isolcpus=[cpu_number] to the Linux kernel command line from the boot loader during boot. This will instruct the Linux scheduler not to run any regular tasks on that CPU unless specifically requested using cpu affinity.
Use IRQ affinity to set other CPUs to handle all interrupts so that your isolated CPU will not receive any interrupts.
Use CPU affinity to fix your specific task to the isolated CPU.
This will give you the best that Linux can provide with regard to CPU isolation without out-of-tree and in-development patches.
Your task will still get interrupted from time to time by Linux code, including other tasks - such as the timer tick interrupt and the scheduler code, IPIs from other CPUs and stuff like work queue kernel threads, although the interruption should be quite minimal.
For an (almost) complete list of interruption sources, check out my page at https://github.com/gby/linux/wiki
The alternative method is to use cpusets, which is far more elegant and dynamic but suffers from some weaknesses at this point in time (no migration of timers, for example), which makes me recommend the old, crude but effective isolcpus parameter.
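For completeness, a sketch of the cpuset route, assuming the legacy (v1) cpuset controller is mounted at /sys/fs/cgroup/cpuset; the group name "rt", CPU 3 and memory node 0 are placeholders, and root is required:

    /* Sketch: create a cpuset "rt" with CPU 3 / mem node 0 and move this
     * process into it. Paths assume a cgroup-v1 cpuset mount. */
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int write_str(const char *path, const char *val)
    {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); return -1; }
        fprintf(f, "%s\n", val);
        fclose(f);
        return 0;
    }

    int main(void)
    {
        char pid[32];

        mkdir("/sys/fs/cgroup/cpuset/rt", 0755);
        write_str("/sys/fs/cgroup/cpuset/rt/cpuset.cpus", "3");
        write_str("/sys/fs/cgroup/cpuset/rt/cpuset.mems", "0");

        snprintf(pid, sizeof(pid), "%d", getpid());
        write_str("/sys/fs/cgroup/cpuset/rt/tasks", pid);

        /* ... from here on this process is confined to CPU 3 ... */
        return 0;
    }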
Note that work is currently being done by the Linux community to address all these issues and more to give even better isolation.
There is a Red Hat article talking about it. It modifies the boot parameter isolcpus.
And there is an old article written by Robert Love, with a solution in it:
All of a process' children receive the same CPU affinity mask as their parent. Then, all we need to do is have init bind itself to one processor. All other processes, by nature of init being the root of the process tree and thus the superparent of all processes, are then likewise bound to the one processor.
Dedicate a Whole CPU Core to a Particular Program
While taskset allows a particular program to be assigned to certain CPUs, that does not mean that no other programs or processes will be scheduled on those CPUs. If you want to prevent this and dedicate a whole CPU core to a particular program, you can use the "isolcpus" kernel parameter, which allows you to reserve the CPU core during boot.
Add the kernel parameter "isolcpus=" to the boot loader during boot or to the GRUB configuration file. The Linux scheduler will then not schedule any regular process on the reserved CPU core(s) unless specifically requested with taskset. For example, to reserve CPU cores 0 and 1, add the kernel parameter "isolcpus=0,1". After boot, use taskset to safely assign the reserved CPU cores to your program.
Source(s)
http://xmodulo.com/2013/10/run-program-process-specific-cpu-cores-linux.html
http://www.linuxtopia.org/online_books/linux_kernel/kernel_configuration/re46.html
Even if you follow the steps in gby's answer, kernel tasks are still executed on the isolated CPU core. Work is underway in the Linux RT_PREEMPT real-time project to improve this. So if you are not using a bleeding-edge real-time kernel with RT_PREEMPT, it might not be possible to completely isolate a CPU core.
As per the documentation:
The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs.
There is no mention that a specific processor will be given to a process exclusively.
I have a task which is basically a TIMER: it goes to sleep and is supposed to wake up periodically. The timer task sleeps for, say, 10 ms, but its wake-up is inconsistent and cannot be relied upon to happen on time.
In fact, in my runs there is a big difference in sleep times. Sometimes the wake-up varies by 1-2 ms, and occasionally the task does not come back at all. This is because the kernel scheduler puts all the sleeping and waiting tasks in a queue and then polls to see who is to be awakened; I think it is round robin. So sometimes the timer has already expired by the time the scheduler polls again. Sometimes, when there are interrupts, the ISR gets control and delays the timer task from waking up.
What is the best solution to handle this kind of problem?
(Additional details: The task is a MAC timer for a wireless network; RTOS is a u-velOSity microkernel)
You should be using the timer API provided by the OS instead of relying on the scheduler. Here's an introduction to the timer API for Linux drivers.
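For reference (and only as an illustration, since the question concerns the u-velOSity microkernel rather than Linux): with the current Linux in-kernel timer API, a driver arms a timer with timer_setup()/mod_timer() and re-arms it from the callback, roughly like this:

    /* Sketch: a Linux kernel module timer firing every ~10 ms. */
    #include <linux/module.h>
    #include <linux/timer.h>
    #include <linux/jiffies.h>

    static struct timer_list my_timer;

    static void my_timer_cb(struct timer_list *t)
    {
        /* do the minimal periodic work here, then re-arm for +10 ms */
        mod_timer(&my_timer, jiffies + msecs_to_jiffies(10));
    }

    static int __init my_init(void)
    {
        timer_setup(&my_timer, my_timer_cb, 0);
        mod_timer(&my_timer, jiffies + msecs_to_jiffies(10));
        return 0;
    }

    static void __exit my_exit(void)
    {
        del_timer_sync(&my_timer);
    }

    module_init(my_init);
    module_exit(my_exit);
    MODULE_LICENSE("GPL");

Note that the resolution of such a timer is bounded by the tick (jiffies) granularity, which is part of why a plain 10 ms sleep jitters.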
If you need hardcore timing, the OS scheduler is not likely to be good enough (as you've found).
If you can, use a separate timer peripheral, and use its ISR to do as little as you can get away with (timestamping some critical data or setting some flags, for example), then let your higher-jitter routine make use of that data with its less guaranteed timing.
Linux is not an RTOS, and that is probably the root of your problem.
You can render Linux more suited to real-time use in various ways and to various extent. See A comparison of real-time Linux approaches for some methods and an assessment of the level of real-time performance you can expect.