I have written a kernel module to measure the accuracy of the ndelay() kernel function.
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/time.h>
#include <linux/delay.h>
static int __init initialize(void)
{
    ktime_t start, end;
    s64 actual_time;
    int i;

    for (i = 0; i < 10; i++) {
        start = ktime_get();
        ndelay(100);
        end = ktime_get();
        actual_time = ktime_to_ns(ktime_sub(end, start));
        printk("%lld\n", (long long)actual_time);
    }
    return 0;
}

static void __exit final(void)
{
    printk(KERN_INFO "Unload module\n");
}
module_init(initialize);
module_exit(final);
MODULE_AUTHOR("Bhaskar");
MODULE_DESCRIPTION("delay of 100ns");
MODULE_LICENSE("GPL");
The dmesg output looks like this:
[16603.805783] 514
[16603.805787] 350
[16603.805789] 373
[16603.805791] 323
[16603.805793] 362
[16603.805794] 320
[16603.805796] 331
[16603.805797] 312
[16603.805799] 304
[16603.805801] 350
I have gone through one of the posts on Stack Overflow:
Why udelay and ndelay is not accurate in linux kernel?
But I want a fine-tuned nanosecond delay (probably in the range of 100-250 ns) in kernel space. Can anyone suggest an alternative way of doing this?
You can use the high-resolution timers (hrtimers), i.e. the
hrtimer_init
hrtimer_start
hrtimer_cancel
functions. An example is available here, and a rough sketch follows below.
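A minimal, untested sketch of how those three calls are typically wired together in a module; the callback name and the 250 ns expiry are invented for illustration and are not from the linked example:
#include <linux/kernel.h>
#include <linux/hrtimer.h>
#include <linux/ktime.h>

static struct hrtimer my_timer;        /* hypothetical timer object */

static enum hrtimer_restart my_timer_cb(struct hrtimer *t)
{
    pr_info("hrtimer fired\n");
    return HRTIMER_NORESTART;          /* one-shot: do not rearm */
}

static void start_my_timer(void)
{
    hrtimer_init(&my_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
    my_timer.function = my_timer_cb;
    /* relative expiry of 0 s + 250 ns; actual latency depends on the system */
    hrtimer_start(&my_timer, ktime_set(0, 250), HRTIMER_MODE_REL);
}

static void stop_my_timer(void)
{
    hrtimer_cancel(&my_timer);         /* safe even if the timer already fired */
}
Note that even hrtimers only guarantee the callback runs no earlier than the expiry; interrupt and scheduling latency still add on top, so a wakeup after exactly 100-250 ns cannot be guaranteed.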
If you are targeting an x86-only system, you can use an rdtsc() helper to read the CPU's cycle counter. The rdtsc instruction has very little overhead, but you do need to convert from CPU cycles to nanoseconds, and that conversion depends on how fast your CPU clock is running (a rough conversion sketch follows the snippet below).
static unsigned long long rdtsc(void)
{
    unsigned int low, high;

    asm volatile("rdtsc" : "=a" (low), "=d" (high));
    return low | ((unsigned long long)high) << 32;
}
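For the conversion itself, something like this untested sketch would do; the 3.0 GHz value is an arbitrary assumption, and in practice the calibrated TSC frequency (tsc_khz on x86) should be used instead:
/* Sketch only: convert TSC ticks to ns assuming a 3.0 GHz TSC.
 * The real frequency must come from calibration; 3000000 kHz here
 * is a made-up value for illustration. */
static unsigned long long tsc_ticks_to_ns(unsigned long long ticks)
{
    const unsigned long long assumed_tsc_khz = 3000000ULL;  /* 3.0 GHz */

    return (ticks * 1000000ULL) / assumed_tsc_khz;
}
Measuring then amounts to reading rdtsc() before and after the code under test, subtracting, and passing the difference through the conversion.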
Otherwise you can use the kernel high resolution timers API.
The high-resolution timer API
Linux Kernel ktime - high resolution timer type, APIs, RDTSC usages
Is it possible to run an M-mode instruction of the RISC-V ISA from userspace? I need to read the misa CSR, which provides CPU info, from a userspace C program, but according to the RISC-V documentation it is only accessible at machine privilege level. Is it completely impossible, or is there a workaround?
I've already tried
#include <stdio.h>

static unsigned long cpuid(void)
{
    unsigned long res;

    /* read the misa CSR; this traps unless running at machine privilege */
    __asm__ volatile ("csrr %0, misa" : "=r"(res));
    return res;
}

int main(void) {
    unsigned long mcpuid = cpuid();
    printf("mcpuid: %lu\n", mcpuid);
    return 0;
}
Running it on RISC-V Fedora, I get Illegal instruction.
I am developing a Linux kernel module, which looks like this:
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Me");
MODULE_DESCRIPTION("Something Something");
int checkSomething(void) {
    int someCpuFeature = 0;
    __asm__("mov $1, %eax");
    __asm__("cpuid");
    __asm__("mov %%ecx, %0" : "=r" (someCpuFeature));
    if (someCpuFeature & 32) {
        return 1;
    }
    return 0;
}

int __init init_module(void) {
    if (!checkSomething()) {
        printk(KERN_INFO "Exiting\n");
        return 0;
    } else {
        printk(KERN_INFO "Continuing\n");
    }
    return 0;
}

static void __exit exit_module(void) {
    printk(KERN_INFO "Unloading Module\n");
}
When I loaded it, I looked at its output with dmesg.
Instead of only printing Exiting/Continuing, it also printed a call trace
and said BUG: scheduling while atomic: insmod/24641/0x06100800.
I searched for this bug and found it has some connection to the scheduler and to sleeping in places where you shouldn't sleep, but the code above is the module's only functionality.
So I think it has something to do with the cpuid instruction, but I don't know exactly what.
Any ideas?
Hm, judging by the interrupts it just raised on my machine, your inline assembler looks incorrect.
Instead of writing this yourself, I'd recommend relying on the cpuid kernel module (and thus reading /dev/cpu/NUMBER/cpuid after seeking to the leaf you are interested in), as that's a future-proof external API. You can also look at /arch/x86/include/asm/processor.h and use kernel helpers like cpu_has to detect your feature. Don't reinvent the wheel: querying details of a CPU on an SMP machine is bound to be painful, and the kernel developers have already gone through that pain to make it work.
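As a rough sketch of the in-kernel route (not part of the original answer; check_something is a made-up name, and the leaf-1 ECX bit 5 test simply mirrors the question's code), <asm/processor.h> already provides CPUID helpers:
#include <asm/processor.h>

static int check_something(void)
{
    /* cpuid_ecx() runs CPUID for the given leaf and returns ECX,
     * with register handling and clobbers declared correctly. */
    unsigned int ecx = cpuid_ecx(1);

    return (ecx & (1U << 5)) ? 1 : 0;   /* same bit-5 test as in the question */
}
This avoids the clobbered-register problem entirely, because the kernel's wrapper declares its outputs and clobbers properly instead of relying on three separate asm statements.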
So I have just tested the available clocks on an embedded system with a 2.6.31 kernel.
Some simple test code:
#include <stdio.h>
#include <time.h>

int main(int argc, const char* argv[]) {
    struct timespec clock_resolution;

    printf("This system uses a timespec with %ldB in tv_sec and %ldB in tv_nsec.\n",
           (long)sizeof(clock_resolution.tv_sec), (long)sizeof(clock_resolution.tv_nsec));
    printf("An int is %ldB on this system, a long int is %ldB.\n",
           (long)sizeof(int), (long)sizeof(long));

    if (clock_getres(CLOCK_MONOTONIC, &clock_resolution))
        perror("Can't get CLOCK_MONOTONIC resolution time");
    printf("CLOCK_MONOTONIC has precision of %lds and %ldns on this system.\n",
           (long)clock_resolution.tv_sec, (long)clock_resolution.tv_nsec);

    if (clock_getres(CLOCK_MONOTONIC_RAW, &clock_resolution))
        perror("Can't get CLOCK_MONOTONIC_RAW resolution time");
    printf("CLOCK_MONOTONIC_RAW has precision of %lds and %ldns on this system.\n",
           (long)clock_resolution.tv_sec, (long)clock_resolution.tv_nsec);
    printf("Casted to unsigned this is %lus and %luns.\n",
           (unsigned long)clock_resolution.tv_sec, (unsigned long)clock_resolution.tv_nsec);

    return 0;
}
On an Ubuntu 20.04 (5.4.0-52) VM on an x86 host it results in:
This system uses a timespec with 8B in tv_sec and 8B in tv_nsec.
An int is 4B on this system, a long int is 8B.
CLOCK_MONOTONIC has precision of 0s and 1ns on this system.
CLOCK_MONOTONIC_RAW has precision of 0s and 1ns on this system.
Casted to unsigned this is 0s and 1ns.
On the ARM based NXP i.MX257 controller it results in:
This system uses a timespec with 4B in tv_sec and 4B in tv_nsec.
An int is 4B on this system, a long int is 4B.
CLOCK_MONOTONIC has precision of 0s and 1ns on this system.
CLOCK_MONOTONIC_RAW has precision of 0s and -1070597342ns on this system.
Casted to unsigned this is 0s and 3224369954ns.
This seems somewhat off to me. Interpreted as an unsigned value, that ns resolution is 3224369954 ns, i.e. over 3 s.
Edit to clarify some things:
Error checks on the clock_getres() calls don't trigger.
The controller is an NXP i.MX257.
For the ARM target: gcc 6.5, Kernel 2.6.31, uClibc-ng 1.0.30 (based on a buildroot environment)
I am trying to understand how many processors are supported by the Linux kernel.
grep NR_CPUS /boot/config-`uname -r`
will give me the maximum number of processors supported by the kernel, which I can override using the kernel command-line parameter nr_cpus.
To find the number of online CPUs, I can use the num_online_cpus() function.
Then what is nr_cpu_ids?
#include <linux/kernel.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
static int __init test_hello_init(void)
{
    pr_info("%s: In init NR_CPUs=%d, nr_cpu_ids=%d\n", __func__, NR_CPUS, nr_cpu_ids);
    pr_info("Number of cpus available:%d\n", num_online_cpus());
    return -1;
}

static void __exit test_hello_exit(void)
{
    pr_info("%s: In exit\n", __func__);
}
module_init(test_hello_init);
module_exit(test_hello_exit);
[11548.627338] test_hello_init: In init NR_CPUs=8192, nr_cpu_ids=128
[11548.627340] Number of cpus available:6
What is the difference between NR_CPUS and nr_cpu_ids? Are they not the same?
NR_CPUS is the compile-time maximum number of CPUs the kernel build can handle, while nr_cpu_ids is the runtime limit: the number of possible CPU IDs on this particular machine (as reported by the firmware, and possibly capped by the nr_cpus= boot parameter). num_online_cpus() is the count of CPUs that are actually online right now, which is why you see 8192, 128 and 6 respectively.
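A tiny sketch of how the three values relate inside a module (my own illustration, not from the answer; show_cpu_counts is a made-up helper):
#include <linux/kernel.h>
#include <linux/cpumask.h>

static void show_cpu_counts(void)
{
    unsigned int cpu, possible = 0;

    /* for_each_possible_cpu() never iterates past nr_cpu_ids */
    for_each_possible_cpu(cpu)
        possible++;

    pr_info("NR_CPUS=%d nr_cpu_ids=%u possible=%u online=%u\n",
            NR_CPUS, nr_cpu_ids, possible, num_online_cpus());
}
Here nr_cpu_ids bounds the possible set that the cpumask iterators walk, while num_online_cpus() only counts what is currently up.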
I wrote a simple program to determine whether I can get nanosecond precision on my system, which is a RHEL 5.5 VM (kernel 2.6.18-194).
// cc -g -Wall ntime.c -o ntime -lrt
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <stdlib.h>
int main(int argc, char* argv[]) {
    struct timespec spec;

    printf("CLOCK_REALTIME - \"Systemwide realtime clock.\":\n");
    clock_getres(CLOCK_REALTIME, &spec);
    printf("\tprecision: %ldns\n", spec.tv_nsec);
    clock_gettime(CLOCK_REALTIME, &spec);
    printf("\tvalue : %010ld.%-ld\n", spec.tv_sec, spec.tv_nsec);

    printf("CLOCK_MONOTONIC - \"Represents monotonic time. Cannot be set.\":\n");
    clock_getres(CLOCK_MONOTONIC, &spec);
    printf("\tprecision: %ldns\n", spec.tv_nsec);
    clock_gettime(CLOCK_MONOTONIC, &spec);
    printf("\tvalue : %010ld.%-ld\n", spec.tv_sec, spec.tv_nsec);

    return 0;
}
A sample output:
CLOCK_REALTIME - "Systemwide realtime clock.":
precision: 999848ns
value : 1504781052.328111000
CLOCK_MONOTONIC - "Represents monotonic time. Cannot be set.":
precision: 999848ns
value : 0026159205.299686941
So REALTIME gives me the local time and MONOTONIC the system's uptime. Both clocks seem to have roughly millisecond resolution (999848 ns ≅ 1 ms), even though MONOTONIC prints values with nanosecond digits, which is confusing.
man clock_gettime states:
CLOCK_REALTIME_HR
High resolution version of CLOCK_REALTIME.
However, grep -R CLOCK_REALTIME_HR /usr/include/ | wc -l returns 0, and trying to compile with it results in error: ‘CLOCK_REALTIME_HR’ undeclared (first use in this function).
I was trying to determine whether I could get the local time with nanosecond precision, but either my code has a bug, or maybe this feature isn't fully supported on 5.5 (or the VM's HPET is off, or something else).
Can I get the local time in nanoseconds on this system? What am I doing wrong?
EDIT
Well, the answer seems to be no.
While nanosecond precision can be achieved, the system doesn't guarantee nanosecond accuracy in this scenario (here's a clear answer on the difference rather than a rant). Typical COTS hardware doesn't really handle it (another answer in the right direction).
I'm still curious as to why the clocks report the same clock_getres() resolution, yet MONOTONIC yields what seem to be nanosecond values while REALTIME yields microseconds.
RHEL 5 is really ancient at this point; you should consider upgrading. On a newer system (Ubuntu 16.04) your program produces:
CLOCK_REALTIME - "Systemwide realtime clock.":
precision: 1ns
value : 1504783164.686220185
CLOCK_MONOTONIC - "Represents monotonic time. Cannot be set.":
precision: 1ns
value : 0000537257.257923964