perf tool output, magic values - arm

I ran perf with the -x parameter to print in machine-readable format. The output is as follows:
1285831153,,instructions,1323535732,100.00
7332248,,branch-misses,1323535732,100.00
1316.587352,,cpu-clock,1316776510,100.00
1568113343,,cycles,1323535732,100.00
The first number is clear, but the values after the event name are not clear to me. Is the first value after the name the runtime? If so, why does it differ between lines? And what does the 100.00 at the end of each line mean? It is not documented; I looked it up here: https://perf.wiki.kernel.org/index.php/Tutorial#Machine_readable_output

The -x option of the stat command is implemented in the tools/perf/builtin-stat.c file as the csv_output flag, and the printing is done in the static function printout() (around line 1061). The last values in the string most likely come from:
print_noise(counter, noise);
print_running(run, ena);
With a single run of the target program (no -r 5 or -r 2 options - https://perf.wiki.kernel.org/index.php/Tutorial#Repeated_measurement), print_noise will not print anything. And print_running prints the run argument twice: once as a value and once as a percentage of ena:
static void print_running(u64 run, u64 ena)
{
        if (csv_output) {
                fprintf(stat_config.output, "%s%" PRIu64 "%s%.2f",
                        csv_sep,
                        run,
                        csv_sep,
                        ena ? 100.0 * run / ena : 100.0);
        } else if (run != ena) {
                fprintf(stat_config.output, " (%.2f%%)", 100.0 * run / ena);
        }
}
You have run/ena = 1 (100.00%), so these fields carry no useful information for you.
They are used in the case of event multiplexing (try perf stat -d or perf stat -dd; https://perf.wiki.kernel.org/index.php/Tutorial#multiplexing_and_scaling_events), when the user asks perf to measure more events than can be enabled at the same time (e.g. 8 hardware events on an Intel CPU with only 7 real hardware counting units). Perf (the perf_events subsystem of the kernel) will then enable subsets of the events and rotate through these subsets several times per second. In that case run/ena is proportional to the share of time during which the event was enabled, and run shows the exact amount of time the event was actually counted. In the normal human-readable perf stat output, this situation is marked by a figure below [100%] on the event's line, and the reported event count may be scaled (estimated) up to the full running time of the program, i.e. it is an inexact, scaled value.
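For reference, here is a minimal C sketch (my own illustration, not perf's actual code) of that scaling arithmetic: a multiplexed raw count is extrapolated to the full enabled time as raw * ena / run.

#include <stdio.h>
#include <stdint.h>

/* Extrapolate a raw event count to the full enabled time, the way
 * perf estimates counts for events that were multiplexed off the
 * hardware counters part of the time. */
static uint64_t scaled_count(uint64_t raw, uint64_t run, uint64_t ena)
{
        if (run == 0)
                return 0; /* the event was never actually counted */
        return (uint64_t)((double)raw * ena / run);
}

int main(void)
{
        /* Hypothetical sample: event enabled for 1000 ns, counted for 500 ns. */
        printf("%llu\n", (unsigned long long)scaled_count(1000000, 500, 1000));
        return 0;
}

This prints 2000000: the raw count doubled, because the event was only counted for half of the time it was enabled.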

simple_copy example on pmem.io

I have successfully created the emulated device described at http://pmem.io/2016/02/22/pm-emulation.html.
It shows the device correctly:
:~/Prakash/nvml/src/examples/libpmem$ mount | grep pmem
/dev/pmem0 on /mnt/pmemd type ext4 (rw,relatime,dax,errors=continue,data=ordered)
However, when I execute the simple_copy example that ships with the pmem NVML, it gives these errors:
amd@amd:~/Prakash/nvml/src/examples/libpmem$ ./simple_copy logs /dev/pmem0
pmem_map_file: File exists
amd@amd:~/Prakash/nvml/src/examples/libpmem$ ./simple_copy logs /dev/pmem0/logs
pmem_map_file: Not a directory
Am I not using the program correctly?
Also, I have mounted the device with the dax option, and I can clearly see the performance advantage with:
:~/Prakash/nvml/src/examples/libpmem$ sudo dd if=/dev/zero of=/dev/pmem0 bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB, 2.0 GiB) copied, 0.910729 s, 2.4 GB/s
:~/Prakash/nvml/src/examples/libpmem$ sudo dd if=/dev/zero of=/mnt/pmem0/test bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB, 2.0 GiB) copied, 6.39032 s, 336 MB/s
From the errors posted, it seems reasonable to believe:
without the appropriate option, it will not create a directory
without the appropriate option, it will not replace a file
If you open the example you are referring to, you will see the following:
if ((pmemaddr = pmem_map_file(argv[2], BUF_LEN,
                PMEM_FILE_CREATE|PMEM_FILE_EXCL,
                0666, &mapped_len, &is_pmem)) == NULL) {
        perror("pmem_map_file");
        exit(1);
}
This is the part that is giving you trouble. To understand why, let's look at man 7 libpmem. You can find the relevant part here.
This is the paragraph we are interested in:
The pmem_map_file() function creates a new read/write mapping for a
file. If PMEM_FILE_CREATE is not specified in flags, the entire
existing file path is mapped, len must be zero, and mode is ignored.
Otherwise, path is opened or created as specified by flags and mode,
and len must be non-zero. pmem_map_file() maps the file using mmap(2),
but it also takes extra steps to make large page mappings more likely.
So, the pmem_map_file function effectively calls open(2) and then mmap(2). In the simple_copy.c example we can observe that the flags used are PMEM_FILE_CREATE and PMEM_FILE_EXCL, and as we can learn from the manpage, they roughly translate to O_CREAT and O_EXCL respectively.
This means that the error messages are correct: you received them because on your first attempt you provided an existing file, whilst on the second attempt you passed a path that treats /dev/pmem0 as a directory, which it is not.
There's an in-depth explanation of libpmem here.
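As an aside, if you want to be able to re-run such a program against the same path, one option is to drop PMEM_FILE_EXCL so that an existing file is simply reused. A minimal sketch of my own (with an assumed BUF_LEN; this is not the original example):

#include <stdio.h>
#include <stdlib.h>
#include <libpmem.h>

#define BUF_LEN 4096 /* assumed size; simple_copy.c defines its own */

int main(int argc, char *argv[])
{
        char *pmemaddr;
        size_t mapped_len;
        int is_pmem;

        /* PMEM_FILE_CREATE without PMEM_FILE_EXCL behaves like
         * open(..., O_CREAT) without O_EXCL: an existing regular file
         * at the given path is mapped instead of failing with EEXIST. */
        if ((pmemaddr = pmem_map_file(argv[1], BUF_LEN, PMEM_FILE_CREATE,
                        0666, &mapped_len, &is_pmem)) == NULL) {
                perror("pmem_map_file");
                exit(1);
        }
        pmem_unmap(pmemaddr, mapped_len);
        return 0;
}

Note that the destination should still be a regular file path on the DAX-mounted filesystem (somewhere under /mnt/pmemd in your setup), not the raw /dev/pmem0 device node.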

Is there a way to calculate I/O and memory of current process in C?

If I use
/usr/bin/time -f"%e,%P,%M,%I,%O"
I get (for the last three placeholders) the memory the process used, and whether there was any input and output during its run.
Obviously, it's easy to get %e or something like it using sys/time.h, but is there a way to get %M, %I and %O programmatically?
You could read and parse the files in the /proc filesystem. /proc/self refers to the process accessing the /proc filesystem.
/proc/self/statm contains information about memory usage, measured in pages. Sample output:
% cat /proc/self/statm
1115 82 63 12 0 79 0
Fields are size resident share text lib data dt; see the proc manual page for some additional details.
/proc/self/io contains the I/O for the current process. Sample output:
% cat /proc/self/io
rchar: 2012
wchar: 0
syscr: 6
syscw: 0
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0
Unfortunately, io isn't documented in the proc manual page (at least on my Debian system). I had to check the iotop source code to see how it obtained the per-process I/O information.
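To get the same numbers programmatically, here is a minimal sketch in C (assuming /proc is available; /proc/self/io additionally requires the kernel's per-task I/O accounting, CONFIG_TASK_IO_ACCOUNTING):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
        long size, resident;
        char line[256];
        FILE *f;

        /* First two fields of /proc/self/statm: total program size
         * and resident set size, both counted in pages. */
        f = fopen("/proc/self/statm", "r");
        if (f != NULL) {
                if (fscanf(f, "%ld %ld", &size, &resident) == 2)
                        printf("resident: %ld bytes\n",
                               resident * sysconf(_SC_PAGESIZE));
                fclose(f);
        }

        /* /proc/self/io is a sequence of "key: value" lines; printed
         * verbatim here, but each line is trivial to parse with sscanf. */
        f = fopen("/proc/self/io", "r");
        if (f != NULL) {
                while (fgets(line, sizeof(line), f) != NULL)
                        fputs(line, stdout);
                fclose(f);
        }
        return 0;
}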

Linux kernel: printk from "open" syscall doesn't work

I have a question.
I opened the kernel source and edited linux-3.1.1/fs/open.c.
I changed the following code in open.c:
SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, int, mode)
{
        long ret;
        printk(KERN_EMERG "Testing\n");
        ...
}
I added only this line: printk(KERN_EMERG "Testing\n");
And I included the headers <linux/kernel.h> and <linux/printk.h>.
Then I compiled and rebooted my Linux (Ubuntu) system.
During the reboot, a lot of "Testing" lines appeared on the screen.
So up to now it's OK.
But now I have a problem.
I created this program in C:
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
        int filedesc = open("testefile2.txt", O_CREAT | O_WRONLY, 0640);
        printf("%d", filedesc);
}
I compiled this program, executed it, and it works fine.
But I don't understand why "Testing" didn't appear in the shell.
I mean, if a lot of "Testing" lines appeared when I rebooted the PC, why doesn't the word appear when I execute the program above?
Just to add, the headers included in the code above are unistd.h, fcntl.h, stdio.h, and stdlib.h.
Thank you guys.
printk output goes to the kernel message buffer (view it with dmesg or in the system log), not to your process's stdout/stderr.
But I don't understand why the "Testing" didn't appear on the shell.

I think this is the effect of printk message suppression (more exactly: rate limiting).
Check the message log or console for the string
printk: ### messages suppressed.
This feature stops printing messages if there have been a lot of them recently.
The actual code, as of the 3.1 kernel (http://lxr.linux.no/#linux+v3.1.1/kernel/printk.c#L1621):
1621 * printk rate limiting, lifted from the networking subsystem.
1622 *
1623 * This enforces a rate limit: not more than 10 kernel messages
1624 * every 5s to make a denial-of-service attack impossible.
1625 */
1626 DEFINE_RATELIMIT_STATE(printk_ratelimit_state, 5 * HZ, 10);
1627
1628 int __printk_ratelimit(const char *func)
So, as the open syscall is very, very popular (just run strace -e open /bin/ls - I get 15 open syscalls just for starting the simplest ls), the rate limiting will be in effect. It will allow at most 10 of your messages per 5-second interval, i.e. no more than 10 messages in a single burst.
I can only suggest creating a special user with a known UID and adding a UID check before the printk in your modified open code.
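A sketch of that idea (UID 1234 stands in for the hypothetical test user; this assumes a 3.1-era kernel, where current_uid() from <linux/cred.h> returns a plain uid_t):

SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, int, mode)
{
        long ret;

        /* Only log opens made by the dedicated test user, keeping the
         * message volume below the rate limiter's threshold. */
        if (current_uid() == 1234)
                printk(KERN_EMERG "Testing\n");
        ...
}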

How can I run this DTrace script to profile my application?

I was searching online for something to help me do assembly-level (instruction-by-instruction) profiling. I searched and found something on http://www.webservertalk.com/message897404.html:
There are two parts to this problem: finding all instructions of a particular type (inc, add, shl, etc.) to determine groupings, and then figuring out which are getting executed and summing correctly. The first bit is tricky unless grouping by disassembler is sufficient. For figuring out which instructions are being executed, DTrace is of course your friend here (at least in userland).
The nicest way of doing this would be to instrument only the beginning of each basic block; finding these would be a manual process right now... however, instrumenting each instruction is feasible for small applications. Here's an example:
First, our quite trivial C program under test:
#include <unistd.h>

int main(void)
{
        int i;
        for (i = 0; i < 100; i++)
                getpid();
        return 0;
}
Now, our slightly tricky D script:
#pragma D option quiet

pid$target:a.out::entry
/address[probefunc] == 0/
{
        address[probefunc] = uregs[R_PC];
}

pid$target:a.out::
/address[probefunc] != 0/
{
        @a[probefunc, (uregs[R_PC] - address[probefunc]), uregs[R_PC]] = count();
}

END
{
        printa("%s+%#x:\t%d\t%@d\n", @a);
}

Running the script against the test program produces output like:
main+0x1: 1
main+0x3: 1
main+0x6: 1
main+0x9: 1
main+0xe: 1
main+0x11: 1
main+0x14: 1
main+0x17: 1
main+0x1a: 1
main+0x1c: 1
main+0x23: 101
main+0x27: 101
main+0x29: 100
main+0x2e: 100
main+0x31: 100
main+0x33: 100
main+0x35: 1
main+0x36: 1
main+0x37: 1
From the example given, this is exactly what I need. However, I have no idea what it is doing, how to save the DTrace program, or how to execute it with the code I want results for. So I opened this question hoping some people with a good DTrace background could help me understand the code, save it, run it, and hopefully get the results shown.
If all you want to do is run this particular DTrace script, simply save it to a .d script file and use a command like the following to run it against your compiled executable:
sudo dtrace -s dtracescript.d -c [Path to executable]
where you replace dtracescript.d with your script file name.
This assumes that you have DTrace as part of your system (I'm running Mac OS X, which has had it since Leopard).
If you're curious about how this works, I wrote a two-part tutorial on using DTrace for MacResearch a while ago, which can be found here and here.

How many files can I have open at once?

On a typical OS, how many files can I have open at once using standard C file I/O?
I tried to read some constant that should tell it, but on 32-bit Windows XP it was a measly 20 or something. It seemed to work fine with over 30, though, but I haven't tested it extensively.
I need about 400 files open at once at most, so if most modern OSes support that, it would be awesome. It doesn't need to support XP, but it should support Linux, Win7, and recent versions of Windows Server.
The alternative is to write my own mini file system, which I want to avoid if possible.
On Linux, this depends on the number of available file descriptors.
You can use ulimit -n to set / show the number of available FDs per shell.
See these instructions on how to check (or change) the total number of available FDs on Linux.
This IBM support article suggests that on Windows the number is 512, and that you can change it in the registry (as instructed in the article).
Since open() returns the fd as an int, the size of int also bounds the upper limit (irrelevant in practice, as INT_MAX is a lot).
A process can query the limit using the getrlimit system call:
#include <stdio.h>
#include <sys/resource.h>

struct rlimit rlim;
getrlimit(RLIMIT_NOFILE, &rlim);
/* rlim_cur is the number of files this process may have open */
printf("Max number of open files: %llu\n", (unsigned long long)rlim.rlim_cur);
FYI, as root, you first have to modify the 'nofile' entries in /etc/security/limits.conf. For example:
* hard nofile 10240
* soft nofile 10240
(changes in limits.conf typically take effect when the user logs in)
Then, users can use the ulimit -n bash command. I've tested this with up to 10,240 files on Fedora 11.
ulimit -n <max_number_of_files>
Lastly, all this is capped by the kernel-wide limit, which you can read with (I guess you could echo a value into this file to go even higher... at your own risk):
cat /proc/sys/fs/file-max
Also, see http://www.karakas-online.de/forum/viewtopic.php?t=9834
