How to use Intel Advisor to profile my parallel MPI application?

I am working on a remote Linux server where my application runs in parallel with MPI. I want to profile it to see how well the load is balanced across the MPI processes and which parts of the code are the heaviest.
To run my application in parallel I usually run it like this:
mpirun -n # ${location}/myApp arg1 arg2 etc.
On that machine there is an Intel Advisor module, which I am going to use. The GUI command
advixe-gui does not work there, so I have to use advixe-cl instead.
In case it is helpful, when I type:
advixe-cl
it returns this:
Intel(R) Advisor Command Line Tool
Copyright (C) 2009-2019 Intel Corporation. All rights reserved.
Usage: advixe-cl <--action> [--action-option] [--global-option] [[--] <target> [target options]]
Use --help for details.
Any idea about how to proceed further with profiling?

You have to use Advisor's command line (advixe-cl) and "wrap" the advixe-cl command line with mpirun. Afterwards you can copy the collected profiles to a machine with the GUI and view them there, with an individual "result view" for each rank profiled.
You can "wrap" the command line in a few ways, for example (Intel MPI specific):
$ mpirun -n 1 -gtool "advixe-cl -collect survey -no-auto-finalize -project-dir /user/test/vec_project:0" /user/test/vec_samples/vec_samples
or (generic MPI with SLURM):
$ srun -n 1 -c 32 advixe-cl --collect=survey --project-dir=./adv -- ./miniFE.x
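If the goal is to compare load balance across ranks, a minimal generic sketch (the rank count and paths are illustrative) is to wrap every rank and let each one write its own result into the same project directory:
$ mpirun -n 4 advixe-cl --collect=survey --project-dir=./advi_results -- ${location}/myApp arg1 arg2
You can then copy ./advi_results to a machine with the GUI and open each rank's result individually.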
This topic is described in detail (including selective rank analysis and Cray or Intel MPI specifics) in the following Intel "cookbooks" and articles:
Intel MPI-specific : Analyzing Intel MPI applications with Intel Advisor
Generic MPI, SLURM, for the well-known WRF workload: Analyze Vectorization and Memory Aspects of an MPI Application "cookbook"
Advisor for MPI apps on Cray system: Analyze Performance on Cray Systems "cookbook"
Advisor Documentation chapter
Yet another article

You need to provide an action in the command line - it is not optional according to the syntax:
$ advixe-cl <--action> [--action-options] [--global-options] [[--] target [target options]]
The action is either collect or report, and each command takes exactly one action. For example, you cannot use both the collect and report actions in the same command.
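For example, collection and reporting are two separate invocations (the project path and target are illustrative):
$ advixe-cl --collect=survey --project-dir=./advi_results -- ./myApp arg1 arg2
$ advixe-cl --report=survey --project-dir=./advi_results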
You can review the User Guide for Advisor here.

Mac kernel programming generic kernel extension prinf() not working

I've followed the Creating a Generic Kernel Extension with Xcode tutorial.
MyKext.c:
#include <sys/systm.h>
#include <mach/mach_types.h>

kern_return_t MyKext_start (kmod_info_t * ki, void * d)
{
    printf("MyKext has started.\n");
    return KERN_SUCCESS;
}

kern_return_t MyKext_stop (kmod_info_t * ki, void * d)
{
    printf("MyKext has stopped.\n");
    return KERN_SUCCESS;
}
I've also disabled SIP with csrutil, which allows me to load my own kext.
# csrutil disable
When I load my own kext into the kernel
$ sudo kextload -v /tmp/MyKext.kext
the output of printf() is not written to /var/log/system.log.
I've also set boot-args
$ sudo nvram boot-args="original_contents debug=0x4"
Can anyone help me out?
Apparently, since Sierra (10.12) at least, the way logs are written has been reorganized (iOS support?), so you can no longer see the messages in system.log. Still, the Console application has a Devices section in its sidebar where you can select your device (usually your Mac) and watch the real-time log, limited to "kernel" via the search box. This is what I see when using kextload/kextunload:
default 11:58:27.608228 +0200 kernel MyKext has started.
default 11:58:34.446824 +0200 kernel MyKext has stopped.
default 11:58:44.803350 +0200 kernel MyKext has started.
There is no need for the csrutil and nvram changes.
Important: for some freaky reason, I needed to restart Console for my message changes to show up; otherwise it kept showing the ones (start & stop) from the previous build. Very strange indeed!
Later: to recover old logs, try sudo log collect --last 1d and open the result with Console (more here).
Sorry to necro-post, but I found it more useful to use log(1) with one of its many commands (as suggested by @pmdj in the comments above) than to use Console. From the manual:
log -- Access system wide log messages created by os_log, os_trace and other logging systems.
For example, one can run:
log stream
to see real-time output from the system, including printf() from the macOS kernel extension.
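As a hedged variant (the predicate below uses the standard log(1) filter syntax; adjust it as needed), you can narrow the live stream to kernel messages so the kext's output is easier to spot:
log stream --predicate 'process == "kernel"'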

Catching Mach system calls using dtruss

I ran dtruss on vmmap, which is a process that reads the virtual memory of another process.
I would expect some of the Mach port system calls to appear in the output of my command, but I couldn't trace any (e.g. mach_vm_read, task_for_pid, etc.).
The exact command I ran (note that dtruss is a wrapper script around dtrace on OS X):
sudo dtruss vmmap <pid_of_sample_process>
The input argument for vmmap is just the PID of any running process, and the OS version I use is 10.10 (on 10.11 there's an entitlement issue when running dtruss on Apple binaries such as vmmap).
Perhaps someone can tell me how to identify the system calls I'm looking for... Should I look for the explicit name in the dtruss output, or just a general call number for my desired syscall (sadly, I haven't found any of them):
./bsd/kern/trace.codes:0xff004b10 MSG_mach_vm_read
It looks to me like it's not using Mach APIs. It's using the libproc interface. I'm seeing many proc_info() syscalls, which is what's behind library calls like proc_pidinfo().
I used:
sudo dtrace -n 'pid$target::proc_*:entry {}' -c 'vmmap <some PID>'
to trace the various libproc functions being called. I see calls to proc_name(), proc_pidpath(), and proc_pidinfo() to get information about the target process and then calls to proc_regionfilename() to get information about the VM regions.
By the way, vmmap doesn't read the memory of the other process, it just reports information about the VM regions, not their contents. So, I wouldn't expect to see mach_vm_read() or the like.
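As a hedged variant of the one-liner above (not part of the original answer), aggregating on probefunc shows at a glance which libproc entry points vmmap hits and how often:
sudo dtrace -n 'pid$target::proc_*:entry { @[probefunc] = count(); }' -c 'vmmap <some PID>'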

How to monitor Tuxedo via mib

Currently I am trying to write a program to monitor Tuxedo. From the official documents I found that the MIB is suitable for writing a monitoring program. I have read quite a lot of the documentation here: http://docs.oracle.com/cd/E13203_01/tuxedo/tux90/rf5/rf5.htm#998207. Although there are detailed descriptions of every class, there is no guide on how to use it from the beginning. I have tried searching on GitHub, but unfortunately there is no code relating to the Tuxedo MIB. Does anyone have some good sample code?
Thanks a lot.
Here is a shell function that reads the BLOCKTIME from Tuxedo:
get_blocktime() {
    TmpErr=/tmp/ud32err_$$
    rtc=0
    # Ask the .TMIB service for the T_DOMAIN class and keep only the
    # TA_BLOCKTIME value from the output.
    ud32 -Ctpsysadm <<EOF 2>$TmpErr | grep TA_BLOCKTIME | cut -f2
SRVCNM .TMIB
TA_CLASS T_DOMAIN
TA_OPERATION GET
EOF
    # ud32 has no good error handling, so check its stderr ourselves
    if [ -s $TmpErr ]; then
        echo "$PRG: Error calling ud32:"
        cat $TmpErr 1>&2
        rtc=1
    fi
    rm $TmpErr
    exit $rtc
}
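A hypothetical caller (the variable name is illustrative) could use it like this:
blocktime=$(get_blocktime) || echo "could not read TA_BLOCKTIME" >&2
echo "Tuxedo BLOCKTIME: $blocktime"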
There are several examples of accessing the MIB with Python at https://github.com/PacktPublishing/Modernizing-Oracle-Tuxedo-Applications-with-Python/tree/main/Chapter06. For example:
import tuxedo as t

t.tpinit(cltname="tpsysop")
machine = t.tpadmcall(
    {
        "TA_CLASS": "T_MACHINE",
        "TA_OPERATION": "GET",
        "TA_FLAGS": t.MIB_LOCAL,
    }
).data
A couple of notes:
you will need TA_FLAGS set to MIB_LOCAL to get statistics back (this is not done by default)
you might want to use the tpadmcall() function instead of calling the .TMIB service. The function is much lighter on the system and does not increase Tuxedo statistics (number of service calls). The main limitation of tpadmcall() is the limited size of the response, so you will still need to call the .TMIB service for server and queue statistics if your application has tens of them.
If the code example is not enough, you can check the chapter 6 of the book Modernizing Oracle Tuxedo Applications with Python.
I have some C code for calling .TMIB to monitor Tuxedo applications here: https://github.com/TuxSQL/tuxmon
That should get you started.

Getting user-space stack information from perf

I'm currently trying to track down some phantom I/O in a PostgreSQL build I'm testing. It's a multi-process server and it isn't simple to associate disk I/O back to a particular back-end and query.
I thought Linux's perf tool would be ideal for this, but I'm struggling to capture block I/O performance counter metrics and associate them with user-space activity.
It's easy to record block I/O requests and completions with, e.g.:
sudo perf record -g -T -u postgres -e 'block:block_rq_*'
and the user-space PID is recorded, but no kernel or user-space stack is captured, nor is there any way to snapshot bits of the user-space process's heap (say, the query text). So while you have the PID, you don't know what the process was doing at that point. Just perf script output like:
postgres 7462 [002] 301125.113632: block:block_rq_issue: 8,0 W 0 () 208078848 + 1024 [postgres]
If I add the -g flag to perf record it'll take snapshots of the kernel stack, but it doesn't capture user-space state for perf events captured in the kernel. The user-space stack only goes up to the entry point from user space, like LWLockRelease, LWLockAcquire, memcpy (mmap'd IO), __GI___libc_write, etc.
So. Any tips? Being able to capture a snapshot of the user-space stack in response to kernel events would be ideal.
I'm on Fedora 19, 3.11.3-201.fc19.x86_64, Schrödinger’s Cat, with perf version 3.10.9-200.fc19.x86_64.
OK, looks like there are several parts to this:
I'm on x86_64, where most distros build with -fomit-frame-pointer by default, and perf can't follow the stack without frame pointers;
.... unless it's a newer version built with libunwind support, in which case it supports perf record -g dwarf.
See:
the patch adding libunwind support to Perf
Debian bug 725075.
linux perf: how to interpret and find hotspots
I'm on Fedora 18, but the same issue applies. So if you're profiling code you're working on (as is likely on Stack Overflow), rebuild with -fno-omit-frame-pointer and -ggdb.
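For an autoconf-based project such as PostgreSQL, that rebuild might look roughly like this (the flags are a suggestion, not the project's official recommendation):
./configure CFLAGS="-O2 -fno-omit-frame-pointer -ggdb"
make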
I landed up rebuilding perf because I wanted to be able to compare to the stock RPMs:
sudo yum build-dep perf
sudo yum install yum-utils rpmdevtools libunwind-devel
yumdownloader --source perf or download the appropriate kernel-.....src.rpm srpm
rpmdev-setuptree
rpm -Uvh kernel-*.src.rpm
cd $HOME/rpmbuild/SPECS
rpmbuild -bp --target=$(uname -m) kernel.spec
At this point you can just build a new perf if you want:
cd $HOME/rpmbuild/BUILD/kernel-*/linux-*/tools/perf
make
... which I did and tested that the updated perf does in fact capture a useful stack if built with libunwind available.
You can also build a new rpm:
edit kernel.spec, uncomment the line %define buildid ..., change buildid to something like .perfunwind. Note it's %define not % define.
In the same spec file, find:
%global perf_make \
make %{?_smp_mflags} -C tools/perf -s V=1 WERROR=0 NO_LIBUNWIND=1 HAVE_CPLUS_DEMANGLE=1 NO_GTK2=1 NO_LIBNUMA=1 NO_STRLCPY=1 prefix=%{_prefix}
and delete NO_LIBUNWIND=1
rpmbuild -bb --without up --without mp --without pae --without debug --without doc --without headers --without debuginfo --without bootwrapper --without with_vdso_install --with perf kernel.spec to produce new perf RPMs without building the whole kernel. Or if you want, omit the --without for the kernel flavour you want, in which case you'll also want to build headers, debuginfo, etc.
sudo rpm -Uvh $HOME/rpmbuild/RPMS/x86_64/perf-*.fc19.x86_64.rpm
See the fedora project guide on building a custom kernel.
I've reported the issue to Fedora; they shouldn't be using NO_LIBUNWIND=1. See bug 1025603.
Once you have a rebuilt perf you can use perf record -g dwarf to get full stacks.
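Putting the pieces together, a sketch of the original tracepoint capture with DWARF-based user-space unwinding (followed by a dump of the recorded events and stacks) would be:
sudo perf record -g dwarf -T -u postgres -e 'block:block_rq_*'
sudo perf script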

What is the most general way to list all the kernel tasks in a linux system?

I am trying to figure out the best way to write cross-platform kernel code or a shell script to list all the kernel tasks {(pid/tid, name)} on any Linux distribution. It should be as general as possible. I tried to use ps -T, but it seems to be inaccurate and some platforms don't support it in their BusyBox. Any suggestions?
If you want to distinguish user processes from kernel tasks, then this is a previous discussion on the subject: Identifying kernel threads
My answer to that question does not require any tools; it simply reads the contents of /proc/<pid>/stat, so it should work on any distribution.
You could try
ps -e -o pgrp= -o pid= -o cmd= | sed -ne 's/^ *0 *//p'
although it assumes all kernel tasks belong to process group 0.
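If you would rather avoid ps entirely, here is a hedged /proc-only sketch; it assumes kernel tasks are kthreadd (PID 2) or its children, which holds on modern kernels:
#!/bin/sh
# List kernel tasks as "<pid> <name>", assuming they all descend from kthreadd (PID 2).
for d in /proc/[0-9]*; do
    pid=${d#/proc/}
    ppid=$(awk '{print $4}' "$d/stat" 2>/dev/null)
    if [ "$pid" = 2 ] || [ "$ppid" = 2 ]; then
        printf '%s\t%s\n' "$pid" "$(cat "$d/comm" 2>/dev/null)"
    fi
done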
