How can I run this DTrace script to profile my application? - c

I was searching online for something to help me do assembly-level (instruction-level) profiling and found this post: http://www.webservertalk.com/message897404.html
There are two parts to this problem: finding all instructions of a particular type (inc, add, shl, etc.) to determine groupings, and then figuring out which ones are getting executed and summing them correctly. The first bit is tricky unless grouping by the disassembler is sufficient. For figuring out which instructions are being executed, DTrace is of course your friend here (at least in userland).
The nicest way of doing this would be to instrument only the beginning of each basic block; finding these would be a manual process right now. However, instrumenting each instruction is feasible for small applications. Here's an example:
First, our quite trivial C program under test:
#include <unistd.h>

int main(void)
{
    int i;

    for (i = 0; i < 100; i++)
        getpid();

    return 0;
}
Now, our slightly tricky D script:
#pragma D option quiet

pid$target:a.out::entry
/address[probefunc] == 0/
{
    address[probefunc] = uregs[R_PC];
}

pid$target:a.out::
/address[probefunc] != 0/
{
    @a[probefunc, uregs[R_PC] - address[probefunc]] = count();
}

END
{
    printa("%s+%#x:\t%@d\n", @a);
}
Running this against the test program produces output like:
main+0x1: 1
main+0x3: 1
main+0x6: 1
main+0x9: 1
main+0xe: 1
main+0x11: 1
main+0x14: 1
main+0x17: 1
main+0x1a: 1
main+0x1c: 1
main+0x23: 101
main+0x27: 101
main+0x29: 100
main+0x2e: 100
main+0x31: 100
main+0x33: 100
main+0x35: 1
main+0x36: 1
main+0x37: 1
From the example given, this is exactly what I need. However, I have no idea what it is doing, how to save the DTrace program, or how to execute it against the code I want results for. So I opened this question hoping some people with a good DTrace background could help me understand the script, save it, run it, and hopefully get the results shown.

If all you want to do is run this particular DTrace script, simply save it to a .d script file and use a command like the following to run it against your compiled executable:
sudo dtrace -s dtracescript.d -c [Path to executable]
where you replace dtracescript.d with your script file name.
This assumes that you have DTrace as part of your system (I'm running Mac OS X, which has had it since Leopard).
If you're curious about how this works, I wrote a two-part tutorial on using DTrace for MacResearch a while ago, which can be found here and here.

Related

Attaching to a process and calling `dup2` on aarch64?

I tried attaching to a running process with gdb to redirect its stdout to an external file with these commands:
#Attaching
gdb -p 123456
#Redirecting (within GDB)
(gdb) p dup2(open("/tmp/my_stdout", 1089, 0777), 1)
I used the number 1089 because it represents O_WRONLY | O_CREAT | O_APPEND.
First, GDB just complained about a missing return type:
'open64' has unknown return type; cast the call to its declared return type
So I modified my command to
#Redirecting (within GDB)
(gdb) p (int)dup2((int)open("/tmp/my_stdout", 1089, 0777), 1)
This executed successfully, and it works.
I'm trying to figure out how I can write a small utility that does the exact same thing as the above:
attaches to a process by PID
calls this (int)dup2((int)open("/tmp/my_stdout", 1089, 0777), 1)
Part 2 seems easy; however, part 1 doesn't seem to work on aarch64 (I did manage to get it working on arm, though).
There are quite a few tools that try to solve this problem:
reptyr (doesn't work on process started by systemctl)
reredirect (doesn't support aarch64 at all)
injcode (doesn't support 64bit at all)
neercs (for sure no support for aarch64)
retty (for sure no support for aarch64)
If GDB can do it, this is surely possible, but GDB is a huge codebase to analyze, and I'm hoping for a solution that doesn't take weeks or months of digging through GDB's source.
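In my experience the attach itself does work on aarch64 with plain ptrace(2); the architecture-specific part is that PTRACE_GETREGS is not available there, so registers have to be read with PTRACE_GETREGSET and NT_PRSTATUS. Below is a minimal sketch of part 1 only (attach and read registers); the error handling and output are illustrative, and injecting the open()/dup2() calls of part 2 would still require the usual register-save/rewrite/single-step dance:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/uio.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <elf.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid_t pid = (pid_t)atoi(argv[1]);

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        perror("PTRACE_ATTACH");
        return 1;
    }
    waitpid(pid, NULL, 0);            /* wait until the tracee stops */

    struct user_regs_struct regs;     /* aarch64 general-purpose registers */
    struct iovec iov = { .iov_base = &regs, .iov_len = sizeof(regs) };

    /* On aarch64 PTRACE_GETREGS does not exist; use GETREGSET + NT_PRSTATUS. */
    if (ptrace(PTRACE_GETREGSET, pid, (void *)NT_PRSTATUS, &iov) == -1) {
        perror("PTRACE_GETREGSET");
    } else {
        printf("pc = 0x%llx, sp = 0x%llx\n",
               (unsigned long long)regs.pc, (unsigned long long)regs.sp);
    }

    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return 0;
}
From there, part 2 would mean saving these registers, pointing pc at the function to call (or a syscall instruction) with x0, x1 and x2 holding the arguments, resuming until it returns, and then restoring the original state; that is roughly what GDB does behind the scenes when you call a function from its prompt.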

How to monitor Tuxedo via MIB

Currently I am trying to write a program to monitor Tuxedo. From the official documents, I found that the MIB is suitable for writing a monitoring program. I have read quite a lot of the documentation here: http://docs.oracle.com/cd/E13203_01/tuxedo/tux90/rf5/rf5.htm#998207. Although there are detailed descriptions of every class, there is no guide telling me how to use it from the beginning. I have tried searching on GitHub, but unfortunately there is no code relating to the Tuxedo MIB. Does anyone have some good sample code?
Thanks a lot.
Here is a shell function that reads the blocktime from Tuxedo:
get_blocktime() {
    TmpErr=/tmp/ud32err_$$
    rtc=0
    ud32 -Ctpsysadm <<EOF 2>$TmpErr | grep TA_BLOCKTIME | cut -f2
SRVCNM .TMIB
TA_CLASS T_DOMAIN
TA_OPERATION GET
EOF
    # ud32 has no good error-handling
    if [ -s $TmpErr ]; then
        echo "$PRG: Error calling ud32:"
        cat $TmpErr 1>&2
        rtc=1
    fi
    rm $TmpErr
    exit $rtc
}
There are several examples of accessing the MIB with Python at https://github.com/PacktPublishing/Modernizing-Oracle-Tuxedo-Applications-with-Python/tree/main/Chapter06. For example:
import tuxedo as t

t.tpinit(cltname="tpsysop")
machine = t.tpadmcall(
    {
        "TA_CLASS": "T_MACHINE",
        "TA_OPERATION": "GET",
        "TA_FLAGS": t.MIB_LOCAL,
    }
).data
A couple of notes:
you will need TA_FLAGS set to MIB_LOCAL to return statistics (this is not done by default)
you might want to use the tpadmcall() function instead of calling the .TMIB service. The function is much lighter on the system and does not increase the Tuxedo statistics (number of service calls). The main limitation of tpadmcall is the limited size of the response, so you will still need to call the .TMIB service for server and queue statistics if your application has tens of them.
If the code example is not enough, you can check chapter 6 of the book Modernizing Oracle Tuxedo Applications with Python.
I have some C code for calling .TMIB to monitor Tuxedo application here: https://github.com/TuxSQL/tuxmon
That should get you started.
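Since the question asked for C sample code, here is a rough sketch of what a .TMIB call looks like with ATMI and FML32, mirroring the T_DOMAIN/GET request from the shell function above. The unauthenticated tpinit(NULL), the buffer sizes and the plain dump of the reply are simplifications; a real client may need a TPINIT buffer with cltname set to tpsysadm, and all names here are illustrative:
/* Rough sketch: query the T_DOMAIN class of the MIB via the .TMIB service. */
#include <stdio.h>
#include <atmi.h>
#include <fml32.h>
#include <tpadm.h>   /* TA_CLASS, TA_OPERATION, ... field definitions */

int main(void)
{
    FBFR32 *req, *rep;
    long replen;

    if (tpinit(NULL) == -1) {      /* may need a TPINIT buffer with cltname "tpsysadm" */
        fprintf(stderr, "tpinit failed: %s\n", tpstrerror(tperrno));
        return 1;
    }

    req = (FBFR32 *)tpalloc("FML32", NULL, 4096);
    rep = (FBFR32 *)tpalloc("FML32", NULL, 4096);
    if (req == NULL || rep == NULL) {
        fprintf(stderr, "tpalloc failed: %s\n", tpstrerror(tperrno));
        tpterm();
        return 1;
    }

    Fchg32(req, TA_CLASS, 0, "T_DOMAIN", 0);
    Fchg32(req, TA_OPERATION, 0, "GET", 0);

    if (tpcall(".TMIB", (char *)req, 0, (char **)&rep, &replen, 0) == -1) {
        fprintf(stderr, ".TMIB call failed: %s\n", tpstrerror(tperrno));
    } else {
        Ffprint32(rep, stdout);    /* dump all returned attributes, incl. TA_BLOCKTIME */
    }

    tpfree((char *)req);
    tpfree((char *)rep);
    tpterm();
    return 0;
}
Such a client would be built with buildclient (for example, buildclient -o tmibget -f tmibget.c, where the file name is just an example) so that the ATMI and FML32 libraries are linked in.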

C script running as cron giving permission denied error

I have a compiled .c file that I would like to run via a cron job, but I end up getting this error:
/bin/sh: /usr/local/bin/get1Receive.c: Permission denied.
What is causing this error and how do I fix it?
Should I be running the .c file in cron or a different compiled file?
Results from /tmp/myvars
GROUPS=()
HOME=/root
HOSTNAME=capture
HOSTTYPE=x86_64
IFS='
'
LOGNAME=root
MACHTYPE=x86_64-redhat-linux-gnu
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PATH=/usr/bin:/bin
POSIXLY_CORRECT=y
PPID=11086
PS4='+ '
PWD=/root
SHELL=/bin/sh
SHELLOPTS=braceexpand:hashall:interactive-comments:posix
SHLVL=1
TERM=dumb
UID=0
USER=root
_=/bin/sh
Results from file get1Receive.c
file get1Receive.c
get1Receive.c: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
Snippet of code:
sprintf(queryBuf1,"SELECT ipDest, macDest,portDest, sum(totalBits) FROM dataReceive WHERE timeStampID between '%s' And '%s' GROUP BY ipDest, macDest, portDest ",buff1,buff2);
printf("\nQuery receive %s",queryBuf1);
if(mysql_query(localConn, queryBuf1))
{
    //fprintf(stderr, "%s\n", mysql_error(localConn));
    printf("Error in first query of select %s\n", mysql_error(localConn));
    exit(1);
}
localRes1 = mysql_store_result(localConn);
int num_fields = mysql_num_fields(localRes1);
printf("\nNumf of fields : %d",num_fields);
printf("\nNof of row : %lu",mysql_num_rows(localRes1));
If the output of this command:
file get1Receive1.c
shows that file to be a valid executable, that is very unusual, but okay.
Assuming you are using biz14's (or your real username's) crontab, try this:
use the command crontab -e to create this line in your crontab:
* * * * * set > /tmp/myvars
Wait a few minutes, go back into crontab -e and delete that entry.
Use the set command from the command line to see what variables and aliases exist.
Compare that with what you see in /tmp/myvars. You may have to change how your C code executes by changing the variables and aliases the cron job runs with.
If you are running the cron job in someone else's crontab, then you have a bigger problem. Check the file permissions on get1Receive1.c and on the directory it lives in. That other user (the one who owns the crontab) has to have permissions on your directory and on get1Receive1.c so the job can run.
Example crontab entry:
0 10 * * 1-5 /path/to/get1Receive1.c > /tmp/outputfile
Read /tmp/outputfile to see what you got. You are using printf in your code; printf writes to stdout, and under cron there is no controlling terminal, so redirect that output to a file.
Last effort on this problem:
Check return codes on EVERYTHING: all C library functions like fread(), any DB function, etc. If a return code indicates failure (the conventions differ between function calls), then report the error number, the line number, and the function; gcc provides __LINE__ and __func__. Example:
printf("error on line %d in my code %s, error message =%s\n", __LINE__, __func__, [string of error message]);
If you do not check return codes you are writing very poor C code.
CHECK return codes, please, now!
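To make that concrete, here is a sketch of how the tail of the snippet above might look with every return code checked and the messages sent to a log file instead of stdout. It reuses the queryBuf1, localConn and localRes1 variables from the original program, and the log path is just an example:
/* Sketch: the query snippet with return codes checked on every call.
 * Under cron, log to a file instead of relying on stdout. */
FILE *logfp = fopen("/tmp/get1Receive.log", "a");   /* example path */
if (logfp == NULL)
    exit(1);

if (mysql_query(localConn, queryBuf1) != 0) {
    fprintf(logfp, "error on line %d in %s: %s\n",
            __LINE__, __func__, mysql_error(localConn));
    exit(1);
}

localRes1 = mysql_store_result(localConn);
if (localRes1 == NULL) {                             /* store_result can fail too */
    fprintf(logfp, "error on line %d in %s: %s\n",
            __LINE__, __func__, mysql_error(localConn));
    exit(1);
}

fprintf(logfp, "fields: %u, rows: %llu\n",
        mysql_num_fields(localRes1),
        (unsigned long long)mysql_num_rows(localRes1));
fclose(logfp);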
Permission-wise you could have two issues:
1. The 'c' file's permissions don't allow whoever you are running it as to execute it.
2. You are running the cron job with a script which doesn't have permissions.
Here's a helpful post: How to give permission for the cron job file?
The fact that you are running a 'c' file and referring to it as a script makes me think you're using C shell, and not writing a C language program, which would need to be compiled so that the generated executable is run by cron. If you're not using gcc, or have never called gcc on your 'C' script, then it's not C; call it C shell to avoid confusion.

Running OpenMP on a single node of a cluster

I am able to run simple for loops with OpenMP on my desktop/laptop, of the form (a mild simplification of what I actually have):
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>
/* ... #include other libraries ... */

int main(void){

    /* ... declare and initialize variables ... */

    #pragma omp parallel for collapse(3) shared(tf, p, Fx, Fy, Fz) private(v, i, j, k, t0)
    for (i = 0; i < Nx; i++){
        for (j = 0; j < Ny; j++){
            for (k = 0; k < Nz; k++){
                v[0] = Fx[i][j][k];
                v[1] = Fy[i][j][k];
                v[2] = Fz[i][j][k];
                /* My_fn changes v and then I put it back into Fx, Fy, Fz */
                My_fn(v, t0, tf, p);
                Fx[i][j][k] = v[0];
                Fy[i][j][k] = v[1];
                Fz[i][j][k] = v[2];
            }
        }
    }
}
If I want, I can even specify using n_threads = 1, 2, 3 or 4 cores on my laptop by adding omp_set_num_threads(n_threads); at the top, and I see the performance scaling I expect. However, when using a cluster, I comment that line out.
I have access to a cluster and would like to run the code on a single node since the cluster has nodes with up to 48 cores and my laptop only 4. When I use the cluster, after compiling, I type into the terminal
$export OMP_NUM_THREADS=10
$bsub -n 10 ./a.out
But the program does not run properly: I write output to a file and see that it took 0 seconds to run, and the values of Fx, Fy and Fz are what they are when I initialize them, so it seems the loop is not even run at all.
Edit: This issue was addressed by the people who managed the cluster, and is likely very specific to that cluster, hence I caution people to relate the issue to their specific case.
Looks to me like this question has nothing to do with programming but rather with using the batch system (a.k.a. distributed resource manager) on your cluster. The usual practice is to write a script instead and, inside the script, set OMP_NUM_THREADS to the number of slots granted. Your batch system appears to be LSF (a wild guess, based on the presence of bsub), so you'd most likely want something similar to the following in the script (let's call it job.sh):
#BSUB -n 10
export OMP_NUM_THREADS=$LSB_DJOB_NUMPROC
./a.out
Then submit the script with bsub < job.sh. LSF exports the number of slots granted to the job in the LSB_DJOB_NUMPROC environment variable. By doing the assignment you may submit the same job file with different parameters like: bsub -n 20 < job.sh. You might need to give a hint to the scheduler that you'd like to have all slots on the same node. One can usually do that by specifying -R "span[ptile=n]". There might be other means to do that, e.g. an esub executable that you might need to specify:
#BSUB -a openmp
Please, note that Stack Overflow is not where your administrators store the cluster documentation. You'd better ask them, not us.
I am not sure that I understand correctly what you are up to, but I fear that your idea is that OpenMP would automatically run your application in a distributed way on a cluster.
OpenMP is not made for such a task; it assumes that you run your code in a shared-memory setting. For a distributed setting (processors connected only through a network link) there are other tools, namely MPI. But such a setting is a bit more complicated to set up than just the #pragma annotations that you are used to with OpenMP.
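Independent of the batch-system details above, a quick sanity check is to print the team size from inside a parallel region of a trivial program; if OMP_NUM_THREADS never reaches your job's environment, this reports 1 thread. A minimal sketch, nothing cluster-specific:
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* Each thread joins the team; the master prints the team size once. */
    #pragma omp parallel
    {
        #pragma omp master
        printf("running with %d threads\n", omp_get_num_threads());
    }
    return 0;
}
If this prints 10 when run interactively but 1 inside the job, the export is not reaching the job environment, which points back to setting OMP_NUM_THREADS inside the job script as described above.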
Hristo is right, but I think you should also add
#BSUB -R "span[hosts=1]" # run on a single node
in your .sh file. The ptile option only specifies the number of tasks per node; see e.g.
https://doc.zih.tu-dresden.de/hpc-wiki/bin/view/Compendium/PlatformLSF
Otherwise, depending on the queue settings of the cluster, which you can check with
bqueues -l
the tasks could be run on every node that is available to you.
If the node has 24 cores, on my system I would use
#PBS -l nodes=1:ppn=24
Probably on the cluster you use it will be something like
#BSUB -l nodes=1:ppn=24

Line Number Info in ltrace and strace tools

Is it possible to view the line number and file name (for my program running under ltrace/strace) along with the library call/system call information?
Eg:
code section :: ptr = malloc(sizeof(int)*5); (file:code.c, line:21)
ltrace or any other tool: malloc(20) :: code.c::21
I have tried all the options of ltrace/strace but cannot figure out a way to get this info.
If it's not possible through ltrace/strace, is there any comparable tool for GNU/Linux?
You may be able to use the -i option (to output the instruction pointer at the time of the call) in strace and ltrace, combined with addr2line to resolve the calls to lines of code.
No, it's not possible. Why don't you use gdb for this purpose?
When compiling your application with gcc, use the -ggdb flag to include debugging info in your program, and then run it under gdb or an equivalent frontend (ddd or similar).
Here is quick gdb manual to help you out a bit.
http://www.cs.cmu.edu/~gilpin/tutorial/
You can use strace-plus, which can collect stack traces associated with each system call.
http://code.google.com/p/strace-plus/
Pretty old question, but I found a way to accomplish what the OP wanted:
First, use strace with the -k option, which will generate a stack trace like this:
openat(AT_FDCWD, NULL, O_RDONLY) = -1 EFAULT (Bad address)
> /usr/lib/libc-2.33.so(__open64+0x5b) [0xefeab]
> /usr/lib/libc-2.33.so(_IO_file_open+0x26) [0x816f6]
> /usr/lib/libc-2.33.so(_IO_file_fopen+0x10a) [0x818ca]
> /usr/lib/libc-2.33.so(__fopen_internal+0x7d) [0x7527d]
> /mnt/r/build/tests/main(main+0x90) [0x1330]
> /usr/lib/libc-2.33.so(__libc_start_main+0xd5) [0x27b25]
> /mnt/r/build/tests/main(_start+0x2e) [0x114e]
The address of each function call is displayed at the end of each line, and you can pass it to addr2line to retrieve the file and line. For example, say we want to locate the call in main() (fifth line of the stack trace):
addr2line -e tests/main 0x1330
It will show something like this:
/mnt/r/main.c:55
