Plots in JMH [Java Micro-Benchmarking Harness] - benchmarking

I have been reading about JMH. But I couldn't find a way to generate plots using this. Does JMH support plotting? Or are there third party libraries for this purpose?

JMH does not support plotting. You can write the performance results to a file (e.g. with -rf csv or -rf json) and use whatever plotting tool you are familiar with. Or you can extract the performance data from the RunResult instances you get from the Java API, and parse/render them with any embedded library.
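As a rough illustration of the "write a file, plot it elsewhere" route, here is a minimal Python sketch. It assumes the CSV written by -rf csv has the columns "Benchmark", "Score", "Score Error (99.9%)" and "Unit" and uses a dot as decimal separator; check that against your own file. The benchmark names and numbers in SAMPLE are made up for the example, and plot_scores is a hypothetical helper that needs matplotlib installed.

```python
import csv
import io

# Made-up sample in the shape assumed for `-rf csv` output (names/values are hypothetical).
SAMPLE = '''"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"org.example.MyBench.readerA","ss",1,500,28.487,1.250,"s/op"
"org.example.MyBench.writer","ss",1,500,224.735,9.500,"s/op"
'''


def read_jmh_csv(text):
    """Return a list of (benchmark, score, error, unit) tuples from JMH CSV text."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append((
            record["Benchmark"],
            float(record["Score"]),
            float(record["Score Error (99.9%)"]),
            record["Unit"],
        ))
    return rows


def plot_scores(rows, out_path="jmh.png"):
    """Draw a bar chart with error bars; requires matplotlib (imported lazily)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    names = [name.rsplit(".", 1)[-1] for name, _, _, _ in rows]
    plt.bar(names, [score for _, score, _, _ in rows],
            yerr=[error for _, _, error, _ in rows])
    plt.ylabel(rows[0][3])
    plt.savefig(out_path)


ROWS = read_jmh_csv(SAMPLE)
```

To use it on a real run, read the file produced by JMH, pass its text to read_jmh_csv, and hand the result to plot_scores.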
If you are measuring single-call execution times (or batches of calls) and want a plot of each individual duration, you can combine these settings:
@Measurement(batchSize = 1000, iterations = 500)
@BenchmarkMode({Mode.SingleShotTime})
plus a bit of scripting to get the desired data as CSV. With settings like these, the CSV that JMH itself writes contains only summary data.
mvn package && java -jar target/benchmarks.jar -foe true -rf csv | tee output.txt
N=5 # set here number of benchmarks plus 2
grep Iteration -A 3 output.txt | grep -v Warmup | sed 's/$/,/' | xargs -l$N | sed 's/,$//'
It will output something like:
Iteration 1: 93.915 ±(99.9%) 2066.879 s/op, readerA: 28.487 s/op, readerB: 28.525 s/op, writer: 224.735 s/op, --
Iteration 2: 100.483 ±(99.9%) 1265.993 s/op, readerA: 59.927 s/op, readerB: 60.912 s/op, writer: 180.610 s/op, --
Iteration 3: 76.458 ±(99.9%) 760.395 s/op, readerA: 52.513 s/op, readerB: 52.276 s/op, writer: 124.586 s/op, --
Iteration 4: 84.046 ±(99.9%) 1189.029 s/op, readerA: 46.112 s/op, readerB: 46.724 s/op, writer: 159.303 s/op, --
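If you would rather end up with real CSV columns than the joined lines above, a small parser is enough. This is only a sketch that assumes exactly the line shape shown above (an aggregate score followed by "name: value s/op" parts); parse_iteration and the "score" key are names made up for the example.

```python
import csv
import io
import re

HEAD = re.compile(r"Iteration\s+(\d+):\s+([\d.]+)")
PART = re.compile(r"(\w+):\s+([\d.]+)\s+s/op")


def parse_iteration(line):
    """Turn one joined 'Iteration N: ...' line into a dict, or None if it does not match."""
    head = HEAD.match(line)
    if head is None:
        return None
    row = {"iteration": int(head.group(1)), "score": float(head.group(2))}
    # The per-thread-group values follow as 'name: value s/op'.
    for name, value in PART.findall(line[head.end():]):
        row[name] = float(value)
    return row


LINES = [
    "Iteration 1: 93.915 ±(99.9%) 2066.879 s/op, readerA: 28.487 s/op, readerB: 28.525 s/op, writer: 224.735 s/op, --",
    "Iteration 2: 100.483 ±(99.9%) 1265.993 s/op, readerA: 59.927 s/op, readerB: 60.912 s/op, writer: 180.610 s/op, --",
]

ROWS = [row for row in map(parse_iteration, LINES) if row is not None]

# Write the rows as CSV text (to a string here; use a file object in practice).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(ROWS[0]))
writer.writeheader()
writer.writerows(ROWS)
CSV_TEXT = buffer.getvalue()
```

Each row then has one column per thread group, which most plotting tools can read directly.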


Gnuplot: Iterate over folders

I have different folders with datasets called e.g.
3-1-1
3-1-2
3-2-1
3-2-2
the first placeholder is fixed, the second and third are elements of a list:
k1values = "1 2"
k2values = "1 2"
I want to do easy operations in my Gnuplot script e.g. cd to the above directories and read a line of a textfile. First, it shall cd to the folder, read a file and cd back again etc.
My first (1) idea was to connect system command and sprintf:
do for[i=1:words(k1values)]{
do for[j=1:words(k2values)]{
system sprintf("cd 3-%d-%d", i, j)
system 'pwd'
system 'cd ..'
}
}
With that, the same path is printed every time, so no cd is happening at all.
I also tried system 'cd sprintf("3-%d-%d", i, j)'
Unfortunately, this is not working.
Error message: sh: 1: Syntax error: "(" unexpected
I also tried concatenating the values into a string and using that as the path. This doesn't work either:
k1values = "1 2"
k2values = "1 2"
string1 = '3'
do for[i=1:words(k1values)]{
do for[j=1:words(k2values)]{
path = sprintf("%s-%d-%d", string1, i, j)
system sprintf("cd %s", path)
system 'pwd'
system 'cd ..'
}
}
I print the path for testing, but the operating path is not being changed at all.
Thanks in advance!
Edit: The idea in a given pseudo code is like this:
do for k1
do for k2
valueX = <readingCommand>
make dir "3-k1-k2/Pictures"
for int i = 0; i<valueX; i++
set output bla
plot "3-k1-k2/Data/i.txt" <options>
end for
end do for
end do for
Unless there is a reason we don't know about yet, why do you want to change back and forth into the subdirectories?
Why not create your path/filename via a function, load the desired file, and plot the desired lines?
For example, if you have the following directory structure:
CurrentFolder
3-1-1
Data.dat
3-1-2
Data.dat
3-2-1
Data.dat
3-2-2
Data.dat
and the following files:
3-1-1/Data.dat
1 1.14
2 1.15
3 1.12
4 1.11
5 1.13
3-1-2/Data.dat
1 1.24
2 1.25
3 1.22
4 1.21
5 1.23
3-2-1/Data.dat
1 2.14
2 2.15
3 2.12
4 2.11
5 2.13
3-2-2/Data.dat
1 2.24
2 2.25
3 2.22
4 2.21
5 2.23
The following example loads all the files Data.dat from the corresponding subdirectories and plots the lines 2 to 4 (the lines have 0-based index, check help every).
Script:
### plot specific lines from files from different directories
reset session
k1values = "1 2"
k2values = "1 2"
string1 = '3'
myPath(i,j) = sprintf("%s-%s-%s",string1,word(k1values,i),word(k2values,j))
myFile(i,j) = sprintf("%s/%s",myPath(i,j),"Data.dat")
set key out
plot for [i=1:words(k1values)] for[j=1:words(k2values)] myFile(i,j) \
u 1:2 every ::1::3 w lp pt 7 ti myPath(i,j)
### end of script
Result:
This is my final solution:
k1values = '0.5 1'
k2values = '0.5 1'
omega = 3
do for[i in k1values]{
do for[j in k2values]{
savingPoint = system('head -n 1 "3-'.i.'-'.j.'/<fileName>.dat" | tail -1')
number = savingPoint/<value>
do for[m = savingPoint:0:-<value>]{
set title <...>
set output <...>
plot ''.omega.'-'.i.'-'.j.'/Data/'.m.'.txt' <...>
}
}
}
<...> is a placeholder and irrelevant.
So this is how I finally iterate over the folders.
Within the second for loop, a reading command is executed and its result assigned to a variable which is needed in the third for loop. Note that i and j are strings here, but that does not matter.
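For comparison, the same idea (build each path from the two lists instead of changing directories) can be sketched outside gnuplot. This Python version is only an illustration; dataset_dirs, data_file and select_lines are names made up for the example, and the folder layout is the one from the answer above.

```python
def dataset_dirs(prefix, k1values, k2values):
    """Build folder names like '3-1-2' from two space-separated value lists."""
    return [f"{prefix}-{k1}-{k2}"
            for k1 in k1values.split()
            for k2 in k2values.split()]


def data_file(folder, name="Data.dat"):
    """Path of the data file inside one dataset folder."""
    return f"{folder}/{name}"


def select_lines(lines, first, last):
    """Keep lines first..last (0-based, inclusive), like gnuplot's 'every ::first::last'."""
    return [line.rstrip("\n")
            for index, line in enumerate(lines)
            if first <= index <= last]


DIRS = dataset_dirs("3", "1 2", "1 2")
FILES = [data_file(folder) for folder in DIRS]
```

With real files you would open each entry of FILES and pass the file object to select_lines, without ever leaving the current directory.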

SAS: set statement point = _N_

I'm trying to understand a friend's code to see if I can find some inspiration for my dissertation. He has a data step that creates one dataset from 3 input datasets. What I don't understand is that he uses 3 set statements, and the last two use point = _N_.
What is the use of the following code?
data Other;
set One;
set Two point = _N_;
set Three point = _N_;
array Rating[*] Unrated;
array Amortising[*] '1'n;
array Rating_old[*] old_Unrated;
AM = 0;
do i = 1 to dim(Rating);
Rating[i] = Rating[i] + Rating_old[i] * Amortising[i];
end;
run;
The input datasets look like this
data one;
input Segment count weight ;
datalines;
1 0 0.1
99 1 0.2
;
run;
data two;
input block $ type '0'n '1'n '99'n;
datalines;
50 A 100% 10% 0%
50 S 100% 10% 0%
51 S 100% 10% 0%
52 S 100% 10% 0%
132 S 100% 12% 0%
;
run;
data three;
input DPD $ block type $ segment count weight;
datalines;
AM 50 S 1 0 0.1
Unrated 51 S 99 0.2
NPE 132 S 1 0.5
;
run;
Just looking to see what the point = _N_ would be used for!
In this program it does nothing. The program would run exactly the same without the point= option on the last two set statements.
The POINT= option lets you access observations directly. The _N_ automatic variable is incremented once for each iteration of the data step. So on the first iteration the step will read the first observation from each of the three inputs, which is exactly what would happen without the point= option.
Note that this program will stop when the first SET statement reads past the end of the file. Without the POINT= option it would stop when ANY of the three set statements attempted to read past the end of its input file. You could do the same and avoid the ERRORs in the SAS log by using and testing the NOBS= option.
set One;
if _n_ <= nobs2 then set Two nobs=nobs2;
if _n_ <= nobs3 then set Three nobs=nobs3;
Given the datasets shown, it doesn't do anything.
However, if the ONE dataset had more rows than one or both of the other two datasets, it would avoid the data step stopping when it ran out of rows from the shortest dataset. For example, run this:
data Other;
set Two;
set One point = _N_;
set Three point = _N_;
array Rating[*] Unrated;
array Amortising[*] '1'n;
array Rating_old[*] old_Unrated;
AM = 0;
do i = 1 to dim(Rating);
Rating[i] = Rating[i] + Rating_old[i] * Amortising[i];
end;
run;
Just swapping TWO and ONE. Now you get 5 rows, while if you took off the point=_n_, you'd only get two still. So the program is likely being written to ensure all of ONE's rows are represented (similar to a left join in SQL except you're not joining to anything here). This would probably be more clearly written as a merge, even without a by statement if it's just a one-to-one merge. Usually, though, there's a valid merge key to merge on.
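To make the row counts concrete, here is a toy Python model of the stopping rule described in the answers above. It is not SAS and it only counts output rows; it ignores what values the POINT= reads deliver once _N_ runs past the end of a dataset. simulate_rows is a name made up for the example, and the sizes 2, 5 and 3 are the row counts of ONE, TWO and THREE from the question.

```python
def simulate_rows(sizes, point_on_rest):
    """Count output rows of a data step that has one SET statement per input.

    sizes[0] belongs to the first SET, which always reads sequentially.
    If point_on_rest is True, the other SETs use POINT=_N_ and so never
    trigger the end-of-file stop.
    """
    rows_written = 0
    n = 1  # plays the role of _N_
    while True:
        for index, size in enumerate(sizes):
            reads_sequentially = index == 0 or not point_on_rest
            if reads_sequentially and n > size:
                return rows_written  # a sequential SET hit end of file
        rows_written += 1  # implicit OUTPUT at the end of the iteration
        n += 1


# ONE has 2 rows, TWO has 5, THREE has 3.
ORIGINAL_ORDER = simulate_rows([2, 5, 3], point_on_rest=True)
SWAPPED_WITH_POINT = simulate_rows([5, 2, 3], point_on_rest=True)
SWAPPED_WITHOUT_POINT = simulate_rows([5, 2, 3], point_on_rest=False)
```

In this model the original order gives 2 rows either way, while the swapped order gives 5 rows with POINT= and 2 without, matching the behaviour described above.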

Filter rows of files conditional on multiple arrays values

I have a number of files (N>1000) with QTL summary data. For example, let's assume the first file is made of six lines (in reality they are all GWAS/imputed files with >10M SNPs):
cat QTL.1.txt
Chr Rs BP beta se pvalue
11 rs11224233 134945522 0.150216 0.736939 0.962375
11 rs4616056 134945709 0.129518 0.371824 0.910326
11 rs11823417 134945710 0.103462 0.41737 0.845826
11 rs80294507 134945765 0.150336 0.735363 0.961403
11 rs61907173 134946034 0.104531 0.158224 0.884548
11 rs147621717 134946277 0.105365 0.196168 0.86476
I would like to filter each of these datasets based on the chromosome and positions of a list of genes (my list has 100 genes, but for now let's assume it has 2), therefore creating N_QTL*N_Genes files. I would like to go through each gene/position for each QTL. The chromosome, start position, end position and name of each gene are stored in four arrays, and I would like to iterate over these arrays and save the output for each QTL file for each gene.
What I have done so far doesn't work, and I know awk may not be the best way to do this:
declare -a array1
declare -a array2
declare -a array3
declare -a array4
array1=(11 11) #chromosome
array2=(134945709 134945765) #start gene position
array3=(134946034 134946277) #end gene position
array4=(A B) # gene name
for qtl in 1; do # in reality it would be for qtl in 1 1000
for ((i=0; i<${#array1[@]}; i++)); do
cat QTL.$qtl.txt | awk '$1=='array1[$i]' && $3>='array2[$i]' &&
$3<='array3[$i]' {print$0}' > Gene.${array4[$i]}_QTL.$qtl.txt;
done;
done
Within awk, $1 is the chromosome and $3 the position, so the filtering is based on these.
So my expected output for QTL.1.txt for Gene A would be
cat Gene.A_QTL.1.txt
Chr Rs BP beta se pvalue
11 rs4616056 134945709 0.129518 0.371824 0.910326
11 rs11823417 134945710 0.103462 0.41737 0.845826
11 rs80294507 134945765 0.150336 0.735363 0.961403
11 rs61907173 134946034 0.104531 0.158224 0.884548
And for QTL.1.txt for Gene B would be
cat Gene.B_QTL.1.txt
Chr Rs BP beta se pvalue
11 rs80294507 134945765 0.150336 0.735363 0.961403
11 rs61907173 134946034 0.104531 0.158224 0.884548
11 rs147621717 134946277 0.105365 0.196168 0.86476
I end up with empty files, probably because the way I ask for these columns to be filtered on the array values doesn't work.
Any help very much appreciated!
Thank you in advance
Mixing bash and awk for parsing files is not always the best way forward.
Here is a solution with awk only.
Assume you have the information assigned to your bash arrays in a file:
$ cat info
11 134945765 154945765 Gene1
12 134945522 174945522 Gene2
You could use the following awk script to perform a lookup with the data file:
awk 'NR==FNR{
for(i=2;i<=NF;i++)
a[$1,i]=$i
next
}
a[$1,2]<=$3 && a[$1,3]>=$3{
print $0 > a[$1,4]"_QTL"
}' info QTL.1.txt
This will create a file with the following content:
$ cat Gene1_QTL
11 rs80294507 134945765 0.150336 0.735363 0.961403
11 rs61907173 134946034 0.104531 0.158224 0.884548
11 rs147621717 134946277 0.105365 0.196168 0.86476
Maybe not exactly what you're looking for, but I hope this is helpful...
You might want to do the following if multiple genes are located in the same chromosome (using gene name instead of chr as Key):
awk 'NR==FNR{
chr[$4]=$1;
start[$4]=$2;
end[$4]=$3;
}
NR!=FNR{
for (var in chr){
name=var"_"FILENAME;
if(chr[var]==$1 && start[var] <=$3 && end[var]>=$3){
print $0 > name;
}
}
}' info QTL
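If awk is not a requirement, the same per-gene range filter can be sketched in Python. This is only an illustration: filter_by_gene is a name made up for the example, the gene list is the one from the question's arrays, and the rows are the six sample lines of QTL.1.txt.

```python
QTL_TEXT = """Chr Rs BP beta se pvalue
11 rs11224233 134945522 0.150216 0.736939 0.962375
11 rs4616056 134945709 0.129518 0.371824 0.910326
11 rs11823417 134945710 0.103462 0.41737 0.845826
11 rs80294507 134945765 0.150336 0.735363 0.961403
11 rs61907173 134946034 0.104531 0.158224 0.884548
11 rs147621717 134946277 0.105365 0.196168 0.86476"""

# (chromosome, start, end, gene name), taken from array1..array4 in the question.
GENES = [("11", 134945709, 134946034, "A"),
         ("11", 134945765, 134946277, "B")]


def filter_by_gene(qtl_lines, genes):
    """Return {gene name: [header, matching rows...]} for one QTL file."""
    header, *rows = qtl_lines
    result = {name: [header] for _, _, _, name in genes}
    for row in rows:
        fields = row.split()
        chrom, position = fields[0], int(fields[2])
        for gene_chrom, start, end, name in genes:
            # Keep the row if it lies on the gene's chromosome and inside its range.
            if chrom == gene_chrom and start <= position <= end:
                result[name].append(row)
    return result


RESULT = filter_by_gene(QTL_TEXT.splitlines(), GENES)
```

For the real data you would loop over the QTL files, read each one line by line, and write RESULT[name] to a file named like Gene.<name>_QTL.<n>.txt. With >10M SNPs per file, reading each file once and testing all genes per row (as here and in the awk answer) avoids rescanning the file for every gene.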

Elixir/Erlang: How to find the source of high CPU usage?

My Elixir app is using about 50% of the CPU, but it really should only be using <1%. I'm trying to figure out what is causing the high CPU usage and I'm having some trouble.
In a remote console, I tried
Listing all processes with Process.list
Looking at the process info with Process.info
Sorting the processes by reduction count
Sorting the processes by message queue length
The message queues are all close to 0, but the reduction counts are very high for some processes. The processes with high reduction counts are named
:file_server_2
ReactPhoenix.ReactIo.Pool
:code_server
(1) and (3) are both present in my other apps, so I feel like it must be (2). This is where I'm stuck. How can I go further and figure out why (2) is using so much CPU?
I know that ReactPhoenix uses react-stdio. Looking at top, react-stdio doesn't use any resources, but the beam does.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 87 53.2 1.2 2822012 99212 ? Sl Nov20 580:03 /app/erts-9.1/bin/beam.smp -Bd -- -root /app -progname app/releases/0.0.1/hello.sh -- -home /root -- -noshell -noshell -noinput -boot /app/
root 13873 0.0 0.0 4460 792 ? Rs 13:54 0:00 /bin/sh -c deps/react_phoenix/node_modules/.bin/react-stdio
I saw in this StackOverflow post that stdin can cause resource issues, but I'm unsure if that applies here. Anyway, any help would be greatly appreciated!
Did you try etop?
iex(2)> :etop.start
========================================================================================
nonode@nohost 14:57:45
Load: cpu 0 Memory: total 26754 binary 143
procs 51 processes 8462 code 7201
runq 0 atom 292 ets 392
Pid Name or Initial Func Time Reds Memory MsgQ Current Function
----------------------------------------------------------------------------------------
<0.6.0> erl_prim_loader '-' 458002 109280 0 erl_prim_loader:loop
<0.38.0> code_server '-' 130576 196984 0 code_server:loop/1
<0.33.0> application_controll '-' 58731 831632 0 gen_server:loop/7
<0.88.0> etop_server '-' 58723 109472 0 etop:data_handler/2
<0.53.0> group:server/3 '-' 19364 2917928 0 group:server_loop/3
<0.61.0> disk_log:init/2 '-' 16246 318352 0 disk_log:loop/1
<0.46.0> file_server_2 '-' 3838 18752 0 gen_server:loop/7
<0.51.0> user_drv '-' 3720 13832 0 user_drv:server_loop
<0.0.0> init '-' 2559 34440 0 init:loop/1
<0.37.0> kernel_sup '-' 2093 58600 0 gen_server:loop/7
========================================================================================
http://erlang.org/doc/man/etop.html

How are the control points connected in a NURBS surface?

I'm trying to learn how to deal with NURBS surfaces for a project. Basically I want to build a geometry in some 3D program with NURBS, then export the geometry, and run some simulations with it. I have figured out the NURBS curve, and I do think I mostly understand how surfaces work, but what I don't get is how the control points are connected. Apparently you don't need any topology matrix as with polygons? When I export NURBS surfaces from Maya in the .ma file format, which is a plain-text file, I can see the knot vectors, and then just a list of points. No topology information. How does this work? How can you reconstruct the NURBS surface without knowing how the points are connected to each other? The exported file is shown below:
//Maya ASCII 2013 scene
//Name: test4.ma
//Last modified: Sat, Jan 26, 2013 07:21:36 PM
//Codeset: UTF-8
requires maya "2013";
requires "stereoCamera" "10.0";
currentUnit -l centimeter -a degree -t film;
fileInfo "application" "maya";
fileInfo "product" "Maya 2013";
fileInfo "version" "2013 x64";
fileInfo "cutIdentifier" "201207040330-835994";
fileInfo "osv" "Mac OS X 10.8.2";
fileInfo "license" "student";
createNode transform -n "loftedSurface1";
setAttr ".t" -type "double3" -0.68884794895562784 0 -3.8172687581953233 ;
createNode nurbsSurface -n "loftedSurfaceShape1" -p "loftedSurface1";
setAttr -k off ".v";
setAttr ".vir" yes;
setAttr ".vif" yes;
setAttr ".covm[0]" 0 1 1;
setAttr ".cdvm[0]" 0 1 1;
setAttr ".dvu" 0;
setAttr ".dvv" 0;
setAttr ".cpr" 4;
setAttr ".cps" 4;
setAttr ".cc" -type "nurbsSurface"
3 3 0 0 no
8 0 0 0 1 2 3 3 3
11 0 0 0 1 2 3 4 5 6 6 6
54
0.032814107781307778 -0.01084889661073064 -2.5450696958149557
0.032814107781308312 -0.010848896610730773 -1.6967131305433036
0.032824475105651972 -0.010848896610730714 -0.0016892641735144487
0.032777822146102309 -0.01084889661073018 2.5509821204222565
0.032948882997777158 -0.010848896610730326 5.3256822304677218
0.032311292550627417 -0.010848896610730283 7.5033561343333179
0.034690593487551526 -0.010848896610730296 11.39484483093603
0.014785648001686571 -0.010848896610730293 11.972583607988943
-0.00012526283089935193 -0.010848896610730293 12.513351622510489
0.87607723187763198 -0.023973071493875439 -2.5450696958149557
0.87607723187766595 -0.023973071493876091 -1.6967131305433036
0.87636198619878247 -0.023973071493875821 0.00026157734839016289
0.87508059175355446 -0.023973071493873142 2.5441541750955903
0.87977903805225144 -0.023973071493873861 5.3510431702524812
0.86226664730269065 -0.02397307149387367 7.4087403205209448
0.9276177640022375 -0.023973071493873725 11.747947146400762
0.39164345444212556 -0.023973071493873704 12.72679599298271
-0.003344290659457324 -0.023973071493873708 13.356608602511475
2.7585407036097025 0.080696275184513055 -2.5450696958149557
2.7979735813230628 0.036005680442686323 -1.6988092981025378
2.7828331201271896 0.05438167150027777 0.0049374879309111996
2.6143679292284574 0.23983328019207673 2.5309327393956176
2.67593270347135 0.19013709747074492 5.3992530024698517
2.5981387973985108 0.20347021966427298 7.2291224273514345
2.8477496474469728 0.19983391361149261 12.418208886861429
1.1034136098865515 0.20064198162322153 14.474560637904968
-0.010126299867110311 0.20064198162322155 15.133224682698101
4.5214126649737496 0.45953483463333544 -2.5450696958149557
4.6561826938778452 0.23941045408996731 -1.7369291398229287
4.6267725925384751 0.29043329565744253 0.025561242784985394
3.9504978751410711 1.3815767918640129 2.5159293599869446
4.1596851721552888 1.0891788615080038 5.438642765250469
3.9992107014958198 1.1676270867254697 7.0865667556376426
4.4319212871194775 1.1462321162116154 12.949041810935984
1.6384310220676352 1.1509865541035829 15.927795222282771
-0.015643773215464073 1.1509865541035829 16.578582772395933
5.2193823159440154 3.0233786192453191 -2.5450696958149557
5.2193823159440162 3.0233786192453196 -1.6967131305433036
5.2218229691816047 3.0233786192453191 0.0091618497226043649
5.2108400296124504 3.0233786192453196 2.5130032217858407
5.251110808032692 3.0233786192453191 5.4667467111172652
5.1010106339208772 3.0233786192453191 6.9770771103715621
5.6611405519478906 3.0233786192453205 13.358896446133507
2.0430537629341199 3.0233786192453183 17.059047057656215
-0.019924192630756767 3.0233786192453191 17.6998820408444
5.1365144716134976 5.4897102753589557 -2.5450696958149557
5.1365144716134994 5.4897102753589566 -1.6967131305433036
5.1389093836131625 5.4897102753589566 0.0089946049919694682
5.1281322796146718 5.4897102753589566 2.5135885783430627
5.1676483276091361 5.4897102753589548 5.4645725296190131
5.0203612396297714 5.4897102753589566 6.9851884798073476
5.5699935435527692 5.4897102753589566 13.328625149888618
2.0133428487217855 5.4897102753589557 16.975388787391935
-0.01960785732642523 5.4897102753589557 17.617014800296868
;
select -ne :time1;
setAttr ".o" 1;
setAttr ".unw" 1;
select -ne :renderPartition;
setAttr -s 2 ".st";
select -ne :initialShadingGroup;
setAttr ".ro" yes;
select -ne :initialParticleSE;
setAttr ".ro" yes;
select -ne :defaultShaderList1;
setAttr -s 2 ".s";
select -ne :postProcessList1;
setAttr -s 2 ".p";
select -ne :defaultRenderingList1;
select -ne :renderGlobalsList1;
select -ne :hardwareRenderGlobals;
setAttr ".ctrs" 256;
setAttr ".btrs" 512;
select -ne :defaultHardwareRenderGlobals;
setAttr ".fn" -type "string" "im";
setAttr ".res" -type "string" "ntsc_4d 646 485 1.333";
select -ne :ikSystem;
setAttr -s 4 ".sol";
connectAttr "loftedSurfaceShape1.iog" ":initialShadingGroup.dsm" -na;
// End of test4.ma
A NURBS surface is always topologically square, with degree+spans points in the u direction and (degree-1)+spans+1* points in the v direction. (A single NURBS surface is like one face of a polygon, only more complicated.)
The first 2 attributes in ".cc" are the degrees in the two directions, and the next two lines define the knots, where each individual value represents a span boundary. Duplicates are just multiplicities, so the value is repeated x times. So:
8 0 0 0 1 2 3 3 3
means there are 8 knots (in this case in the U direction) covering the values 0 1 2 3, i.e. three spans, for a total of 6 points, so it's a three-span curve of third degree in the U direction. The example has 9 points in the V direction, thus 6*9 = 54 points in total.
This is not enough, however, for NURBS to be even remotely useful. You must implement trim curves, which are curves that lie on the UV parametrization of the surface and can clip the individual NURBS surface to a different shape.
In practice, however, Maya users rely on manual quilting. Quilts** are the higher-order NURBS equivalent of a mesh, which most NURBS modelers use as a concept. To handle these, it's often not enough to have even the trim curves, as trim curves cannot be reliably transported between applications without sewing. Thus many applications rely on explicitly recording how the surfaces in a quilt are topologically connected. So be prepared to write your own intersection algorithms etc. for any meaningful NURBS compatibility.
For more on the mathematical underpinnings, see Wikipedia, Wolfram, etc.
* If I remember correctly; something like that.
** Quilts have different names in different applications due to simultaneous discovery in several different language areas.
NURBS surfaces' CVs are always laid out in a grid. The number of CVs in a nurbs surface can be computed using the degree of the surface and the number of knots in each direction. Then the CVs are just presented in some specific order, typically row-major.
Let's look at your example. I'm mostly just guessing the format, so you'll want to check my assumptions.
3 3 0 0 no
It looks like you have a bicubic surface. It's not periodic in either direction (that is, you have a sheet rather than a cylinder or torus). Your CVs are non-rational, meaning they're [x,y,z] instead of [xw,yw,zw,w].
In other words, the format of that first line appears to be:
[degree in s] [degree in t] [periodic in s] [periodic in t] [rational]
Next up, one knot vector has 8 knot values, and the other has 11. For a degree 3 non-periodic NURBS, the number of CVs is num_knots - 2. So, you have 6 x 9 CVs in this surface.
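Judging by the coordinates in your file, the CVs are listed in 6 groups of 9: the first 9 CVs (which all have y close to -0.0108) form the first row, the next 9 form the next row, and so on.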
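The counting can be checked with a few lines of Python. This is a sketch under the same guess about the format as above: each knot line starts with the knot count, and the number of CVs is num_knots - degree + 1 (which is num_knots - 2 for degree 3). cv_count and span_count are names made up for the example.

```python
def cv_count(knots, degree):
    """Number of CVs in one direction, assuming num_knots = num_cvs + degree - 1."""
    return len(knots) - degree + 1


def span_count(knots):
    """Number of spans: distinct knot values minus one (non-periodic case)."""
    return len(set(knots)) - 1


DEGREE_U, DEGREE_V = 3, 3                    # from the line "3 3 0 0 no"
KNOTS_U = [0, 0, 0, 1, 2, 3, 3, 3]           # from "8 0 0 0 1 2 3 3 3"
KNOTS_V = [0, 0, 0, 1, 2, 3, 4, 5, 6, 6, 6]  # from "11 0 0 0 1 2 3 4 5 6 6 6"

CVS_U = cv_count(KNOTS_U, DEGREE_U)
CVS_V = cv_count(KNOTS_V, DEGREE_V)
TOTAL = CVS_U * CVS_V
```

Here CVS_U is 6, CVS_V is 9 and TOTAL is 54, which agrees with the "54" that precedes the point list in the file. It also agrees with degree + spans in each direction (3 + 3 and 3 + 6).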
If you're looking for more information on NURBS, I'd recommend this text for theory. For maya specific stuff, they have some decent documentation in the maya API.

Resources