Plotting from line to line with gnuplot - database

I have a .txt file in which two sets of data are stored in the same columns but divided by some characters. Here is an example:
#First set of data
#Time #Velocity
1 0.3
2 0.5
3 0.8
4 1.3
#Second set of data
#Time #Velocity
1 0.7
2 0.9
3 1.8
4 2.3
I would like to plot these two sets of data as two different curves. I do not know how many lines each set has (or at least this number can change), so I cannot use the every command. (I'm looking for a gnuplot command, not a bash command.)
Thank you

As you already mentioned, every will not work here, since your datasets have variable lengths (edit: yes it will, see the edit below).
If you had two empty lines separating your datasets, you could use index; check help index.
However, if you have a single empty line, pseudocolumn -1 will help. Check help pseudocolumns.
Then you can define a filter with the ternary operator; check help ternary.
Code:
### plotting variable datasets
reset session
$Data <<EOD
#First set of data
#Time #Velocity
1 0.3
2 0.5
3 0.8
4 1.3

#Second set of data
#Time #Velocity
1 0.7
2 0.9
3 1.8

#Third set of data
#Time #Velocity
1 0.9
2 1.4
3 2.6
4 3.6
5 4.8
EOD
myFilter(col,i) = column(-1)==i-1 ? column(col) : NaN
set key top left
plot for [i=1:3] $Data u 1:(myFilter(2,i)) w lp pt 7 title sprintf("Set %d",i)
### end of code
Edit: (as @binzo pointed out)
Actually, I made it too complicated. The following also works, without a filter (the filter can be useful on other occasions). Note that the blocks are numbered starting from 0.
plot for [i=1:3] $Data u 1:2 every :::i-1::i-1 w lp pt 7 title sprintf("Set %d",i)
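To make the block numbering concrete, here is a rough sketch in Python (not gnuplot) of how single blank lines split the data into blocks numbered from 0. The function name split_blocks and the shortened sample data are made up for illustration. It is a simplified model: it treats any run of blank lines as one separator and ignores the two-blank-line rule that help index describes.

```python
def split_blocks(text):
    """Split gnuplot-style data into blocks separated by blank lines.

    Simplified model: lines starting with '#' are skipped, and a run of
    blank lines counts as a single separator.
    """
    blocks, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # comment lines do not end a block
        if not stripped:
            if current:  # a blank line closes the current block
                blocks.append(current)
                current = []
            continue
        time, velocity = map(float, stripped.split())
        current.append((time, velocity))
    if current:  # keep the last block if the text has no trailing blank line
        blocks.append(current)
    return blocks


data = """\
#First set of data
#Time #Velocity
1 0.3
2 0.5

#Second set of data
#Time #Velocity
1 0.7
2 0.9
3 1.8
"""

# The block number printed here plays the role of column(-1) in the filter.
for number, block in enumerate(split_blocks(data)):
    print(number, block)
```

The blocks may have different lengths, which is why selecting by block number works where a fixed line range would not.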

Related

Create an array with a sequence of numbers in bash

I would like to write a script that will create me an array with the following values:
{0.1 0.2 0.3 ... 2.5}
Until now I was using a script as follows:
plist=(0.1 0.2 0.3 0.4)
for i in "${plist[@]}"; do
echo "submit a simulation with this parameter:"
echo "$i"
done
But now I need the list to be much longer (but still with constant intervals).
Is there a way to create such an array in a single command? What is the most efficient way to create such a list?
Using seq you can say seq FIRST STEP LAST. In your case:
seq 0 0.1 2.5
Then it is a matter of storing these values in an array:
vals=($(seq 0 0.1 2.5))
You can then check the values with:
$ printf "%s\n" "${vals[@]}"
0,0
0,1
0,2
...
2,3
2,4
2,5
Yes, my locale is set to use commas instead of dots for decimals. This can be changed by setting LC_NUMERIC="en_US.UTF-8".
By the way, brace expansion also allows setting an increment. The problem is that it has to be an integer:
$ echo {0..15..3}
0 3 6 9 12 15
Bash supports C style For loops:
$ for ((i=1;i<5;i+=1)); do echo "0.${i}" ; done
0.1
0.2
0.3
0.4
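Note that the "0.${i}" trick only covers 0.1 to 0.9. The underlying idea (count with integers, then format as a decimal) extends to the full range. A minimal sketch of that idea in Python, for illustration only; the name decimal_sequence is made up, and the counter is in tenths:

```python
def decimal_sequence(first_tenths, last_tenths):
    """Return the strings first/10 ... last/10 in steps of 0.1.

    Uses an integer counter, so there is no floating-point drift and
    the decimal separator does not depend on the locale.
    """
    return [f"{n // 10}.{n % 10}" for n in range(first_tenths, last_tenths + 1)]


values = decimal_sequence(1, 25)
print(values[:3], values[-1])
```

The same integer arithmetic (i / 10 and i % 10) can be used inside a bash C-style loop.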
Complementing the main answer
In my case, seq was not the best choice.
To produce a sequence, you can also use the jot utility. However, this command has a more elaborate syntax.
# 1 2 3 4
jot - 1 4
# 3 evenly distributed numbers between 0 and 10
# 0 5 10
jot 3 0 10
# a b c ... z
jot -c - 97 122

For loop to average over individual time points Matlab

I have a 21x2 matrix in Matlab that looks like this:
A = [0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5;
0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1]';
Each element in the first column corresponds to either a 0 or a 1 in the second column. I need to treat each run of 0's and 1's as a block, so that I get one vector containing the first elements of all the 1 blocks, another vector containing the second elements of all the 1 blocks, and so on until all the elements are separated out.
So, for example, vector1=[1.0 2.1], vector2=[1.1 2.2], etc.
This is because I need to average over individual points between blocks, so that I have, for example, avg_vector1, avg_vector2, avg_vector3, etc.
So far I have been trying to write a loop to do this, but I can already tell it won't be efficient and might not work every time, because I would need an if for each j (see below) and the "number of j's" is not really fixed: sometimes the block could be longer, sometimes shorter.
j=1;
for i=1:size(A,1)
if A(i,2)==1
if j==1
vector1(i)=A(i,1);
j=j+1; %j is acting as a counter for the "size" of the block of 0's and 1's
if j==2
vector2(i)=A(i,1);
**incomplete**
Does anyone know how to do this more elegantly and simply?
Thanks
(Hopefully) correct version:
M = logical(A(:, 2));
is_start = [M(1); ~M(1:end-1) & M(2:end)];
is_start = is_start(M);
A_valid = A(M, 1);
group_idx = cumsum(is_start);
group_start_idx = find(is_start);
sub_idx = (1:numel(is_start))' - group_start_idx(group_idx)+1;
means = accumarray(sub_idx, A_valid, [], @mean);
There is possibly a slightly neater way of doing this with one or two fewer steps, but this should work.
Take home lesson: use cumsum more often!
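For readers who want to check the logic outside Matlab, here is a rough equivalent in plain Python. It is only a sketch of the same idea, not a translation of the code above: it groups the runs of 1's explicitly instead of using cumsum, and the name positional_means is made up. The data reproduces A from the question.

```python
from itertools import groupby, zip_longest
from statistics import mean

# Same data as A in the question: 0.5 ... 2.5 with their 0/1 flags.
values = [round(0.5 + 0.1 * k, 1) for k in range(21)]
flags = [0] * 5 + [1] * 6 + [0] * 5 + [1] * 5


def positional_means(values, flags):
    """Average the k-th elements of all runs of 1-flags, for each k."""
    # Collect each run of consecutive 1-flags as its own block of values.
    blocks = [
        [value for value, _ in run]
        for flag, run in groupby(zip(values, flags), key=lambda pair: pair[1])
        if flag == 1
    ]
    # zip_longest lines up the k-th elements; None pads the shorter blocks.
    return [
        mean(v for v in column if v is not None)
        for column in zip_longest(*blocks)
    ]


print(positional_means(values, flags))
```

Blocks of unequal length are handled by averaging only the elements that exist at a given position, so the sixth mean here comes from the first block alone.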

Calculating mean over an array of lists in R

I have an array built to accept the outputs of a modelling package:
M <- array(list(NULL), c(trials,3))
Where trials is a number that will generate circa 50 sets of data.
From a sampling loop, I am inserting a specific aspect of the outputs. The output from the modelling package looks a little like this:
Mt$effects
c_name effect Other
1 DPC_I 0.0818277549 0
2 DPR_I 0.0150814475 0
3 DPA_I 0.0405341027 0
4 DR_I 0.1255416311 0
5 (etc.)
And I am inserting it into my array via a loop
for (x in 1:trials) {
Mt<-run_model(params)
M[[x,3]] <- Mt$effects
}
The object now looks as follows
M[,3]
[[1]]
c_name effect Other
1 DPC_I 0.0818277549 0
2 DPR_I 0.0150814475 0
3 DPA_I 0.0405341027 0
4 DR_I 0.1255416311 0
5 (etc.)
[[2]]
c_name effect Other
1 DPC_I 0.0717384637 0
2 DPR_I 0.0190812375 0
3 DPA_I 0.0856456427 0
4 DR_I 0.2330002551 0
5 (etc.)
[[3]]
And so on (up to 50 elements).
What I want to do is calculate an average (and sd) of effect, grouped by each c_name, across these 50 trial runs, but I’m unable to extract the data into a single dataframe (for example) so that I can run a ddply summarise across them.
I have tried various combinations of rbind, cbind and unlist, but I just can’t work out how to correctly lift this data out of the sequential elements. I also note that any reference to .names results in NULL.
Any solution would be most appreciated!

how to split file in arrays and find maximum value in each of them

I have a file:
1 0.5
2 0.7
3 0.55
4 0.7
5 0.45
6 0.8
7 0.75
8 0.3
9 0.35
10 0.5
11 0.65
12 0.75
I want to split the file into 4 arrays, each ending on every 3rd line, and then find the maximum value in the second column for every array. So for this file the outcome would be:
3 0.7
6 0.8
9 0.75
12 0.75
I have managed so far to split the file into several by
awk 'NR%3==1{x="L"++i;}{print > x}' filename
then to find the maximum in every file:
awk 'BEGIN{max=0}{if(($2)>max) max=($2)}END {print $1,max}'
However, this creates additional files which is fine for this example but in reality the original file contains 65 million lines so I will be a bit overwhelmed by the amount of files and I am trying to avoid it by writing a short script which will combine both of the mentioned above.
I tried this one:
awk 'BEGIN {for (i=1; i<=12; i+=3) {max=0} {if(($2)>max) max=($2)}}END {print $1,max}' Filename
but it produces something irrelevant.
So if you can help me out it will be much appreciated!
You could go for something like this:
awk 'NR % 3 == 1 || $2 > max {max = $2} NR % 3 == 0 {print $1, max}' file
The value of max is always reset every three rows and updated if value of the second column is greater than it. At the end of every group of three, the first column and the max are printed.
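The same reset-and-update logic can be written out step by step. The following is a small Python sketch for illustration only (the function name block_maxima is made up); it is not meant to replace the awk one-liner, which is the better tool for a 65-million-line file.

```python
def block_maxima(rows, size=3):
    """Return (last_key, max_value) for each consecutive group of `size` rows.

    A trailing group with fewer than `size` rows is dropped, which is
    also what the awk one-liner does.
    """
    results = []
    best = None
    for line_number, (key, value) in enumerate(rows, start=1):
        # The first row of a group resets the maximum; later rows update it.
        if (line_number - 1) % size == 0 or value > best:
            best = value
        # The last row of a group emits its key together with the maximum.
        if line_number % size == 0:
            results.append((key, best))
    return results


rows = [(1, 0.5), (2, 0.7), (3, 0.55), (4, 0.7), (5, 0.45), (6, 0.8),
        (7, 0.75), (8, 0.3), (9, 0.35), (10, 0.5), (11, 0.65), (12, 0.75)]

for key, best in block_maxima(rows):
    print(key, best)
```

Only one running maximum is kept in memory, so no intermediate files or arrays are needed.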

Plotting data included in script fails in loop

I would like to plot data points included inside a script file.
This should be done multiple times (plotting to different files).
Therefore, I am using a do-for-loop.
This loop makes gnuplot freeze on execution.
Could you please point me to the cause?
This is my MWE:
reset
set autoscale
do for [index=1:1] {
plot "-" with lines ls 2 notitle
0.500 5
1.000 6
1.500 7
e
}
Yes, seems like the combination of do for with inline data isn't supported. It also wouldn't be very convenient, since this would require a separate data block for every iteration like in
set style data linespoints
plot '-' using 1:2, '-' using 1:3
1 2 3
4 5 6
e
1 2 3
4 5 6
e
With version 5.0 inline data blocks were introduced which allow reusing inline data:
$data <<EOD
1 2 3
4 5 6
EOD
do for [i=2:3] {
plot $data using 1:i w l
pause -1
}
