Combining boxplots in R - arrays

I have a plotting question regarding boxplots (using base graphics).
I have several arrays of data which I wish to turn into box plots and compare. The arrays reflect different experiments and what I would like to show is the base results and the percentage difference for the experiments (on one plot!). I.e. the base results on the 1st y axis and the % diff on the second y axis:
base <- array(runif(12*24*3), dim=c(12,24,3))
exp1 <- array(runif(12*24*3), dim=c(12,24,3))
exp2 <- array(runif(12*24*3), dim=c(12,24,3))
exp3 <- array(runif(12*24*3), dim=c(12,24,3))
exp4 <- array(runif(12*24*3), dim=c(12,24,3))
# calc p.diff
p.diff <- function(mod,base) {
100.0*((mod-base)/base) }
a <- p.diff(exp1,base)
b <- p.diff(exp2,base)
c <- p.diff(exp3,base)
# combine the % diff arrays
exps <- list(a,b,c)
# plot the results
boxplot(base, xlim=c(1,4), col="gray", xaxt="n", ylab="Base values", outline=FALSE)
axis(side=1, 1:4, labels=c("base","% exp1","% exp2","% exp3") )
par(new=TRUE)
boxplot(exps, col="red", ylim=c(-200,200), outline=FALSE, axes=FALSE)
axis(4)
grid()
This almost works but I don't get the positioning of the different box plots right (if you run my example you will see what I mean). So is there a better way to control the placement of the box plots? Or a better way to produce a similar type of figure?

Edited (1): You need to define the right sequences for the x-axis so that the plots don't overlap. Just try playing with it.
I think the labels of the X axes are not at the right place? I don't know a more elegant way of doing it but here is a solution:
# plot the results
boxplot(base, xlim=c(1,4), col="gray", xaxt="n", ylab="Base values", outline=FALSE)
axis(side=1,1,labels=('base'))
par(new=TRUE)
boxplot(exps, col="red", ylim=c(-200,200), outline=FALSE, axes=FALSE)
axis(4)
axis(side=1,1:3,labels=c("% exp1","% exp2","% exp3"))
grid()
So I added every label after creating the boxplot. First plot the base and label it, then plot exps and label it. Does it solve your problem?
Edit: Just to be clearer, you are adding a new plot with 3 values, which is why axis(side=1,1:3,labels=c("% exp1","% exp2","% exp3")) runs from 1 to 3.
Edited (2):
Why don't you use multi rows in the plot and try to plot 2 graphs? Here is an example with your data:
# divide your plotting area into 2 columns with one row.
par(mfrow = c(1, 2))
# plot the results
boxplot(base, col="gray", xaxt="n", ylab="Base values", outline=FALSE,axes=FALSE)
axis(2)
axis(side=1,1,labels=('base'))
segments(0,0,1,0)
boxplot(exps,col="red", xaxt="n", ylim=c(-200,200), outline=FALSE, axes=FALSE)
axis(4)
axis(side=1,at=(1:3),labels=c("% exp1","% exp2","% exp3"))
You can find more information about it here.

Related

FiPy: Can I directly change faceVariables depending on neighboring cells?

I am working with a biological model of the distribution of microbial biomass (b1) on a 2D grid. From the biomass a protein (p1) is produced. The biomass diffuses over the grid, while the protein does not. Only if a certain amount of protein is produced (p > p_lim), the biomass is supposed to diffuse.
I try to implement this by using a dummy cell variable z multiplied with the diffusion coefficient and setting it from 0 to 1 only in cells where p > p_lim.
The condition works fine: when the critical amount of p is reached in a cell, z is set to 1 and diffusion happens. However, the diffusion still does not proceed at the rate I would like, because the diffusion is calculated from the face variable, not from the value of the cell itself. The faces of z are always a mean of the cell with z=1 and its neighboring cells with z=0. I, however, would like the diffusion to work at its original rate even if the neighboring cell is still at p < p_lim.
So, my question is: Can I somehow access a faceVariable and change it? For example, set a face to 1 if any neighboring cell has reached p1 > p_lim? I guess this is not a proper mathematical thing to do, but I couldn't think of another way to simulate this problem.
I will show a very reduced form of my model below. In any case, I thank you very much for your time!
from builtins import range
from fipy import (CellVariable, DiffusionTerm, Grid2D, ImplicitSourceTerm,
                  TransientTerm, Viewer)

##### produce mesh
nx = 5  # cell counts given as integers
ny = nx
dx = 1.
dy = dx
L = nx*dx
mesh = Grid2D(nx=nx,ny=ny,dx=dx,dy=dy)
x, y = mesh.cellCenters  # cell-center coordinates, used in the mask below
#parameters
h1 = 0.5 # production rate of p
Db = 10. # diffusion coeff of b
p_lim=0.1
# cell variables
z = CellVariable(name="z",mesh=mesh,value=0.)
b1 = CellVariable(name="b1",mesh=mesh,hasOld=True,value=0.)
p1= CellVariable(name="p1",mesh=mesh,hasOld=True,value=0.)
# equations
eqb1 = (TransientTerm(var=b1)== DiffusionTerm(var=b1,coeff=Db*z.arithmeticFaceValue)-ImplicitSourceTerm(var=b1,coeff=h1))
eqp1 = (TransientTerm(var=p1)==ImplicitSourceTerm(var=b1,coeff=h1))
# set b1 to 10. in the center of the grid
b1.setValue(10.,where=((x>2.)&(x<3.)&(y>2.)&(y<3.)))
vi=Viewer(vars=(b1,p1),FIPY_VIEWER="matplotlib")
eq = eqb1 & eqp1
for t in range(10):
    b1.updateOld()
    p1.updateOld()
    # switch diffusion on gradually in cells where enough protein has been produced
    z.setValue(z + 0.1,where=((p1>=p_lim) & (z < 1.)))
    eq.solve(dt=0.1)
    vi.plot()
In addition to .arithmeticFaceValue, FiPy provides other interpolators between cell and face values, such as .harmonicFaceValue and .minmodFaceValue.
These properties are implemented using subclasses of _CellToFaceVariable, specifically _ArithmeticCellToFaceVariable, _HarmonicCellToFaceVariable, and _MinmodCellToFaceVariable.
You can also make a custom interpolator by subclassing _CellToFaceVariable. Two such examples are _LevelSetDiffusionVariable and ScharfetterGummelFaceVariable (neither is well documented, I'm afraid).
You need to override the _calc_() method to provide your custom calculation. This method takes three arguments:
alpha: an array of the ratio (0-1) of the distance from the face to the cell on one side, relative to the distance from the cell on the other side to the cell on the first side
id1: an array of indices of the cells on one side of the face
id2: an array of indices of the cells on the other side of the face
Note: You can ignore any clause if inline.doInline: and look at the _calc_() method defined under the else: clause.
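To make the difference between interpolators concrete, here is a minimal plain-Python sketch of the arithmetic a custom _calc_() might perform. It deliberately does not import FiPy, so the function names below are hypothetical and are not FiPy API. The linear formula in arithmetic_face_values is my assumption about how an alpha-weighted mean is formed, and max_face_values is one possible rule for the question's "set a face to 1 if any neighboring cell is switched on".

```python
def arithmetic_face_values(cell_values, alpha, id1, id2):
    """Alpha-weighted linear interpolation between the two cells of each face."""
    return [
        (cell_values[j] - cell_values[i]) * a + cell_values[i]
        for a, i, j in zip(alpha, id1, id2)
    ]


def max_face_values(cell_values, id1, id2):
    """Give each face the larger value of the two cells it separates."""
    return [max(cell_values[i], cell_values[j]) for i, j in zip(id1, id2)]


# Three cells in a row; only the middle cell has been switched on (z = 1).
z = [0.0, 1.0, 0.0]
id1 = [0, 1]        # index of the cell on one side of each interior face
id2 = [1, 2]        # index of the cell on the other side
alpha = [0.5, 0.5]  # faces sit halfway between cell centers

print(arithmetic_face_values(z, alpha, id1, id2))  # [0.5, 0.5]
print(max_face_values(z, id1, id2))                # [1.0, 1.0]
```

With the mean, both faces of the active cell see half the diffusion coefficient; with the maximum, they see the full value as soon as one neighbor is active. In a real subclass the same calculation would be written with array operations on self.var inside _calc_().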

R Programming: 3D array plots

I am trying to do up a 3D array plot in R.
I already have an array built up and defined with the corresponding z-values
e.g. CVHSP500 = array(0,c((nHSP500-N),N))
So now I am trying to do a 3D array plot with it. I decided to go with persp3d(CVHSP500,col = "lightblue",) and have obtained a rather decent plot.
[Image: 3D plot]
So there are obviously some issues with this plot.
1) The coordinates are not defined correctly.
Reading up online on the usage of persp3D and other R functions/packages like slice3D, I found that they all require x, y and z to be separate lists.
I don't understand how to match the values of x and y to the respective z, and since persp3D works perfectly without me having to do that, I decided to use persp3D.
But I will need to insert coordinates for it, but I have no idea how to.
2) Any advice how do I color the plots for different ranges of z?
The ones online all seem to have to refer to individual x, y and z lists and some form of advanced modification which I can't really understand. This light blue color looks okay but it would be good for different ranges of z as well though.
Thanks for the help. Much appreciated.
To transform a 2D array representing z for each (x,y) into 3 vectors x, y and z, you can do this:
CVHSP500 = array(0,c((nHSP500-N),N))
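x <- rep(1:(nHSP500-N), times = N)   # row index cycles fastest (R stores arrays column by column)
y <- rep(1:N, each = nHSP500-N)      # column index stays constant within each column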
z <- CVHSP500
dim(z) <- (nHSP500-N)*N
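The same flattening can be sketched in plain Python to show which index varies fastest. This is an illustration only, and the helper name grid_to_xyz is hypothetical. It walks the grid column by column, as R does when it flattens a matrix, so the row index cycles fastest while the column index stays constant within each column.

```python
def grid_to_xyz(z_grid):
    """Flatten a 2D grid (a list of rows) into parallel x, y, z lists.

    Cells are visited column by column, so x (the 1-based row index)
    changes fastest and y (the 1-based column index) changes slowest.
    """
    n_rows = len(z_grid)
    n_cols = len(z_grid[0]) if n_rows else 0
    xs, ys, zs = [], [], []
    for col in range(n_cols):
        for row in range(n_rows):
            xs.append(row + 1)
            ys.append(col + 1)
            zs.append(z_grid[row][col])
    return xs, ys, zs


example = [[10, 20],
           [30, 40],
           [50, 60]]  # 3 rows, 2 columns
xs, ys, zs = grid_to_xyz(example)
print(xs)  # [1, 2, 3, 1, 2, 3]
print(ys)  # [1, 1, 1, 2, 2, 2]
print(zs)  # [10, 30, 50, 20, 40, 60]
```

Each position k in the three lists then describes one point (xs[k], ys[k], zs[k]), which is the form that functions expecting separate x, y and z vectors need.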

Plotting arrays using a grouped horizontal bar graph

I am trying to generate a graph that should look similar to:
My arrays are:
Array4:[Nan;Nan;.......;20;21;22;23;24;..........60]
Array3:[Nan;Nan;.......;20;21;22;23;24;..........60]
Array2:[0;1;2;3;4;5;6;Nan;Nan;Nan;Nan;17;18;.....60]
Array1:[0;1;2;3;4;5;6;Nan;Nan;Nan;Nan;17;18;.....60]
I cannot find the right way to group my arrays in order to plot them in the way shown on the above graph.
I tried using the following function explained in: http://uk.mathworks.com/help/matlab/ref/barh.html
barh(1:numel(x),y,'hist')
where y=[Array1,Array2;Array3,Array4] and x={'1m';'2m';'3m';......'60m'}
but it does not work.
Why Your Current Approach Isn't Working
Your intuition makes sense to me, but the barh function you are using doesn't work the way you think it does. Specifically, you are misinterpreting the x and y inputs to that function. Those inputs are individual values, not entire axes. The first element of y is the end-point of the bar that stretches horizontally from x = 0, and the first element of x is the location of that horizontal bar on the y-axis. To illustrate what I mean, I've provided the below horizontal bar graph:
You can find this same picture in the official documentation of the MATLAB barh function. The code used to generate this bar graph is also given in the documentation, shown below:
x = 1900:10:2000;
y = [57,91,105,123,131,150,...
170,203,226.5,249,281.4];
figure;
barh(x, y);
The individual elements of the x array, rather confusingly, show up on the y-axis as the starting locations of each bar. The corresponding elements of the y array are the lengths of each bar. This is the reason that the arrays must be the same length, and this illustrates that they are not specifications of the x and y axes as one might intuitively believe.
An Approach To Solve Your Problem
First things first, the easiest approach is to do this manually with the plot function and a set of lines that represent floating bars. Consult the official documentation for the plot function if you'd like to plot the lines with some sort of color coordination in mind - the code I present (modified version of this answer on StackOverflow) just switches the color of the floating bars between red and blue. I tried to comment the code so that the purpose of each variable is clear. The code I present below matches the floating bar graph that you want to be plotted, if you are alright with replacing thick floating bars with 2D lines floating on a plot.
I used the data that you gave in your question to specify the floating horizontal bars that this script would output - a screenshot is shown below the code. Array1 & Array2:[0;1;2;3;4;5;6;Nan;Nan;Nan;Nan;17;18;.....60], these arrays go from 0 to 6 (length = 6) and 17 to 60 (length = 60 - 17 = 43). Because there is a "discontinuity" of sorts from 7 to 16, I have to define two floating bars for each array. Hence, the first four values in my length array are [6, 6, 43, 43], where the first 6 and the first 43 correspond to Array1 and the second 6 and the second 43 correspond to Array2. Recognizing this "discontinuity", the starting point of the first floating bar for Array1 and Array2 is x = 0 and the starting point of the second floating bar for Array1 and Array2 is x = 17. Putting that all together, you arrive at the x-coordinates for the first four points in the floating_bars array, [0 0; 0 1.5; 17 0; 17 1.5]. The y-coordinates in this array only serve to distinguish Array1, Array2, and so on from each other.
Code:
floating_bars=[0 0; 0 1.5; 17 0; 17 1.5; 20 6; 20 7.5]; % Each row is the [x,y] coordinate pair of the starting point for the floating bar
L=[6, 6, 43, 43, 40, 40]; % Length of each consecutive bar
thickness = 0.75;
figure;
for i=1:size(floating_bars,1)
    curr_thickness = 0;
    % It is aesthetically pleasing to have thicker bars; this makes the plot look more like the grouped horizontal bar graph that you want
    while (curr_thickness < thickness)
        % Each bar group has two bars; set the first to be red, the second to be blue (i.e., odd index means red bar, even index means blue bar)
        if mod(i, 2)
            plot([floating_bars(i,1), floating_bars(i,1)+L(i)], [floating_bars(i,2) + curr_thickness, floating_bars(i,2) + curr_thickness], 'r')
        else
            plot([floating_bars(i,1), floating_bars(i,1)+L(i)], [floating_bars(i,2) + curr_thickness, floating_bars(i,2) + curr_thickness], 'b')
        end
        curr_thickness = curr_thickness + 0.05;
        hold on % Make sure that plotting the current floating bar does not overwrite previous floating bars that have already been plotted
    end
end
ylim([ -10 30]) % Set the y-axis limits so that you can see more clearly the floating bars that would have rested right on the x-axis (y = 0)
Output:
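The starting points and lengths in floating_bars and L were worked out by hand. As a rough sketch of how they could be derived from the data instead, the plain-Python helper below (contiguous_runs is a hypothetical name, not a MATLAB or library function) scans an array and reports the first value and the length of every run of non-NaN entries. The example arrays assume the gap in Array1 covers the ten values 7 to 16, as described above.

```python
import math


def contiguous_runs(values):
    """Return (start, length) for each run of consecutive non-NaN entries.

    start is the first value of the run; length is last value minus first.
    """
    runs = []
    first = last = None
    for v in values:
        if isinstance(v, float) and math.isnan(v):
            # A NaN closes the run that is currently open, if any.
            if first is not None:
                runs.append((first, last - first))
                first = last = None
        else:
            if first is None:
                first = v
            last = v
    if first is not None:
        runs.append((first, last - first))
    return runs


nan = float("nan")
array1 = list(range(0, 7)) + [nan] * 10 + list(range(17, 61))
array3 = [nan] * 20 + list(range(20, 61))

print(contiguous_runs(array1))  # [(0, 6), (17, 43)]
print(contiguous_runs(array3))  # [(20, 40)]
```

Each (start, length) pair corresponds to one floating bar: the start is the x-coordinate in a row of floating_bars and the length is the matching entry of L.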
How Do I Do This With the barh Function?
The short answer is that you'd have to modify the function manually. Someone has already done this with one of the bar graph plotting functions provided by MATLAB, bar3. The logic implemented in this modified bar3 function can be re-applied for your purposes if you read their barNew.m function and tweak it a bit. If you'd like a pointer as to where to start, I'd suggest looking at how they specify z-axis minimum and maximums for their floating bars on the plot, and apply that same logic to specify x-axis minimum and maximums for your floating bars in your 2D case.
I hope this helps, happy coding! :)
Here I explain my approach to generating this type of graph. I'm not sure it is the best, but it works and nothing needs to be done manually. I came up with this solution based on the following fact explained by Vladislav Martin: "The y-coordinates in this array only serve to distinguish Array1, Array2, and so on from each other".
My original arrays are:
Array4=[Nan....;20;21;22;23;24;..........60]
Array3=[Nan....;20;21;22;23;24;..........60]
Array2=[0;1;2;3;4;5;6;Nan;Nan;Nan;Nan;17;18;.....60]
Array1=[0;1;2;3;4;5;6;Nan;Nan;Nan;Nan;17;18;.....60]
x={'0m';'1m';'2m';'3m';'4m';....'60m'}
The values contained in these arrays refer to the x-axis on the graph. To keep things simple and to avoid having to code a function that determines the length of each discontinuity in the arrays, I replace these values with y-axis position values. Basically, I give Array1 y-axis position values of 0 and Array2 values of 0+0.02=0.02. I give Array3 y-axis position values of 0.5 and Array4 values of 0.5+0.02=0.52. In this way, Array2 will be plotted close to Array1, which forms the first group, and Array4 close to Array3, which forms the second group.
Datatable=table(Array1,Array2,Array3,Array4);
cont1=0;
cont2=0.02;
for col=1:2:size(Datatable,2)
    col2=col+1;
    for row=1:size(Datatable,1)
        if ~isnan(Datatable{row,col}) % For first array in the group: if the value is not NaN, I replace it with the corresponding cont1 value
            Datatable{row,col}=cont1;
        end
        if ~isnan(Datatable{row,col2}) % For second array in the group: if the value is not NaN, I replace it with the corresponding cont2 value
            Datatable{row,col2}=cont2;
        end
    end
    cont1=cont1+0.5;
    cont2=cont2+0.5;
end
The result of the above code will be a table like the following:
And now I plot the Arrays using 2D floating lines:
figure
for array=1:2:size(Datatable,2)
    FirstPair=cell2mat(table2cell(Datatable(:,array)));
    SecondPair=cell2mat(table2cell(Datatable(:,array+1)));
    hold on
    plot(1:numel(x),FirstPair,'r','Linewidth',6)
    plot(1:numel(x),SecondPair,'b','Linewidth',6)
    hold off
end
set(gca,'xticklabel',x)
And this will generate the following graph:

Plot 3d surface map from data frame

I first begin by running the code below to tune a SVM:
tunecontrol <- tune.control(nrepeat=5, sampling = "fix",cross=5, performances=T)
tune_svm1 <- tune(svm,
Y ~ 1
+ X
, data = data,
ranges = list(epsilon = seq(epsilon_start
,epsilon_end
,(epsilon_end-epsilon_start)/10)
, cost = cost_start*(1:5)
, gamma = seq(gamma_start
,gamma_end
,(gamma_end - gamma_start)/5))
, tunecontrol=tunecontrol)
In tune_svm1$performances I have 330 observations containing all the values for epsilon, cost, and gamma that I stated in the ranges section of the above code as well as another column for the calculated error.
I'd like to generate a 3D surface plot for epsilon, cost, gamma, and error using three variables as X,Y,Z and the last for color. I've read on several resources for plot3d and persp but have had a lot of difficulty implementing.
If I try to follow the examples provided and use mesh to generate a mesh plot, I can only mesh together 3 of the 4 variables from tune_svm1$performances, and saving the separate results for X, Y and Z as shown in the first link is difficult because the mesh is saved as an array, not a matrix. I've tried to hack a graph using the following code, but the visual is nonsensical (probably because the order isn't being preserved by meshing each individually):
M1 <- mesh(tune_svm1$performances$epsilon[1:nrow(tune_svm1$performances)]
,tune_svm1$performances$cost[1:nrow(tune_svm1$performances)])
M2 <- mesh(tune_svm1$performances$epsilon[1:nrow(tune_svm1$performances)]
,tune_svm1$performances$gamma[1:nrow(tune_svm1$performances)])
M3 <- mesh(tune_svm1$performances$epsilon[1:nrow(tune_svm1$performances)]
,tune_svm1$performances$error[1:nrow(tune_svm1$performances)])
x <- M1$x ; y <- M1$y ; z <- M2$y ; c <- M3$y
surf3D(x,y,c, colvar = c)
What's the best way to approach this? Thank you.

Making coordinates out of arrays

So, I have two arrays:
>> X'
ans =
2.5770 2.5974 2.1031 2.7813 2.6083 2.9498 3.0053 3.3860
>> Y'
ans =
0.7132 0.5908 1.9988 1.0332 1.3301 1.1064 1.3522 1.3024
I would like to combine the n-th members of the two arrays, and then plot those coordinates on a graph.
So it should be:
{(2.5770,0.7132), (2.5974,0.5908)...}
Is this possible to do? If so, how?
Schorsch showed that it is simple to plot, but just to answer the question as asked in the title, you can combine the arrays into coordinates by just arranging the vectors like rectangles.
Your x and y are vertical, so you can put them side-by-side in a 2-column matrix:
combined = [x y]
or transpose them and have 2 rows: combined = [x' ; y']
(Because they're vertical, what you don't want is these, which would concatenate them out into one long column or row: [x ; y] or [x' y'])
Just to be clear, though, this is not needed for plotting.
Edit: A suggested edit asked what happens if you plot(combined). That depends on whether it's the horizontal or vertical version. In any case, plotting a 2-column or 2-row matrix won't plot x vs. y. It plots all of the columns versus the simple indices 1,2,3,... So the first way I defined combined will make two lines, plotting x and y on the y-axis against their indices on the x-axis, and the second version of combined will make a strange plot with all of the values of x plotted in a vertical column where x=1 and all of the points of y beside those at x=2.
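For comparison, the same pairing of n-th members can be sketched in Python using only the built-in zip. This is an illustration of the idea rather than MATLAB code, and the variable names are my own.

```python
x = [2.5770, 2.5974, 2.1031, 2.7813, 2.6083, 2.9498, 3.0053, 3.3860]
y = [0.7132, 0.5908, 1.9988, 1.0332, 1.3301, 1.1064, 1.3522, 1.3024]

# Pair the n-th element of x with the n-th element of y.
points = list(zip(x, y))
print(points[:2])  # [(2.577, 0.7132), (2.5974, 0.5908)]

# Unzipping recovers the two original sequences.
xs_again, ys_again = (list(seq) for seq in zip(*points))
```

As in the MATLAB case, the paired form is useful for storing or passing around coordinates, while most plotting functions still take the x and y sequences separately.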
