geom_vline with different vlines for each facet - python-ggplot

Having an issue convincing python-ggplot to honor different vlines in different panels. R's ggplot2 geom_vline documentation suggests a way of doing so, which I attempted to follow here:
df = pd.DataFrame({'x': np.random.random(size = 200), 'y': np.random.random(size = 200), 'cat': np.random.randint(0, high = 4, size = 200)})
vlines = pd.DataFrame({'z': [0.25, 0.375, 0.5, 0.75], 'vs': [0,0,1,1], 'am': [0,1,0,1]})
pl = ggplot(df, aes('x','y')) + \
geom_point() + \
facet_wrap('cat', scales = 'fixed') + \
geom_vline(aes(xintercept = 'z', linetype = 'dash'), vlines)
Doesn't work.
Is this something that just hasn't been added to python-ggplot yet, or am I misinterpreting how to analogize the r ggplot2 documentation?

See https://stackoverflow.com/a/44197340/6805670
You'll need to create another df with xintercept for each facet

Related

How to create my own layers on MONAI U-Net?

I'm using MONAI on Spyder Anaconda to build a U-Net network. I want to add/modify layers starting from this baseline.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nets.UNet(
spatial_dims = 2,
in_channels = 3,
out_channels = 1,
channels = (4, 8, 16, 32, 64),
strides = (2, 2, 2, 2),
num_res_units = 3,
norm = layers.Norm.BATCH,
kernel_size=3,).to(device)
loss_function = losses.DiceLoss()
torch.backends.cudnn.benchmark = True
optimizer = torch.optim.Adam(model.parameters(), lr = 1e-4, weight_decay = 0)
post_pred = Compose([EnsureType(), Activations(sigmoid = True), AsDiscrete(threshold=0.5)])
post_label = Compose([EnsureType()])
inferer = SimpleInferer()
utils.set_determinism(seed=46)
My final aim is to create a MultiResUNet that has different layers such as:
class Conv2d_batchnorm(torch.nn.Module):
'''
2D Convolutional layers
Arguments:
num_in_filters {int} -- number of input filters
num_out_filters {int} -- number of output filters
kernel_size {tuple} -- size of the convolving kernel
stride {tuple} -- stride of the convolution (default: {(1, 1)})
activation {str} -- activation function (default: {'relu'})
'''
def __init__(self, num_in_filters, num_out_filters, kernel_size, stride = (1,1), activation = 'relu'):
super().__init__()
self.activation = activation
self.conv1 = torch.nn.Conv2d(in_channels=num_in_filters, out_channels=num_out_filters, kernel_size=kernel_size, stride=stride, padding = 'same')
self.batchnorm = torch.nn.BatchNorm2d(num_out_filters)
def forward(self,x):
x = self.conv1(x)
x = self.batchnorm(x)
if self.activation == 'relu':
return torch.nn.functional.relu(x)
else:
return x
This is just an example of a different Conv2d layer that I would use instead of the native one of the baseline.
Hope some of you can figure out how to proceed.
Thanks, Fede

Data arrays must have the same length, and match time discretization in dynamic problems error in GEKKO

I want to find the value of the parameter m that minimizes my variable x subject to a system of differential equations. I have the following code
from gekko import GEKKO
def run_model_m(days, population, case, k_val, b_val, u0_val, sigma_val, Kmax0, a_val, c_val):
list_x =[]
list_u =[]
list_Kmax =[]
for i in range(len(days)):
list_xi=[]
list_ui=[]
list_Ki=[]
for j in range(len(days[i])):
#try:
m = GEKKO(remote=False)
#m.time= days[i][j]
eval = np.linspace(days[i][j][0], days[i][j][-1], 100, endpoint=True)
m.time = eval
x_data= population[i][j]
variable= np.linspace(population[i][j][0], population[i][j][-1], 100, endpoint=True)
x = m.Var(value=population[i][j][0], lb=0)
sigma= m.Param(sigma_val)
d = m.Param(c_val)
k = m.Param(k_val)
b = m.Param(b_val)
r = m.Param(a_val)
step = np.ones(len(eval))
step= 0.2*step
step[0]=1
m_param = m.CV(value=1, lb=0, ub=1, integer=True); m_param.STATUS=1
u = m.Var(value=u0_val, lb=0, ub=1)
#m.free(u)
a = m.Param(a_val)
c= m.Param(c_val)
Kmax= m.Param(Kmax0)
if case == 'case0':
m.Equations([x.dt()== x*(r*(1-x/(Kmax))-m_param/(k+b*u)-d), u.dt()== sigma*(m_param*b/((k+b*u)**2))])
elif case == 'case4':
m.Equations([x.dt()== x*(r*(1-u**2)*(1-x/(Kmax))-m_param/(k+b*u)-d), u.dt() == sigma*(-2*u*r*(1-x/(Kmax))+(b*m_param)/(b*u+k)**2)])
p = np.zeros(len(eval))
p[-1] = 1.0
final = m.Param(value=p)
m.Obj(x)
m.options.IMODE = 6
m.options.MAX_ITER=15000
m.options.SOLVER=1
# optimize
m.solve(disp=False, GUI=False)
#m.open_folder(dataset_path+'inf')
list_xi.append(x.value)
list_ui.append(u.value)
list_Ki.append(m_param.value)
list_x.append(list_xi)
list_Kmax.append(list_Ki)
list_u.append(list_ui)
return list_x, list_u, list_Kmax, m.options.OBJFCNVAL
scaled_days[i][j] =[-7.0, 42.0, 83.0, 125.0, 167.0, 217.0, 258.0, 300.0, 342.0]
scaled_pop[i][j] = [0.01762491277346285, 0.020592540360308997, 0.017870838266697213, 0.01690069378982034,0.015512320147187675,0.01506701796298272,0.014096420738841563,0.013991224004743027,0.010543380664478205]
k0,b0,group, case0, u0, sigma0, K0, a0, c0 = (100, 20, 'Size3, Inc', 'case0', 0.1, 0.05, 2, 0, 0.01)
list_x2, list_u2, list_Kmax2,final =run_model_m(days=[[scaled_days[i][j]]], population=
[[scaled_pop[i][j]]],case=case1, k_val=list_b1[i0][0], b_val=b1, u0_val=list_u1[i0][j0],
sigma_val=sigma1, Kmax0=K1, a_val=list_Kmax1[0][0], c_val=c1)
I get the error Data arrays must have the same length, and match time discretization in dynamic problems error but I don't understand why. I have tried making x and m_param arrays, with x=m.Var, m_param =m.MV... But still get the same error, even if they are all arrays of the same length. Is this the right way to find the solution of the minimization problem?
I think the error was just that in run_model_m I was passing a list as u0_val and it didn't have the same dimensions as m.time. So it should be u0_val=list_u1[0][0][0]

Map Layer Issues in ggplot2

I'm having a few issues with finalizing my map for a report. I think I'm warm on the solutions, but haven't quite figured them out. I would really appreciate any help on solutions so that I can finally move on!
1) The scale bar will NOT populate in the MainMap code and the subsequent Figure1 plot. This is "remedied" in the MainMap code if I comment out the "BCWA_land" map layer. However, when I retain the "BCWA_land" map layer it will eliminate the scale bar and produces this error:
Warning message: Removed 3 rows containing missing values (geom_text).
And this is the code:
MainMap <- ggplot(QOI) +
geom_sf(aes(fill = quadID)) +
scale_fill_manual(values = c("#6b8c42",
"#70b2ae",
"#d65a31")) +
labs(fill = "Quadrants of Interest",
caption = "Figure 1: Map depicting the quadrants in the WSDOT project area as well as other quadrants of interest in the Puget Sound area.")+
ggtitle("WSDOT Project Area and Quadrants of Interest") +
scalebar(x.min = -123, x.max = -122.8, y.min = 47, y.max = 47.1, location = "bottomleft",
transform = TRUE, dist = 10, dist_unit = "km", st.size = 3, st.bottom = TRUE, st.dist = 0.1) +
north(data = QOI, location = "topleft", scale = 0.1, symbol = 12, x.min = -123, y.min = 48.3, x.max = -122.7, y.max = 48.4) +
theme_bw()+
theme(panel.grid= element_line(color = "gray50"),
panel.background = element_blank(),
panel.ontop = TRUE,
legend.text = element_text(size = 11, margin = margin(l = 3), hjust = 0),
legend.position = c(0.95, 0.1),
legend.justification = c(0.85, 0.1),
legend.background = element_rect(colour = "#3c4245", fill = "#f4f4f3"),
axis.title = element_blank(),
plot.title = element_text(face = "bold", colour = "#3c4245", hjust = 0.5, margin = margin(b=10, unit = "pt")),
plot.caption = element_text(face = "italic", colour = "#3c4245", margin = margin(t = 7), hjust = 0, vjust = 0.5)) +
geom_sf(data = BCWA_land) + #this is what I've tried to comment out to determine the scale bar problem
xlim (-123.1, -121.4) +
ylim (47.0, 48.45)
MainMap
InsetRect <- data.frame(xmin=-123.2, xmax=-122.1, ymin=47.02, ymax=48.45)
InsetMap <- ggplotGrob( ggplot( quads) +
geom_sf(aes(fill = "")) +
scale_fill_manual(values = c("#eefbfb"))+
geom_sf(data = BCWA_land) +
scale_x_continuous(expand = c(0,0), limits = c(-124.5, -122.0)) +
scale_y_continuous(expand = c(0,0), limits = c(47.0, 49.5)) +
geom_rect(data = InsetRect,aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax),
color="#3c4245",
size=1.25,
fill=NA,
inherit.aes = FALSE) +
theme_bw()+
theme(legend.position = "none",
panel.grid = element_blank(),
axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
plot.margin = margin(0,0,0,0)))
InsetMap
Figure1 <- MainMap +
annotation_custom(grob = InsetMap, xmin = -122.2, xmax = -121.3,
ymin = 47.75, ymax = 48.5)
Figure1
As you can see I'm not getting this issue or error for my north arrow so I'm not really sure what is happening with the scale bar!
This problem is probably a little too OCD, however I REALLY don't want the gridlines to show up on the InsetMap, and was hoping that the InsetMap would overlay on top of the MainMap, without gridlines as I had those parameters set to element_blank() in the InsetMap code.
Here is an image of my plot. If you would like the data for this, please let me know. Because these are shapefiles, the data is unwieldy and not super conducive to SO's character limit for a post...
If anyone has any insight into a solution(s) I would so so appreciate that!! Thanks for your time!
The issue was the
xlim (-123.1, -121.4) +
ylim (47.0, 48.45)
call that I made. Instead, I used coord_sf(xlim = c(min, max), ylim = c(min, max)). I thought that this would be helpful to someone who might be in my position later on!
Essentially the difference between setting the limits of to the graph using just the x/y lim calls is that that truncates the data that is available in your dataset, whereas the coord_sf call simply "focuses" your graph on that extent if you will without altering the data you have available in your dataset.

How to transpose a vector of strings in Julia 0.6 (in order to plot multiple legends)?

In Julia 0.5 you can write:
using DataFrames, Plots, StatPlots
df = DataFrame(
fruit = ["orange","orange","orange","orange","apple","apple","apple","apple"],
year = [2010,2011,2012,2013,2010,2011,2012,2013],
production = [120,150,170,160,100,130,165,158],
consumption = [70,90,100,95, 80,95,110,120]
)
plotlyjs()
mycolours = [:green :orange]
legend1 = sort(unique(df[:fruit]))
legend2 = legend1'
fruits_plot = plot(df, :year, :production, group=:fruit, linestyle = :solid, linewidth=3, label = ("Production of " * legend2), color=mycolours)
where legend1 is a 2-element DataArrays.DataArray{String,1} and legend2 is a 1×2 DataArrays.DataArray{String,2}.
Now, in julia 0.6 legend2 = legend1' is not working any more. You can do instead a legend2 = reshape(legend1, (1, :)), but that produces a pretty obscure 1×2 Base.ReshapedArray{String,2,DataArrays.DataArray{String,1},Tuple{}} that then is not accepted in the plot() call.
So, any way in julia 0.6 to produce from a 2-element DataArrays.DataArray{String,1} a 1×2 DataArrays.DataArray{String,2} ?
Again, posting on SO helps.. ;-)
I finally got that I can obtain the plot anticipating the string concatenation:
fruits_plot = plot(df, :year, :production, group=:fruit, linestyle = :solid, linewidth=3, label= reshape("Production of " * sort(unique(df[:fruit])),(1,:)), color=mycolours)

R: how to properly create rx_forest_model object?

I'm trying to do a churn analysis with R and SQL Server 2016.
I have uploaded my dataset on my database in a local SQL Server and I did all the preliminary work on this dataset.
Well, now I have this function trainModel() which I would use to estimate my random model forest:
trainModel = function(sqlSettings, trainTable) {
sqlConnString = sqlSettings$connString
trainDataSQL <- RxSqlServerData(connectionString = sqlConnString,
table = trainTable,
colInfo = cdrColInfo)
## Create training formula
labelVar = "churn"
trainVars <- rxGetVarNames(trainDataSQL)
trainVars <- trainVars[!trainVars %in% c(labelVar)]
temp <- paste(c(labelVar, paste(trainVars, collapse = "+")), collapse = "~")
formula <- as.formula(temp)
## Train gradient tree boosting with mxFastTree on SQL data source
library(RevoScaleR)
rx_forest_model <- rxDForest(formula = formula,
data = trainDataSQL,
nTree = 8,
maxDepth = 16,
mTry = 2,
minBucket = 1,
replace = TRUE,
importance = TRUE,
seed = 8,
parms = list(loss = c(0, 4, 1, 0)))
return(rx_forest_model)
}
But when I run the function I get this wrong output:
> system.time({
+ trainModel(sqlSettings, trainTable)
+ })
user system elapsed
0.29 0.07 58.18
Warning message:
In tempGetNumObs(numObs) :
Number of observations not available for this data source. 'numObs' set to 1e6.
And for this warning message, the function trainModel() does not create the object rx_forest_model
Does anyone have any suggestions on how to solve this problem?
After several attempts, I found the reason why the function trainModel() did not function properly.
Is not a connection string problem and is not even a data source type issue.
The problem is in the syntax of function trainModel().
It is enough to eliminate from the body of the function the statement:
return(rx_forest_model)
In this way, the function returns the same warning message, but creates the object rx_forest_model in the correct way.
So, the correct function is:
trainModel = function(sqlSettings, trainTable) {
sqlConnString = sqlSettings$connString
trainDataSQL <- RxSqlServerData(connectionString = sqlConnString,
table = trainTable,
colInfo = cdrColInfo)
## Create training formula
labelVar = "churn"
trainVars <- rxGetVarNames(trainDataSQL)
trainVars <- trainVars[!trainVars %in% c(labelVar)]
temp <- paste(c(labelVar, paste(trainVars, collapse = "+")), collapse = "~")
formula <- as.formula(temp)
## Train gradient tree boosting with mxFastTree on SQL data source
library(RevoScaleR)
rx_forest_model <- rxDForest(formula = formula,
data = trainDataSQL,
nTree = 8,
maxDepth = 16,
mTry = 2,
minBucket = 1,
replace = TRUE,
importance = TRUE,
seed = 8,
parms = list(loss = c(0, 4, 1, 0)))
}

Resources