Related
I am new to r and I am trying to get my mind around the apply function. So far I managed to run my anovas for all the the variables on my data and I got the pairwise comparison.
varlist <- names(dt)[5:length(dt)]
# loop
models <- lapply(X = varlist,
FUN = function(t) lm(formula = paste0("`", t, "` ~ block+irrigation*genotype"), data = dt))
#Name the list of models to the column name
names(models) = varlist
## apply anova to each model stored in the list, models
lapply(models, anova)
#marginal-means-all-variable}
res.model1 <- lapply(models, function(x) pairs(emmeans(x, ~genotype:irrigation)))
res.model1
So far so good, now I want to create a compact letter list so I can use to plot it. Previously I used the following but I can't work out how to apply an lapply function to the following code
CLD = cld(res.model1,
alpha=0.05,
Letters=letters,
adjust="tukey")
I use the CLD data to create graphs
I manage to get the letters with the following code but then I am not getting the full anova table.
tx <- with(dt, interaction(irrigation, genotype)) # determining the factors
model2 <- lapply(varlist, function(x) {
lm(substitute(i~block+tx, list(i = as.name(x))), data = dt)}) # using the factors already in "tx"
lapply(model2, anova)
letters = lapply(model2, function(m) HSD.test((m), "tx", alpha = 0.05, group = TRUE, console = TRUE))
Any suggestions to achieve what I need.
Thank you
I tried to follow the examples in the
Link 1 - Sparse Matrix
https://www.tidyverse.org/blog/2020/11/tidymodels-sparse-support/
Link 2 - Workflow_sets
https://www.tmwr.org/workflow-sets.html
I had trouble including the blue print into the workflow sets.
In the examples where workflow_set is defined in link 2
no_pre_proc <-
workflow_set(
preproc = list(simple = model_vars),
models = list(MARS = mars_spec, CART = cart_spec, CART_bagged = bag_cart_spec,
RF = rf_spec, boosting = xgb_spec, Cubist = cubist_spec)
)
and the way we add blue print into the workflow in link 1
wf_sparse <-
workflow() %>%
add_recipe(text_rec, blueprint = sparse_bp) %>%
add_model(lasso_spec)
wf_default <-
workflow() %>%
add_recipe(text_rec) %>%
add_model(lasso_spec)
Where and how do I add the "blueprint = sparse_bp" option in the workflow_set above?
My attempts were
no_pre_proc <-
workflow_set(
preproc = list(simple = model_vars),
models = list(MARS = mars_spec, CART = cart_spec, CART_bagged = bag_cart_spec,
RF = rf_spec, boosting = xgb_spec, Cubist = cubist_spec)) %>%
option_add(update_blueprint(blueprint = sparse_bp))
Running the racing tune gave me this error
Error: Problem with `mutate()` column `option`.
i `option = purrr::map(option, append_options, dots)`.
x All options should be named.
Run `rlang::last_error()` to see where the error occurred
<error/rlang_error>
There were 9 workflows that had no results.
Backtrace:
1. ggplot2::autoplot(...)
2. workflowsets:::autoplot.workflow_set(...)
3. workflowsets:::rank_plot(...)
4. workflowsets:::pick_metric(object, rank_metric, metric)
6. workflowsets:::collect_metrics.workflow_set(x)
7. workflowsets:::check_incompete(x, fail = TRUE)
8. workflowsets:::halt(msg)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/rlang_error>
There were 9 workflows that had no results.
Backtrace:
x
1. +-ggplot2::autoplot(...)
2. \-workflowsets:::autoplot.workflow_set(...)
3. \-workflowsets:::rank_plot(...)
4. \-workflowsets:::pick_metric(object, rank_metric, metric)
5. +-tune::collect_metrics(x)
6. \-workflowsets:::collect_metrics.workflow_set(x)
7. \-workflowsets:::check_incompete(x, fail = TRUE)
8. \-workflowsets:::halt(msg)
>
thanks,
Thank you for asking this question; we definitely are not supporting this use case (passing non-default arguments to the recipe or model) very well right now. We've opened an issue here where you can track our work on this.
In the meantime, you could try a bit of a hacky workaround by manually using update_recipe() on the workflow you are interested in:
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
data(parabolic)
set.seed(1)
split <- initial_split(parabolic)
train_set <- training(split)
test_set <- testing(split)
glmnet_spec <-
logistic_reg(penalty = 0.1, mixture = 0) %>%
set_engine("glmnet")
rec <-
recipe(class ~ ., data = train_set) %>%
step_YeoJohnson(all_numeric_predictors())
sparse_bp <- hardhat::default_recipe_blueprint(composition = "dgCMatrix")
wfs_orig <-
workflow_set(
preproc = list(yj = rec,
norm = rec %>% step_normalize(all_numeric_predictors())),
models = list(regularized = glmnet_spec)
)
new_wf <-
wfs_orig %>%
extract_workflow("yj_regularized") %>%
update_recipe(rec, blueprint = sparse_bp)
Created on 2021-12-09 by the reprex package (v2.0.1)
Then (I know this feels hacky for now) manually take this new_wf and stick it in to the wfs_orig$info[[1]]$workflow slot to replace what is there.
I'm creating a shiny app and i'm letting the user choose what data that should be displayed in a plot and a table. This choice is done through 3 different input variables that contain 14, 4 and two choices respectivly.
ui <- dashboardPage(
dashboardHeader(),
dashboardSidebar(
selectInput(inputId = "DataSource", label = "Data source", choices =
c("Restoration plots", "all semi natural grasslands")),
selectInput(inputId = "Variabel", label = "Variable", choices =
choicesVariables)),
#choicesVariables definition is omitted here, because it's very long but it
#contains 14 string values
selectInput(inputId = "Factor", label = "Factor", choices = c("Company
type", "Region and type of application", "Approved or not approved
applications", "Age group" ))
),
dashboardBody(
plotOutput("thePlot"),
tableOutput("theTable")
))
This adds up to 73 choices (yes, i know the math doesn't add up there, but some choices are invalid). I would like to do this using a lookup table so a created one with every valid combination of choices like this:
rad1<-c(rep("Company type",20), rep("Region and type of application",20),
rep("Approved or not approved applications", 13), rep("Age group", 20))
rad2<-choicesVariable[c(1:14,1,4,5,9,10,11, 1:14,1,4,5,9,10,11, 1:7,9:14,
1:14,1,4,5,9,10,11)]
rad3<-c(rep("Restoration plots",14),rep("all semi natural grasslands",6),
rep("Restoration plots",14), rep("all semi natural grasslands",6),
rep("Restoration plots",27), rep("all semi natural grasslands",6))
rad4<-1:73
letaLista<-data.frame(rad1,rad2,rad3, rad4)
colnames(letaLista) <- c("Factor", "Variabel", "rest_alla", "id")
Now its easy to use subset to only get the choice that the user made. But how do i use this information to plot the plot and table without using a 73 line long ifelse statment?
I tried to create some sort of multidimensional array that could hold all the tables (and one for the plots) but i couldn't make it work. My experience with these kind of arrays is limited and this might be a simple issue, but any hints would be helpful!
My dataset that is the foundation for the plots and table consists of dataframe with 23 variables, factors and numerical. The plots and tabels are then created using the following code for all 73 combinations
s_A1 <- summarySE(Samlad_info, measurevar="Dist_brukcentrum",
groupvars="Companytype")
s_A1 <- s_A1[2:6,]
p_A1=ggplot(s_A1, aes(x=Companytype,
y=Dist_brukcentrum))+geom_bar(position=position_dodge(), stat="identity") +
geom_errorbar(aes(ymin=Dist_brukcentrum-se,
ymax=Dist_brukcentrum+se),width=.2,position=position_dodge(.9))+
scale_y_continuous(name = "") + scale_x_discrete(name = "")
where summarySE is the following function, burrowed from cookbook for R
summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=TRUE,
conf.interval=.95, .drop=TRUE) {
# New version of length which can handle NA's: if na.rm==T, don't count them
length2 <- function (x, na.rm=FALSE) {
if (na.rm) sum(!is.na(x))
else length(x)
}
# This does the summary. For each group's data frame, return a vector with
# N, mean, and sd
datac <- ddply(data, groupvars, .drop=.drop,
.fun = function(xx, col) {
c(N = length2(xx[[col]], na.rm=na.rm),
mean = mean (xx[[col]], na.rm=na.rm),
sd = sd (xx[[col]], na.rm=na.rm)
)
},
measurevar
)
# Rename the "mean" column
datac <- rename(datac, c("mean" = measurevar))
datac$se <- datac$sd / sqrt(datac$N) # Calculate standard error of the mean
# Confidence interval multiplier for standard error
# Calculate t-statistic for confidence interval:
# e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
ciMult <- qt(conf.interval/2 + .5, datac$N-1)
datac$ci <- datac$se * ciMult
return(datac)
}
The code in it's entirety is a bit to large but i hope this may clarify what i'm trying to do.
Well, thanks to florian's comment i think i might have found a solution my self. I'll present it here but leave the question open as there is probably far neater ways of doing it.
I rigged up the plots (that was created as lists by ggplot) into a list
plotList <- list(p_A1, p_A2, p_A3...)
tableList <- list(s_A1, s_A2, s_A3...)
I then used subset on my lookup table to get the matching id of the list to select the right plot and table.
output$thePlot <-renderPlot({
plotValue<-subset(letaLista, letaLista$Factor==input$Factor &
letaLista$Variabel== input$Variabel & letaLista$rest_alla==input$DataSource)
plotList[as.integer(plotValue[1,4])]
})
output$theTable <-renderTable({
plotValue<-subset(letaLista, letaLista$Factor==input$Factor &
letaLista$Variabel== input$Variabel & letaLista$rest_alla==input$DataSource)
skriva <- tableList[as.integer(plotValue[4])]
print(skriva)
})
I have merged different sources of code to make an app that allows one to upload a file (data frame).
However, beyond this I would also like to make it possible to select specific columns from the data frame and analyse them. This is difficult however as one must predefine the given data frame in order to be able to refer to it in the ui.R script....
So when a previously undefined data frame is uploaded to the site, one can not revere to it in the ui.R as it is defined in the server....
predefined variables
vchoices <- 1:ncol(mtcars)
names(vchoices) <- names(mtcars)
ui.R
runApp(
ui = basicPage(
h2('The uploaded file data'),
dataTableOutput('mytable'),
fileInput('file', 'Choose info-file to upload',
accept = c(
'text/csv',
'text/comma-separated-values',
'text/tab-separated-values',
'text/plain',
'.csv',
'.tsv'
)
),
actionButton("choice", "incorporate external information"),
selectInput("columns", "Select Columns", choices=vchoices, inline = T),
#notice that the 'choices' in selectInput are set to the predefined
#variables above whereas I would like to set them equal to the
#not yet defined uploaded file below in server.R
tableOutput("table_display")
))
Notice that the 'choices' in selectInput are set to the predefined variables above whereas I would like to set them equal to the not yet defined uploaded file below in server.R
server.R
server = function(input, output) {
info <- eventReactive(input$choice, {
inFile <- input$file
if (is.null(inFile))
return(NULL)
isolate(f<-read.table(inFile$datapath, header = T,
sep = "\t"))
f
})
output$table_display<-renderTable({
f<-info()
f<-subset(f, select=input$columns) #subsetting takes place here
head(f)
})
}
Does anyone know of a way to refer to a variable that's defined in in the server, in the ui and thus allow for interactive manipulation?
You can use a family of functions update*Input - in this case updateSelectInput. Its first argument has to be session and you also have to add session to server <- function(input, output) to be able to update your widget.
You could make an update of the widget immediately after clicking on the actionButton - so, you had to use updateSelectInput within eventReactive.
Let's take a look how we can do that:
First, you can save the names of columns of the new uploaded dataset in a variable, say, vars and then pass it to the function updateSelectInput.
(The choices of the selectInput are initially set to NULL - we don't need to specify them before because they are going to be updated anyway)
info <- eventReactive(input$choice, {
inFile <- input$file
# Instead # if (is.null(inFile)) ... use "req"
req(inFile)
# Changes in read.table
f <- read.table(inFile$datapath, header = input$header, sep = input$sep, quote = input$quote)
vars <- names(f)
# Update select input immediately after clicking on the action button.
updateSelectInput(session, "columns","Select Columns", choices = vars)
f
})
I've added a small upload interface to your code.
The other way would be to define widgets on the server side and then to pass them to the client side via renderUI function. You can find here an example.
Full example:
library(shiny)
ui <- fluidPage(
h2('The uploaded file data'),
dataTableOutput('mytable'),
fileInput('file', 'Choose info-file to upload',
accept = c(
'text/csv',
'text/comma-separated-values',
'text/tab-separated-values',
'text/plain',
'.csv',
'.tsv'
)
),
# Taken from: http://shiny.rstudio.com/gallery/file-upload.html
tags$hr(),
checkboxInput('header', 'Header', TRUE),
radioButtons('sep', 'Separator',
c(Comma=',',
Semicolon=';',
Tab='\t'),
','),
radioButtons('quote', 'Quote',
c(None='',
'Double Quote'='"',
'Single Quote'="'"),
'"'),
################################################################
actionButton("choice", "incorporate external information"),
selectInput("columns", "Select Columns", choices = NULL), # no choices before uploading
tableOutput("table_display")
)
server <- function(input, output, session) { # added session for updateSelectInput
info <- eventReactive(input$choice, {
inFile <- input$file
# Instead # if (is.null(inFile)) ... use "req"
req(inFile)
# Changes in read.table
f <- read.table(inFile$datapath, header = input$header, sep = input$sep, quote = input$quote)
vars <- names(f)
# Update select input immediately after clicking on the action button.
updateSelectInput(session, "columns","Select Columns", choices = vars)
f
})
output$table_display <- renderTable({
f <- info()
f <- subset(f, select = input$columns) #subsetting takes place here
head(f)
})
}
shinyApp(ui, server)
i have many data.frames() that i am trying to send to MySQL database via RMySQL().
# Sends data frame to database without a problem
dbWriteTable(con3, name="SPY", value=SPY , append=T)
# stock1 contains a character vector of stock names...
stock1 <- c("SPY.A")
But when I try to loop it:
i= 1
while(i <= length(stock1)){
# converts "SPY.A" into SPY
name <- print(paste0(str_sub(stock1, start = 1, end = -3))[i], quote=F)
# sends data.frame to database
dbWriteTable(con3,paste0(str_sub(stock1, start = 1, end = -3))[i], value=name, append=T)
i <- 1+i
}
The following warning is returned & nothing was sent to database
In addition: Warning message:
In file(fn, open = "r") :
cannot open file './SPY': No such file or directory
However, I believe that the problem is with pasting value onto dbWriteTable() since writing dbWriteTable(con3, "SPY", SPY, append=T) works but dbWriteTable(con3, "SPY", name, append=T) will not...
You are probably using a non-base package for str_sub and I'm guessing you get the same behavior with substr. Does this succeed?
dbWriteTable(con3, substr( stock1, 1,3) , get(stock1), append=T)