I would like to put my summary statistics into a table using the kable function, but I cannot because it comes up as an array.
```{r setup options, include = FALSE}
knitr::opts_chunk$set(fig.width = 8, fig.height = 5, echo = TRUE)
library(mosaic)
library(knitr)
```
```{r}
sum = summary(SwimRecords$time) # generic data set included with mosaic package
kable(sum) # I want this to be printed into a table
```
Any suggestions?
You can do so easily with the broom package which is built to "tidy" these stats-related objects:
#install.packages(broom)
broom::tidy(sum)
I'm creating a shiny app and i'm letting the user choose what data that should be displayed in a plot and a table. This choice is done through 3 different input variables that contain 14, 4 and two choices respectivly.
ui <- dashboardPage(
dashboardHeader(),
dashboardSidebar(
selectInput(inputId = "DataSource", label = "Data source", choices =
c("Restoration plots", "all semi natural grasslands")),
selectInput(inputId = "Variabel", label = "Variable", choices =
choicesVariables)),
#choicesVariables definition is omitted here, because it's very long but it
#contains 14 string values
selectInput(inputId = "Factor", label = "Factor", choices = c("Company
type", "Region and type of application", "Approved or not approved
applications", "Age group" ))
),
dashboardBody(
plotOutput("thePlot"),
tableOutput("theTable")
))
This adds up to 73 choices (yes, i know the math doesn't add up there, but some choices are invalid). I would like to do this using a lookup table so a created one with every valid combination of choices like this:
rad1<-c(rep("Company type",20), rep("Region and type of application",20),
rep("Approved or not approved applications", 13), rep("Age group", 20))
rad2<-choicesVariable[c(1:14,1,4,5,9,10,11, 1:14,1,4,5,9,10,11, 1:7,9:14,
1:14,1,4,5,9,10,11)]
rad3<-c(rep("Restoration plots",14),rep("all semi natural grasslands",6),
rep("Restoration plots",14), rep("all semi natural grasslands",6),
rep("Restoration plots",27), rep("all semi natural grasslands",6))
rad4<-1:73
letaLista<-data.frame(rad1,rad2,rad3, rad4)
colnames(letaLista) <- c("Factor", "Variabel", "rest_alla", "id")
Now its easy to use subset to only get the choice that the user made. But how do i use this information to plot the plot and table without using a 73 line long ifelse statment?
I tried to create some sort of multidimensional array that could hold all the tables (and one for the plots) but i couldn't make it work. My experience with these kind of arrays is limited and this might be a simple issue, but any hints would be helpful!
My dataset that is the foundation for the plots and table consists of dataframe with 23 variables, factors and numerical. The plots and tabels are then created using the following code for all 73 combinations
s_A1 <- summarySE(Samlad_info, measurevar="Dist_brukcentrum",
groupvars="Companytype")
s_A1 <- s_A1[2:6,]
p_A1=ggplot(s_A1, aes(x=Companytype,
y=Dist_brukcentrum))+geom_bar(position=position_dodge(), stat="identity") +
geom_errorbar(aes(ymin=Dist_brukcentrum-se,
ymax=Dist_brukcentrum+se),width=.2,position=position_dodge(.9))+
scale_y_continuous(name = "") + scale_x_discrete(name = "")
where summarySE is the following function, burrowed from cookbook for R
summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=TRUE,
conf.interval=.95, .drop=TRUE) {
# New version of length which can handle NA's: if na.rm==T, don't count them
length2 <- function (x, na.rm=FALSE) {
if (na.rm) sum(!is.na(x))
else length(x)
}
# This does the summary. For each group's data frame, return a vector with
# N, mean, and sd
datac <- ddply(data, groupvars, .drop=.drop,
.fun = function(xx, col) {
c(N = length2(xx[[col]], na.rm=na.rm),
mean = mean (xx[[col]], na.rm=na.rm),
sd = sd (xx[[col]], na.rm=na.rm)
)
},
measurevar
)
# Rename the "mean" column
datac <- rename(datac, c("mean" = measurevar))
datac$se <- datac$sd / sqrt(datac$N) # Calculate standard error of the mean
# Confidence interval multiplier for standard error
# Calculate t-statistic for confidence interval:
# e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
ciMult <- qt(conf.interval/2 + .5, datac$N-1)
datac$ci <- datac$se * ciMult
return(datac)
}
The code in it's entirety is a bit to large but i hope this may clarify what i'm trying to do.
Well, thanks to florian's comment i think i might have found a solution my self. I'll present it here but leave the question open as there is probably far neater ways of doing it.
I rigged up the plots (that was created as lists by ggplot) into a list
plotList <- list(p_A1, p_A2, p_A3...)
tableList <- list(s_A1, s_A2, s_A3...)
I then used subset on my lookup table to get the matching id of the list to select the right plot and table.
output$thePlot <-renderPlot({
plotValue<-subset(letaLista, letaLista$Factor==input$Factor &
letaLista$Variabel== input$Variabel & letaLista$rest_alla==input$DataSource)
plotList[as.integer(plotValue[1,4])]
})
output$theTable <-renderTable({
plotValue<-subset(letaLista, letaLista$Factor==input$Factor &
letaLista$Variabel== input$Variabel & letaLista$rest_alla==input$DataSource)
skriva <- tableList[as.integer(plotValue[4])]
print(skriva)
})
I am writing an R-Shiny app. Can some one tell me how to execute a Microsoft SQL query in R Shiny ?
This is what I have done so far:
data <- reactive({
conn <- reactive ({ databaseOpen(serverName="[serverName]", databaseName=[dbName])})
qr <- reactive ({ SELECT * from myTable })
res <- reactive ({databaseQuery(conn = conn,query = qr)})
close(conn)
View(res)
})
Any help is appreciated !
I was able to call a query by creating a function outside of the server and ui functions (in other words, in a global.r). Then the server function could call that query function using one of the inputs in the function.
Here is my code:
queryfunction <- function(zipper){
odbcChannel <- odbcConnect("myconnection")
querydoc <- paste0("
SELECT distinct *
FROM mydb
where substring(nppes_provider_zip,1,2) = '43'
and [provider_type] = 'General Practice'
")
pricetable <- sqlQuery(odbcChannel, querydoc)
close(odbcChannel)
pricetable[which(substring(pricetable$nppes_provider_zip,1,5)==zipper),]
}
server <- shinyServer(function(input, output) {
output$mytable1 <- renderDataTable(data.table(queryfunction(input$zip)))
})
I figured it out. It can be done as:
server.r
serverfun<-function(input, output){
# Storing values in myData variable
myData <- reactive({
# Opening database connection
conn <- databaseOpen(serverName = "myServer",databaseName = "myDB")
# Sample query which uses some input
qr <- paste( "SELECT name FROM Genes g WHERE Id = ",input$myId," ORDER BY name")
# Storing results
res <- databaseQuery(conn = conn,query = qr)
# closing database
databaseClose(conn)
# Returning results
res
})
output$tbTable <- renderTable({
# Checking if myData is not null
if(is.null(myData())){return ()}
# return myData
myData()
})
ui.r
library("shiny")
shinyUI(
pageWithSidebar(
headerPanel("Hide Side Bar example"),
sidebarPanel(
textInput("Id", "Enter ID below","1234")
),
mainPanel(
tabsetPanel(
tabPanel("Data", tableOutput("tbTable"))
)
)
)
)
i have many data.frames() that i am trying to send to MySQL database via RMySQL().
# Sends data frame to database without a problem
dbWriteTable(con3, name="SPY", value=SPY , append=T)
# stock1 contains a character vector of stock names...
stock1 <- c("SPY.A")
But when I try to loop it:
i= 1
while(i <= length(stock1)){
# converts "SPY.A" into SPY
name <- print(paste0(str_sub(stock1, start = 1, end = -3))[i], quote=F)
# sends data.frame to database
dbWriteTable(con3,paste0(str_sub(stock1, start = 1, end = -3))[i], value=name, append=T)
i <- 1+i
}
The following warning is returned & nothing was sent to database
In addition: Warning message:
In file(fn, open = "r") :
cannot open file './SPY': No such file or directory
However, I believe that the problem is with pasting value onto dbWriteTable() since writing dbWriteTable(con3, "SPY", SPY, append=T) works but dbWriteTable(con3, "SPY", name, append=T) will not...
You are probably using a non-base package for str_sub and I'm guessing you get the same behavior with substr. Does this succeed?
dbWriteTable(con3, substr( stock1, 1,3) , get(stock1), append=T)
In a ModelForm I can write a clean_<field_name> member function to automatically validate and clean up data entered by a user, but what can I do about dirty json or csv files (fixtures) during a manage.py loaddata?
Fixtures loaded with loaddata are assumed to contain clean data that doen't need validation (usually as an inverse operation to a prior dumpdata), so the short answer is that loaddata isn't the approach you want if you need to clean your inputs.
However, you probably can use some of the underpinnings of loaddata while implementing your custom data cleaning code--I'm sure you can easily script something using the Django serialization libs to read your existing data files them in and the save the resulting objects normally after the data has been cleaned up.
In case others want to do something similar, I defined a model method to do the cleaning (so it can be called from ModelForms)
MAX_ZIPCODE_DIGITS = 9
MIN_ZIPCODE_DIGITS = 5
def clean_zip_code(self, s=None):
#s = str(s or self.zip_code)
if not s: return None
s = re.sub("\D","",s)
if len(s)>self.MAX_ZIPCODE_DIGITS:
s = s[:self.MAX_ZIPCODE_DIGITS]
if len(s) in (self.MIN_ZIPCODE_DIGITS-1,self.MAX_ZIPCODE_DIGITS-1):
s = '0'+s # FIXME: deal with other intermediate lengths
if len(s)>=self.MAX_ZIPCODE_DIGITS:
s = s[:self.MIN_ZIPCODE_DIGITS]+'-'+s[self.MIN_ZIPCODE_DIGITS:]
return s
Then wrote a standalone python script to clean up my legacy json files using any clean_ methods found among the models.
import os, json
def clean_json(app = 'XYZapp', model='Entity', fields='zip_code', cleaner_prefix='clean_'):
# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environ['DJANGO_SETTINGS_MODULE'] = app+".settings"
settings = __import__(app+'.settings').settings
models = __import__(app+'.models').models
fpath = os.path.join( settings.SITE_PROJECT_PATH, 'fixtures', model+'.json')
if isinstance(fields,(str,unicode)):
fields = [fields]
Ns = []
for field in fields:
try:
instance = getattr(models,model)()
except AttributeError:
print 'No model named %s could be found'%(model,)
continue
try:
cleaner = getattr(instance, cleaner_prefix+field)
except AttributeError:
print 'No cleaner method named %s.%s could be found'%(model,cleaner_prefix+field)
continue
print 'Cleaning %s using %s.%s...'%(fpath,model,cleaner.__name__)
fin = open(fpath,'r')
if fin:
l = json.load(fin)
before = len(l)
cleans = 0
for i in range(len(l)):
if 'fields' in l[i] and field in l[i]['fields']:
l[i]['fields'][field]=cleaner(l[i]['fields'][field]) # cleaner returns None to delete records
cleans += 1
fin.close()
after = len(l)
assert after>.5*before
Ns += [(before, after,cleans)]
print 'Writing %d/%d (new/old) records after %d cleanups...'%Ns[-1]
with open(fpath,'w') as fout:
fout.write(json.dumps(l,indent=2,sort_keys=True))
return Ns
if __name__ == '__main__':
clean_json()