rscala package: How to access the elements of a Scala cached reference Array in R

I am using rscala to communicate between Scala and R. Let's say I have a Scala function that returns an Array holding several Array[Double] values and a Double:
Array(projectedTrainToMatrix,predictedTrain,kernelParam,projection,thresholdsToArray)
where kernelParam is of type Double and the others are Array[Double]. When I run the method from R as:
myfit<-s$.cristinahg.ocapis.kdlor$kdlorfit(traindata,trainlabels,kerneltype,params)
I get myfit as
ScalaCachedReference... _: Array[Array[_]]
[Ljava.lang.Object;@b9dfc5a
But I want to access each of the values in the myfit array. I have tried to access them via myfit$'(1)', but instead of the desired Array of Double I get this:
function (..., .AS.REFERENCE = NA, .EVALUATE = TRUE, .PARENTHESES = FALSE)
{
args <- list(...)
if (!is.null(names(args)))
stop("Arguments should not have names.")
names <- paste0(rep("$", length(args)), seq_len(length(args)))
header <- mkHeader(args, names)
identifier <- ""
body <- if (inherits(reference, "ScalaInterpreterReference"))
paste0(reference[["identifier"]], ".", method)
else if (inherits(reference, "ScalaCachedReference")) {
if (.EVALUATE) {
identifier <- reference[["identifier"]]
paste0("R.cached($0).asInstanceOf[", reference[["type"]],
"].", method)
}
else {
paste0("R.cached(\"", reference[["identifier"]],
"\").asInstanceOf[", reference[["type"]], "].",
method)
}
}
else if (inherits(reference, "ScalaInterpreterItem")) {
if (method == "new")
paste0("new ", reference[["snippet"]])
else paste0(reference[["snippet"]], ".", method)
}
else stop("Unrecognized reference type.")
argsList <- paste0(names, collapse = ",")
if ((nchar(argsList) > 0) || .PARENTHESES)
argsList <- paste0("(", argsList, ")")
snippet <- paste0(header, paste0(body, argsList))
if (get("show.snippet", envir = interpreter[["env"]]))
cat("<<<\n", snippet, "\n>>>\n", sep = "")
cc(interpreter)
wb(interpreter, DEF)
wc(interpreter, snippet)
flush(interpreter[["socketIn"]])
status <- rb(interpreter, "integer")
if (status != OK) {
if (get("serializeOutput", envir = interpreter[["env"]]))
echoResponseScala(interpreter)
stop("Problem defining function.")
}
functionName <- rc(interpreter)
if (get("serializeOutput", envir = interpreter[["env"]]))
echoResponseScala(interpreter)
f <- function(..., .NBACK = 1) {
args <- list(...)
if (length(args) != length(names))
stop("Incorrect number of arguments.")
if (!is.null(names(args)))
stop("Arguments should not have names.")
workspace <- new.env(parent = parent.frame(.NBACK))
assign(".rsI", interpreter, envir = workspace)
for (i in seq_len(length(args))) assign(names[i], args[[i]],
envir = workspace)
cc(interpreter)
wb(interpreter, INVOKE)
wc(interpreter, functionName)
wc(interpreter, identifier)
flush(interpreter[["socketIn"]])
rServe(interpreter, TRUE, workspace)
status <- rb(interpreter, "integer")
if (get("serializeOutput", envir = interpreter[["env"]]))
echoResponseScala(interpreter)
if (status != OK)
stop("Problem invoking function.")
result <- scalaGet(interpreter, "?", .AS.REFERENCE)
if (is.null(result))
invisible(result)
else result
}
if (.EVALUATE)
f(..., .NBACK = 2)
else f
}
<bytecode: 0x55b1ba98b3f8>
<environment: 0x55b1bd2616c0>
So how can I access each element of the Scala Array in R?

Your example shows that you are using rscala < 3.0.0. While it could be done in older versions, I recommend you use a recent version on CRAN. Below I provide a solution using rscala 3.1.0.
library(rscala)
scala()
s + '
def kdlorfit(
    projectedTrainToMatrix: Array[Double], predictedTrain: Array[Double],
    kernelParam: Double, projection: Array[Double], thresholdsToArray: Array[Double]) =
{
  Array(projectedTrainToMatrix,predictedTrain,kernelParam,projection,thresholdsToArray)
}
'
x1 <- c(1,2,3)
x2 <- c(11,12,13)
x3 <- 34
x4 <- c(100,110,120)
x5 <- c(50,51)
myfit <- s$kdlorfit(x1,x2,x3,x4,x5)
scalaType(myfit)
identical(x1,myfit(0L)$"asInstanceOf[Array[Double]]"())
identical(x3,myfit(2L)$"asInstanceOf[Double]"())
Note the need to cast using asInstanceOf because the Scala type of myfit is Array[Any].
If the function returned Array[Array[Double]] instead of Array[Any], no casting would be needed, as shown below.
s + '
def kdlorfit2(
    projectedTrainToMatrix: Array[Double],
    predictedTrain: Array[Double],
    kernelParam: Array[Double],
    projection: Array[Double],
    thresholdsToArray: Array[Double]) =
{
  Array(projectedTrainToMatrix,predictedTrain,kernelParam,projection,thresholdsToArray)
}
'
myfit <- s$kdlorfit2(x1,x2,I(x3),x4,x5)
scalaType(myfit)
identical(x1,myfit(0L))
identical(x3,myfit(2L))
Note that, when calling kdlorfit2, the argument x3 is passed as Array[Double] because it is wrapped in I(). Without wrapping, it is passed as a Double, as in the previous example.

Related

Get related articles based on tags in Ruby

I’m trying to display a related section based on the article’s tags. Any articles that have similar tags should be displayed.
The idea is to iterate the article’s tags and see if any other articles have those tags.
If yes, then add that article to a related = [] array of articles I can retrieve later.
Article A: tags: [chris, mark, scott]
Article B: tags: [mark, scott]
Article C: tags: [alex, mike, john]
Article A is related to Article B, and vice versa.
Here’s the code:
files = Dir[ROOT + 'articles/*']
# parse file
def parse(fn)
  res = meta(fn)
  res[:body] = PandocRuby.new(body(fn), from: 'markdown').to_html
  res[:pagedescription] = res[:description]
  res[:taglist] = []
  if res[:tags]
    res[:tags] = res[:tags].map do |x|
      res[:taglist] << '%s' % [x, x]
      '%s' % [x, x]
    end.join(', ')
  end
  res
end
# get related articles
def related_articles(articles)
  related = []
  articles[:tags].each do |tag|
    articles.each do |item|
      if item[:tags] != nil && item[:tags].include?(tag)
        related << item unless articles.include?(item)
      end
    end
  end
  related
end
articles = files.map {|fn| parse(fn)}.sort_by {|x| x[:date]}
articles = related_articles(articles)
Throws this error:
no implicit conversion of Symbol into Integer (TypeError)
Another thing I tried was this:
# To generate related articles
def related_articles(articles)
  related = []
  articles.each do |article|
    article[:tags].each do |tag|
      articles.each do |item|
        if item[:tags] != nil && item[:tags].include?(tag)
          related << item unless articles.include?(item)
        end
      end
    end
  end
  related
end
But now the error says:
undefined method `each' for "tagname":String (NoMethodError)
Help a Ruby noob? What am I doing wrong? Thanks!
As an aside to the main question, I tried rewriting the tag section of the code, but still no luck:
res[:taglist] = []
if res[:tags]
  res[:tags] = res[:tags].map do |x|
    res[:taglist] << '' + x + ''
    '' + x + ''
  end.join(', ')
end
In your first attempt, the problem is articles[:tags]: articles is an array, so you cannot index it with a symbol key.
The second attempt fails because article[:tags] is a string: the parse function takes the original tags, transforms each one to HTML, and then joins them. The :taglist key still contains an array, so you could use that instead.
Finally, the "related" array should be per-article, so neither implementation could solve your issue: both return a single array for the whole set of articles.
You probably need two passes:
def parse(fn)
  res = meta(fn)
  res[:body] = PandocRuby.new(body(fn), from: 'markdown').to_html
  res[:pagedescription] = res[:description]
  res[:tags] ||= [] # and don't touch it
  res[:tags_as_links] = res[:tags].map { |x| "#{x}" }
  res[:tags_as_string] = res[:tags_as_links].join(', ')
  res
end
articles = files.map { |fn| parse(fn) }
# convert each article into a hash like
# {tag1 => [self], tag2 => [self]}
# and then reduce by merge
taggings = articles
  .map { |a| a[:tags].product([[a]]).to_h }
  .reduce { |a, b| a.merge(b) { |_, v1, v2| v1 | v2 } }
# now read them back into the articles
articles.each do |article|
  article[:related] = article[:tags]
    .flat_map { |tag| taggings[tag] }
    .uniq
  # remove the article itself
  article[:related] -= [article]
end

Sink and Source are different length Vecs

The following code is meant to concatenate two strings. It compiles, but an error is reported after elaboration.
Code:
package problems
import chisel3._
import chisel3.util._
class Compare1 extends Module {
  val io = IO(new Bundle {
    val in1 = Input(Vec(5, UInt(3.W)))
    val in2 = Input(Vec(5, UInt(3.W)))
    val out = Output(Vec(6, UInt(3.W)))
  })
  val L = 5
  io.out := io.in2
  val ml = 4
  for (l <- 0 until ml) {
    when (io.in2(l) === io.in1(L - ml + l)) {
      io.out(l) := io.in1(l)
    }
  }
  val m = (2*L) - ml
  for (i <- ml until m) {
    io.out(i) := io.in2(i - (L - ml))
  }
}
Testbench:
I am poking 19333 and 23599 and expecting 154671
Error:
To sum it up, this is what I get
Errors: 1: in the following tutorials
Tutorial Compare1: exception Connection between sink (Vec(chisel3.core.UInt@80, chisel3.core.UInt@82, chisel3.core.UInt@84, chisel3.core.UInt@86, chisel3.core.UInt@88, chisel3.core.UInt@8a)) and source (Vec(chisel3.core.UInt@6a, chisel3.core.UInt@6c, chisel3.core.UInt@6e, chisel3.core.UInt@70, chisel3.core.UInt@72)) failed @: Sink and Source are different length Vecs.
The error is in the line io.out := io.in2: io.out is a Vec of length 6, while io.in2 is a Vec of length 5. As the error says, you cannot connect Vecs of different lengths.
If you wish to connect indices 0 to 4 of io.in2 to io.out, try
for (i <- 0 until io.in2.size) { io.out(i) := io.in2(i) }

OpenMDAO v0.13: performing an optimization when using multiple instances of a components initiated in a loop

I am setting up an optimization in OpenMDAO v0.13 using several components that are each instantiated many times. My assembly seems to work fine with the default driver, but when I run it with an optimizer it does not solve: the optimizer simply runs once with the given inputs and returns the answer for those inputs. I am not sure what the issue is, and I would appreciate any insights. I have included a simple code example that mimics my structure and reproduces the error. I think the problem is in the connections: summer.fs does not update after initialization.
from openmdao.main.api import Assembly, Component
from openmdao.lib.datatypes.api import Float, Array, List
from openmdao.lib.drivers.api import DOEdriver, SLSQPdriver, COBYLAdriver, CaseIteratorDriver
from pyopt_driver.pyopt_driver import pyOptDriver
import numpy as np
class component1(Component):
    x = Float(iotype='in')
    y = Float(iotype='in')
    term1 = Float(iotype='out')
    a = Float(iotype='in', default_value=1)
    def execute(self):
        x = self.x
        a = self.a
        term1 = a*x**2
        self.term1 = term1
        print "In comp1", self.name, self.a, self.x, self.term1
    def list_deriv_vars(self):
        return ('x',), ('term1',)
    def provideJ(self):
        x = self.x
        a = self.a
        dterm1_dx = 2.*a*x
        J = np.array([[dterm1_dx]])
        print 'In comp1, J = %s' % J
        return J
class component2(Component):
    x = Float(iotype='in')
    y = Float(iotype='in')
    term1 = Float(iotype='in')
    f = Float(iotype='out')
    def execute(self):
        y = self.y
        x = self.x
        term1 = self.term1
        f = term1 + x + y**2
        self.f = f
        print "In comp2", self.name, self.x, self.y, self.term1, self.f
class summer(Component):
    total = Float(iotype='out', desc='sum of all f values')
    def __init__(self, size):
        super(summer, self).__init__()
        self.size = size
        self.add('fs', Array(np.ones(size), iotype='in', desc='f values from all cases'))
    def execute(self):
        self.total = sum(self.fs)
        print 'In summer, fs = %s and total = %s' % (self.fs, self.total)
class assembly(Assembly):
    x = Float(iotype='in')
    y = Float(iotype='in')
    total = Float(iotype='out')
    def __init__(self, size):
        super(assembly, self).__init__()
        self.size = size
        self.add('a_vals', Array(np.zeros(size), iotype='in', dtype='float'))
        self.add('fs', Array(np.zeros(size), iotype='out', dtype='float'))
        print 'in init a_vals = %s' % self.a_vals
    def configure(self):
        # self.add('driver', SLSQPdriver())
        self.add('driver', pyOptDriver())
        self.driver.optimizer = 'SNOPT'
        # self.driver.pyopt_diff = True
        #create this first, so we can connect to it
        self.add('summer', summer(size=len(self.a_vals)))
        self.connect('summer.total', 'total')
        print 'in configure a_vals = %s' % self.a_vals
        # create instances of components
        for i in range(0, self.size):
            c1 = self.add('comp1_%d'%i, component1())
            c1.missing_deriv_policy = 'assume_zero'
            c2 = self.add('comp2_%d'%i, component2())
            self.connect('a_vals[%d]' % i, 'comp1_%d.a' % i)
            self.connect('x', ['comp1_%d.x'%i, 'comp2_%d.x'%i])
            self.connect('y', ['comp1_%d.y'%i, 'comp2_%d.y'%i])
            self.connect('comp1_%d.term1'%i, 'comp2_%d.term1'%i)
            self.connect('comp2_%d.f'%i, 'summer.fs[%d]'%i)
            self.driver.workflow.add(['comp1_%d'%i, 'comp2_%d'%i])
        self.connect('summer.fs[:]', 'fs[:]')
        self.driver.workflow.add(['summer'])
        # set up main driver (optimizer)
        self.driver.iprint = 1
        self.driver.maxiter = 100
        self.driver.accuracy = 1.0e-6
        self.driver.add_parameter('x', low=-5., high=5.)
        self.driver.add_parameter('y', low=-5., high=5.)
        self.driver.add_objective('summer.total')
if __name__ == "__main__":
    """ the result should be -1 at (x, y) = (-0.5, 0) """
    import time
    from openmdao.main.api import set_as_top
    a_vals = np.array([1., 1., 1., 1.])
    test = set_as_top(assembly(size=len(a_vals)))
    test.a_vals = a_vals
    print test.a_vals
    test.x = 2.
    test.y = 2.
    tt = time.time()
    test.run()
    print "Elapsed time: ", time.time()-tt, "seconds"
    print 'result = ', test.summer.total
    print '(x, y) = (%s, %s)' % (test.x, test.y)
    print test.fs
I played around with your model, and found that the following line caused problems:
#self.connect('summer.fs[:]', 'fs[:]')
When I commented it out, I got the optimization to move.
I am not sure what is happening there, but the graph transformations sometimes have some issues with component input nodes that are promoted as outputs on the assembly boundary. If you still want those values to be available on the assembly, you could try promoting the outputs from the comp2_n components instead.

how to return a data frame after subsetting its columns with NAs removed

I'm working on a problem from Coursera and get lost at the end, where I have to extract both selected columns and return a data frame. There is a link to a similar problem at (Empty rows in list as NA values in data.frame in R).
rankall <- function(outcome, num = 'best'){
  data<- read.csv('outcome-of-care-measures.csv', colClasses = 'character')
  if(!outcome %in% c('heart attack', 'heart failure', 'penumonia')){
    stop('invalid outcome')
  }
  states <- sort(unique(state))
  for (i in 1:length(state)){
    statedata <- data[data$State == state[i], ]
    if(outcome == 'heart attack'){
      index <- as.numeric(statedata[,11])
    } else if(outcome == 'heart failure'){
      index <- as.numeric(statedata[,17])
    } else if(outcome == 'pneumonia'){
      index <- as.numeric(statedata[,23])
    }
    #sort by mortality rate and hospital name
    sorteddata <- data[order(data[,index],data$Hospital.Name, na.rm = TRUE)]
    #rank by state
    staterank <- function(state){
      hospital_state <- subset(sorteddata, State == state)
    }
    #choose rows at each num value, this where I get stuck
    if(!is.numeric(num)){
      if(num == 'best'){
        num <- 1}
      else if(num == 'worst'){
        num <- length(hospital_state)}
    }
    hospital_state[num]
  }

Test page for mod_wsgi

mod_python has a test page script which emits information about the server configuration. You can put
SetHandler mod_python
PythonHandler mod_python.testhandler
into your .htaccess and it displays the page.
Now my question: Does something similar exist for mod_wsgi as well?
No. You can create something kind of helpful by iterating over the keys of environ, though:
def application(env, respond):
    respond('200 OK', [('Content-Type', 'text/plain')])
    return ['\n'.join('%s: %s' % (k, v) for (k, v) in env.iteritems())]
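The snippet above is written for Python 2 (it relies on iteritems and returns str). As a rough sketch only, assuming Python 3 and a PEP 3333 compliant server, the same idea could look like the following; the sorting of keys and the Content-Length header are additions for illustration, not part of the original answer:

```python
def application(environ, start_response):
    # One "key: value" line per entry of the WSGI environ dict, sorted by key.
    # %r is used so non-string values (file objects, tuples) are shown safely.
    lines = ['%s: %r' % (key, value) for key, value in sorted(environ.items())]
    # PEP 3333 requires the response body to be an iterable of bytes.
    body = '\n'.join(lines).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

To try it without Apache, the callable can be handed to wsgiref.simple_server.make_server from the standard library and opened in a browser.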
I have now put together something like a test page here. For your convenience, I'll share it with you here:
def tag(t, **k):
    kk = ''.join(' %s=%r' % kv for kv in k.items())
    format = '<%s%s>%%s</%s>' % (t, kk, t)
    return lambda content: format % content
def table(d):
    from cgi import escape
    escq = lambda s: escape(s, quote=True)
    tr = tag('tr')
    th = tag('th')
    td_code = lambda content: tag('td')(tag('code')(content))
    return tag('table', border='1')(''.join((
        '\n\t' + tr(th('Key') + th('Value') + th('Repr')) + '\n',
        ''.join(('\t' + tr(td_code('%s') + td_code('%s') + td_code('%s')) + '\n') % (k, escq(str(v)), escq(repr(v))) for k, v in sorted(d.items())),
    ))) + '\n'
def application(environ, start_response):
    import os
    l = []
    from wsgiref.headers import Headers
    h = Headers(l)
    h.add_header('Content-Type', 'text/html')
    start_response('200 OK', l)
    yield '<html><head><title>my mod_wsgi test page</title></head><body>\n'
    # yield '<h3>General information</h3>\n'
    # yield table({})
    yield '<h3>Process info</h3>\n'
    yield table(dict(
        wd=os.getcwd(),
        pid=os.getpid(),
        ppid=os.getppid(),
        uid=os.getuid(),
        gid=os.getgid(),
    ))
    yield '<h3>Environment</h3>\n'
    yield table(environ)
