Vectorize an S4 class in R - arrays

I am having trouble defining array-like classes in a way that makes them fully typed (as far as that is possible in R).
My example: I want to define a class Avector, which should contain an arbitrary number of elements of the class A.
# Define the base class
setClass("A", representation(x = "numeric"))
# Some magic needed ????
setClass("Avector", ???)
# In the end one should be able to use it as follows:
a1 <- new("A", x = 1)
a2 <- new("A", x = 2)
X <- new("Avector", c(a1, a2))
I am aware that having a vector of objects is not possible in R. So I guess it will be stored in a kind of "typed" list.
I have found a solution, but I am not happy with it:
# Define the vectorized class
setClass(
"Avector",
representation(X = "list"),
validity = function(.Object) {
if (all(sapply(.Object@X, function(x) class(x) == "A")))
TRUE
else
"'Avector' must be a list of elements in the class 'A'"
}
)
# Define a method to subscript the elements inside of the vector
setMethod(
"[", signature(x = "Avector", i = "ANY", j = "ANY"),
function(x, i, j, ...) x@X[[i]]
)
# Test the class
a1 <- new("A", x = 1)
a2 <- new("A", x = 2)
avec <- new("Avector", X = list(a1, a2))
# Retrieve the element in index i
avec[i]
This method appears more like a hack to me. Is there a way to do this in a canonical way in R without doing this type checking and indexing method by hand?
Edit:
This should also hold if the class A does not consist of atomic slots, for example:
setClass("A", representation(x = "data.frame"))
I would be glad for help :)
Cheers,
Adrian

The answer depends somewhat on what you are trying to accomplish, and may or may not be possible in your use case. S4 is designed around high-level objects, so that the per-object overhead does not become excessive.
Generally, it is necessary to have the slots be vectors. You can't define new atomic types from within R. So in your toy example instead of calling
avec <- new("Avector", X = list(a1, a2))
you call
avec <- new("A", x = c(1, 2))
This may necessitate other slots (which were previously vectors) becoming arrays, for example.
If you're desperate to have an atomic type, then you might be able to over-ride one of the existing types. I think the bit64 package does this, for example. Essentially what you do is make a new class that inherits from, say, numeric and then write lots of methods that supersede all the default ones for your new class.
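As a rough illustration of the "make the slots vectors" advice, here is a sketch in Python rather than R. The class name `AVec` and its methods are hypothetical and not part of any package: one object holds the whole column of values, validation runs once per column, and indexing hands back a smaller object of the same type.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AVec:
    """Hypothetical columnar container: one object holding many x values."""
    x: List[float] = field(default_factory=list)

    def __post_init__(self):
        # Validate once for the whole column instead of once per element.
        if not all(isinstance(v, (int, float)) for v in self.x):
            raise TypeError("x must contain only numbers")

    def __len__(self):
        return len(self.x)

    def __getitem__(self, i):
        # Indexing returns another AVec, so the result stays typed.
        if isinstance(i, slice):
            return AVec(self.x[i])
        return AVec([self.x[i]])


avec = AVec([1.0, 2.0])
first = avec[0]  # an AVec holding only the first value
```

The trade-off is the one described above: every slot that used to hold a single value now has to hold a vector (or array) of them.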

Related

Haskell Array Pattern in a function

Hi, total Haskell beginner here: what does the pattern in a function for an array look like? For example, I simply want to add 1 to the first element of my array:
> a = array (1,10) ((1,1) : [(i,( i * 2)) | i <- [2..10]])
My first thought was:
> arraytest :: Array (Int,Int) Int -> Array (Int,Int) Int
> arraytest (array (mn,mx) (a,b):xs) = (array (mn,mx) (a,b+1):xs)
I hope you understand my problem :)
You can't pattern match on arrays because the data declaration in the Data.Array.IArray module for the Array type doesn't have any of its data constructors exposed. This is a common practice in Haskell because it allows the author to update the internal representation of their data type without making a breaking change for users of their module.
The only way to use an Array, therefore, is to use the functions provided by the module. To access the first value in an array, you can use a combination of bounds and (!), or take the first key/value pair from assocs. Then you can use (//) to make an update to the array.
arraytest arr = arr // [(index, value + 1)]
where
index = fst (bounds arr)
value = arr ! index
If you choose to use assocs, you can pattern match on its result:
arraytest arr = arr // [(index, value + 1)]
where
(index, value) = head (assocs arr) -- `head` will crash if the array is empty
Or you can make use of the Functor instances for lists and tuples:
arraytest arr = arr // take 1 (fmap (fmap (+1)) (assocs arr))
You will probably quickly notice, though, that the array package is lacking a lot of convenience functions. All of the solutions above are fairly verbose compared to how the operation would be implemented in other languages.
To fix this, we have the lens package (and its cousins), which add a ton of convenience functions to Haskell and make packages like array much more bearable. This package has a fairly steep learning curve, but it's used very commonly and is definitely worth learning.
import Control.Lens
arraytest arr = arr & ix (fst (bounds arr)) +~ 1
If you squint your eyes, you can almost see how it says arr[0] += 1, but we still haven't sacrificed any of the benefits of immutability.
This is more like an extended comment on @4castle's answer. You cannot pattern match on an Array because its implementation is hidden; you must use its public API to work with it. However, you can use the public API to define such a pattern (with the appropriate language extensions):
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}
-- PatternSynonyms: Define patterns without actually defining types
-- ViewPatterns: Construct patterns that apply functions as well as match subpatterns
import Control.Arrow((&&&)) -- solely to dodge an ugly lambda; inline if you wish
import Data.Array -- Array, Ix, array, bounds, assocs
pattern Array :: Ix i => (i, i) -> [(i, e)] -> Array i e
-- the type signature hints that this is the array function but bidirectional
pattern Array bounds' assocs' <- ((bounds &&& assocs) -> (bounds', assocs'))
-- When matching against Array bounds' assocs', apply bounds &&& assocs to the
-- incoming array, and match the resulting tuple to (bounds', assocs')
where Array = array
-- Using Array in an expression is the same as just using array
arraytest (Array bs ((i,x):xs)) = Array bs ((i,x+1):xs)
I'm fairly sure that the conversions to and from [] make this absolutely abysmal for performance.

Optimizing function speed on 3D array

I am applying a user-defined function to individual cells of a 3D array. The contents of each cell are one of the following possibilities, all of which are character vectors because of prior formatting:
"N"
"A"
""
"1"
"0"
I want to create a new 3D array of the same dimensions, where cells contain either NA or a numeric vector containing 1 or 0. Thus, I wrote a function named Numericize and used aaply to apply it to the entire array. However, it takes forever to apply it.
Numericize <- function(x){
if(!is.na(x)){
x[x=="N"] <- NA; x
x[x=="A"] <- NA; x
x[x==""] <- NA; x
x <- as.integer(x)
}
return(x)
}
The dimensions of the original array are 480x866x366. The function takes forever to apply using the following code:
Final.Daily.Array <- aaply(.data = Complete.Daily.Array,
.margins = c(1,2,3),
.fun = Numericize,
.progress = "text")
I am unsure if the speed issue comes from an inefficient Numericize, an inefficient aaply, or something else entirely. I considered trying to set up parallel computing using the plyr package but I wouldn't think that such a simple command would require parallel processing.
On one hand I am concerned that I created a stack overflow for myself (see this for more), but I have applied other functions to similar arrays without problems.
ex.array <- array(dim = c(3,3,3))
ex.array[,,1] <- c("N","A","","1","0","N","A","","1")
ex.array[,,2] <- c("0","N","A","","1","0","N","A","")
ex.array[,,3] <- c("1","0","N","A","","1","0","N","A")
desired.array <- array(dim = c(3,3,3))
desired.array[,,1] <- c(NA,NA,NA,1,0,NA,NA,NA,1)
desired.array[,,2] <- c(0,NA,NA,NA,1,0,NA,NA,NA)
desired.array[,,3] <- c(1,0,NA,NA,NA,1,0,NA,NA)
ex.array
desired.array
Any suggestions?
You can just use a vectorized approach:
ex.array[ex.array %in% c("", "N", "A")] <- NA
storage.mode(ex.array) <- "integer"
Alternatively, the second line alone is enough: coercion to integer turns the non-numeric strings into NA (R will emit a warning about it).

Nested array slicing

Let's say I have an array of vectors:
""" simple line equation """
function getline(a::Array{Float64,1},b::Array{Float64,1})
line = Vector[]
for i=0:0.1:1
vector = (1-i)a+(i*b)
push!(line, vector)
end
return line
end
This function returns an array of vectors containing x-y positions
Vector[11]
> Float64[2]
> Float64[2]
> Float64[2]
> Float64[2]
.
.
.
Now I want to separate all x and y coordinates of these vectors to plot them with PlotlyJS.
I have already tested some approaches with no success!
What is a correct way in Julia to achieve this?
You can broadcast getindex:
xs = getindex.(vv, 1)
ys = getindex.(vv, 2)
Edit 3:
Alternatively, use list comprehensions:
xs = [v[1] for v in vv]
ys = [v[2] for v in vv]
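For comparison, the same column extraction can be sketched in Python. The `getline` below is a hypothetical stand-in for the Julia function in the question, and `vv` is its result. Note that Python indexes from 0, whereas the Julia snippets above index from 1.

```python
def getline(a, b, steps=10):
    """Hypothetical stand-in for the Julia getline: steps + 1 points from a to b."""
    return [
        [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]
        for t in (k / steps for k in range(steps + 1))
    ]


vv = getline([1.0, 2.0], [3.0, 4.0])

# One comprehension per coordinate, mirroring the Julia comprehensions.
xs = [v[0] for v in vv]
ys = [v[1] for v in vv]

# zip(*vv) transposes the list of points into one tuple per coordinate.
xs2, ys2 = (list(col) for col in zip(*vv))
```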
Edit:
For performance reasons, you should use StaticArrays to represent 2D points. E.g.:
getline(a,b) = [(1-i)a+(i*b) for i=0:0.1:1]
p1 = SVector(1.,2.)
p2 = SVector(3.,4.)
vv = getline(p1,p2)
Broadcasting getindex and list comprehensions will still work, but you can also reinterpret the vector as a 2×11 matrix:
to_matrix{T<:SVector}(a::Vector{T}) = reinterpret(eltype(T), a, (size(T,1), length(a)))
m = to_matrix(vv)
Note that this does not copy the data. You can simply use m directly or define, e.g.,
xs = @view m[1,:]
ys = @view m[2,:]
Edit 2:
Btw., not restricting the type of the arguments of the getline function has many advantages and is preferred in general. The version above will work for any type that implements multiplication with a scalar and addition, e.g., a possible implementation of immutable Point ... end (making it fully generic will require a bit more work, though).

How to hold a list of mutable values in Haskell?

I would like to have an array, say:
myArray = [1,2,3,4,5,6,7,8,9]
and be able to run a function that changes a value in the list to another value. I would like to be able to run this function several times with myArray updating to the new set of numbers after each run.
myArray = [1,2,3,4,5,6,7,8,9]
>>> f 1 5 myArray
>>> myArray
[1,2,3,4,1,6,7,8,9]
>>> f 3 8 myArray
>>> myArray
[1,2,3,4,1,6,7,3,9]
How do I create a holder for values that can change?
Thanks!
All Haskell values are immutable. You can't change a value that's bound to a name (you can shadow them in GHCi, but that's a slightly different thing).
If you want to achieve true mutability [1], you need an immutable reference to mutable data. To use those, typically you'd want to be in a monadic context.
Here's an example using a rather low-level reference type called IORef:
import Data.IORef
import Control.Monad
f :: [Int] -> [Int]
f = map (+1)
main = do
a <- newIORef [1,2,3,4,5]
readIORef a >>= print
readIORef a >>= (return . f) >>= writeIORef a
readIORef a >>= print
Note that the value of a doesn't change; it still points to the same "value location". What changes is the actual value that's being pointed to.
That being said, this requires using the IO monad which is generally frowned upon. Depending on your needs, a fully pure solution like State might work better.
import Control.Monad.State -- from the mtl package
-- assume previous f
g :: State [Int] ()
g = modify f
Now you only need to start with some state, and the state monad will chain the modifications for you, like so:
main = print $ execState (g >> g >> g) [1,2,3,4,5]
This is essentially equivalent to simple composition:
f . f . f $ [1,2,3,4,5]
Which, last but not least, could be your default go-to solution in Haskell.
P.S. I'm using a simpler f in my examples, but there's no reason you couldn't do:
(f 1 5) . (f 3 8) $ myArray
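The question never defines `f`, so here is a language-neutral sketch in Python of what the composition approach amounts to. It assumes `f new old` replaces the value `old` with `new` (the question's output is equally consistent with replacing by position). The names `replace` and `compose` are hypothetical helpers. Each step returns a new list and the original is never modified.

```python
from functools import reduce


def replace(new, old, xs):
    """Return a new list with every occurrence of `old` swapped for `new`."""
    return [new if x == old else x for x in xs]


def compose(*fs):
    """Right-to-left function composition, like Haskell's (.)."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fs), x)


my_array = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Mirrors (f 1 5) . (f 3 8): the rightmost step runs first.
step = compose(lambda xs: replace(1, 5, xs), lambda xs: replace(3, 8, xs))
updated = step(my_array)
```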
[1] This is somewhat ambiguous, but for the sake of simplicity I'd expand this to "the one that could be backed by direct memory operations".

Matrix as Applicative functor, which is not Monad

I ran into examples of Applicatives that are not Monads. I like the multi-dimensional array example, but I did not understand it completely.
Let's take a matrix M[A]. Could you show, with Scala code, that M[A] is an Applicative but not a Monad? Do you have any "real-world" examples of using matrices as Applicatives?
Something like M[T] <*> M[T => U] is applicative:
val A = [[1,2],[1,2]] //let's assume such imaginary syntax for arrays
val B = [[*2, *3], [*5, *2]]
A <*> B === [[2,6],[5,4]]
There may be more complex applicatives, in signal processing for example. Using applicatives lets you build one matrix of functions (each doing N or fewer element operations) and perform a single matrix operation instead of N.
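The imaginary-syntax example above can be made concrete with a small sketch. It is written in Python rather than Scala, and the names `ap` and `pure` are hypothetical. It assumes the element-wise ("zip-like") applicative on matrices of a fixed shape, where the function at position (i, j) is applied to the value at position (i, j).

```python
def pure(value, rows, cols):
    """Fill a rows x cols matrix with the same value (or function)."""
    return [[value for _ in range(cols)] for _ in range(rows)]


def ap(values, funcs):
    """Element-wise application: funcs[i][j] is applied to values[i][j]."""
    if len(values) != len(funcs) or any(
        len(v) != len(f) for v, f in zip(values, funcs)
    ):
        raise ValueError("matrices must have the same shape")
    return [
        [f(x) for x, f in zip(vrow, frow)]
        for vrow, frow in zip(values, funcs)
    ]


A = [[1, 2], [1, 2]]
B = [
    [lambda x: x * 2, lambda x: x * 3],
    [lambda x: x * 5, lambda x: x * 2],
]
result = ap(A, B)  # [[2, 6], [5, 4]]
```

Mapping a single function over the matrix is then just `ap(A, pure(g, 2, 2))` for any function `g`.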
A matrix is not a monoid by definition: you first have to define "+" (concatenation) between matrices (more precisely, a fold). And not every matrix, even a monoidal one, is a monad: you additionally have to define fmap (not flatMap, just map in Scala) to make it a Functor (an endofunctor if it returns a matrix). By default, a matrix is not a Functor, Monoid, or Monad (Functor + Monoid).
About monadic matrices: a matrix can be a monoid. You may define dimension-bound concatenation for matrices that have the same size along the orthogonal dimension. Dimension- and size-independent concatenation would look something like:
val A = [[11,12],[21,22]]; val B = [[11,12,13],[21,22,23],[31,32,33]]
A + B === [[11,12,0,0,0], [21,22,0,0,0], [0,0,11,12,13],[0,0,21,22,23],[0,0,31,32,33]]
Identity element will be []
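The size-independent concatenation above places the two matrices along the diagonal and pads the rest with zeros. A minimal sketch in Python follows; `block_concat` is a hypothetical name, and 0 is assumed to be the padding element.

```python
def block_concat(a, b):
    """Place a and b along the diagonal, padding the other cells with 0."""
    a_cols = len(a[0]) if a else 0
    b_cols = len(b[0]) if b else 0
    top = [row + [0] * b_cols for row in a]
    bottom = [[0] * a_cols + row for row in b]
    return top + bottom


A = [[11, 12], [21, 22]]
B = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
combined = block_concat(A, B)
```

With the empty matrix `[]` as the identity, the operation is associative, which is what the monoid laws require.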
So you can also build the monad (pseudocode again):
def flatMap[T, U](a: M[T])(f: T => M[U]) = {
val mapped = a.map(f)// M[M[U]] // map
def normalize(xn: Int, yn: Int) = ... // complete matrix with zeros to strict xn * yn size
mapped.map(normalize(mapped.max(_.xn), mapped.max(_.yn)))
.reduceHorizontal(_ concat _)
.reduceVertical(_ concat _) // flatten
}
val res = flatMap([[1,1],[2,1]], x => if(x == 1)[[2,2]] else [[3,3,3]])
res === [[2,2,0,2,2],[3,3,3,2,2]]
Unfortunately, you need a zero element (or some other default) for T itself, not only for the monoid. This does not make T a magma (no binary operation on T is required, only a constant defined for T), but it may create additional problems, depending on your use case.
