Efficiently `apply` on array and preserve structure

I have an array of matrices.
dims <- c(10000,5,5)
mat_array <- array(rnorm(prod(dims)), dims)
I would like to perform a matrix-based operation (e.g. inversion via the solve function) on each matrix, but preserve the full structure of the array.
So far, I have come up with 3 options:
Option 1: A loop, which does exactly what I want, but is clunky and inefficient.
mat_inv <- array(NA, dims)
for(i in 1:dims[1]) mat_inv[i,,] <- solve(mat_array[i,,])
Option 2: The apply function, which is faster and cleaner, BUT squishes each matrix down to a vector.
mat_inv <- apply(mat_array, 1, solve)
dim(mat_inv)
[1] 25 10000
I know I can set the output dimensions to match those of the input, but I'm wary of doing this and messing up the indexing, especially if I had to apply over non-adjacent dimensions (e.g. if I wanted to invert across dimension 2).
Option 3: The aaply function from the plyr package, which does exactly what I want, but is MUCH slower (4-5x) than the others.
mat_inv <- plyr::aaply(mat_array, 1, solve)
Are there any options that combine the speed of base::apply with the versatility of plyr::aaply?
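One option worth trying (a sketch, not from the original post) is to keep base::apply for speed and then restore the structure explicitly. apply() returns each solved matrix as a column-major 25-vector, so a reshape followed by a single aperm() puts the iteration index back in front:
mat_inv <- apply(mat_array, 1, solve)                    # 25 x 10000
mat_inv <- array(mat_inv, c(dims[2], dims[3], dims[1]))  # 5 x 5 x 10000
mat_inv <- aperm(mat_inv, c(3, 1, 2))                    # 10000 x 5 x 5; mat_inv[i,,] == solve(mat_array[i,,])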

Related

Inflate array (add additional dimension with copies of itself)

I need to perform basic calculations on arrays of different dimensionality.
The recommended solution seems to be to inflate all arrays to match size.
For example, take array B with dimensions 10x100 and array C with dimensions 10x10000. The result (e.g. from multiplication) should be an array D of size 10x100x10000.
Therefore I would "inflate" B along the "10000" dimension and C along the "100" dimension, and then simply do B*C.
Now, what is the best (fastest) way to achieve this inflation?
A slow method illustrating the desired outcome:
B <- array(dim=c(10,100),rnorm(n=10*100)) # small array
A <- array(dim=c(10,100,10000)) # creating empty big array
A[] <- B # "inflate" B into A, creating 10,000 copies of B
Ideas?
Idea 1
Just tried this, about 3x faster:
A <- rep(B,10000)
dim(A) <- c(10,100,10000)
Still searching..
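One more sketch worth benchmarking (not from the original post): array() recycles its data argument, so the replication and the dim assignment collapse into a single call.
A <- array(B, dim = c(10, 100, 10000))  # recycling fills in all 10,000 copies of B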

Efficient, concise approach to convert array dimension to list (and back) in R

I convert between data formats a lot. I'm sure this is quite common. In particular, I switch between arrays and lists. I'm trying to figure out if I'm doing it right, or if I'm missing any schemas that would greatly improve quality of life. Below I'll give some examples of how to achieve desired results in a couple situations.
Begin with the following array:
dat <- array(1:60, c(5,4,3))
Then, convert one or more of the dimensions of that array to a list. For clarification and current approaches, see the following:
1 dimension, array to list
# Convert 1st dim
dat_list1 <- unlist(apply(dat, 1, list),F,F) # this is what I usually do
# Convert 1st dim, (alternative approach)
library(plyr) # I don't use this approach often b/c I try to go base if I can
dat_list1a <- alply(dat, 1) # points for being concise!
# minus points to alply for being slow (in this case)
> microbenchmark(unlist(apply(dat, 1, list), F, F), alply(dat, 1))
Unit: microseconds
                              expr      min       lq      mean    median       uq      max neval
 unlist(apply(dat, 1, list), F, F)   40.515   43.519   50.6531   50.4925   53.113   88.412   100
                     alply(dat, 1) 1479.418 1511.823 1684.5598 1595.4405 1842.693 2605.351   100
1 dimension, list to array
# Convert elements of list into new array dimension
# bonus points for converting to original array
dat_array1_0 <- simplify2array(dat_list1)  # dims come back as 4 x 3 x 5
aperm.key1 <- sapply(dim(dat), function(x) which(dim(dat_array1_0) == x))  # map each original dim to its new position
dat_array1 <- aperm(dat_array1_0, aperm.key1)  # restore the original 5 x 4 x 3 layout
In general, these are the tasks I'm trying to accomplish, although sometimes it's in multiple dimensions or the lists are nested, or some such other complication. So I'm asking if anyone has a "better" (concise, efficient) way of doing either of these things, but bonus points if a suggested approach can handle other related scenarios too.
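One sketch for the array-to-list direction, assuming R >= 3.6.0 (asplit() is not part of the original post): it splits an array along a margin without flattening the slices, and the return trip reuses the simplify2array()/aperm() idea from above.
dat_list1b <- asplit(dat, 1)                                  # list of five 4 x 3 matrices
dat_array1b <- aperm(simplify2array(dat_list1b), c(3, 1, 2))  # back to 5 x 4 x 3
identical(dim(dat_array1b), dim(dat))                         # TRUE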

Apply an R function over multiple arrays, returning an array of the same size

I have two arrays of 2x2 matrices, and I'd like to apply a function over each pair of 2x2 matrices. Here's a minimal example, multiplying each matrix in A by its corresponding matrix in B:
A <- array(1:20, c(5,2,2))
B <- array(1:20, c(5,2,2))
n <- nrow(A)
# Desired output: array with dimension 5x2x2 that contains
# the product of each pair of 2x2 matrices in A and B.
C <- aperm(sapply(1:n, function(i) A[i,,]%*%B[i,,], simplify="array"), c(3,1,2))
This takes two arrays, each with 5 2x2 matrices, and multiplies each pair of 2x2 matrices together, with the desired result in C.
My current code is this ugly last line, using sapply to loop through the first array dimension and pull out each 2x2 matrix separately from A and B. And then I need to permute the array dimensions with aperm() in order to have the same ordering as the original arrays (sapply(...,simplify="array") indexes each 2x2 matrix using the third dimension rather than the first one).
Is there a nicer way to do this? I hate that ugly function(i) in there, which is really just a way of faking a for loop. And the aperm() call makes this much less readable. What I have now works fine; I'm just searching for something that feels more like idiomatic R.
mapply() will take multiple lists or vectors, but it doesn't seem to work with arrays. aaply() from plyr is also close, but it doesn't take multiple inputs. The closest I've come is to use abind() with aaply() to pack A and B into one array and work with 2 matrices at once, but this doesn't quite work (it only gets the first two entries; somewhere my indexing is off):
aaply(.data=abind(A,B,along=0), 1, function(ab) ab[1,,]%*%ab[2,,])
And this isn't exactly cleaner or clearer anyway!
I've tried to make this a minimal example, but my real use case requires a more complicated function of the matrix pairs (and I'd also love to scale this up to more than two arrays), so I'm looking for something that will generalize and scale.
This is a working solution using abind and aaply:
D <- aaply(abind(A, B, along = 4), 1, function(x) x[,,1] %*% x[,,2])
Sometimes a for loop is the easiest to follow. It also generalizes and scales:
n <- nrow(A)
C <- A
for(i in 1:n) C[i,,] <- A[i,,] %*% B[i,,]
R's infrastructure for lists is much better (it seems) than for arrays, so I could also approach it by converting the arrays into lists of matrices like this:
A <- alply(A, 1, function(a) matrix(a, ncol=2, nrow=2))
B <- alply(B, 1, function(b) matrix(b, ncol=2, nrow=2))
mapply(function(a,b) a%*%b, A, B, SIMPLIFY=FALSE)
I think this is more straightforward than what I have above, but I'd still love to hear better ideas.
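For what it's worth, here is a sketch (not from the original answers) of how the list route might scale to three arrays, again assuming asplit() from R >= 3.6.0:
A <- array(1:20, c(5, 2, 2))  # recreate the arrays, since A and B were overwritten above
B <- array(1:20, c(5, 2, 2))
C <- array(1:20, c(5, 2, 2))
prods <- Map(function(a, b, c) a %*% b %*% c,
             asplit(A, 1), asplit(B, 1), asplit(C, 1))  # one 2 x 2 product per slice
D <- aperm(simplify2array(prods), c(3, 1, 2))           # back to 5 x 2 x 2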

Cross products of elements of 3D array and matrix columns without loop in R

I'm working on a fishery stock assessment model and want to speed it up by removing a loop (actually two loops of the same form).
I have an array, A, dim(A)=[L,L,Y], and a matrix, M, dim(M)=[L,Y].
These are used to make a matrix, mat, dim(mat)=[L,Y], by calculating matrix products. My loop looks like:
for (i in 1:Y) {
  mat[, i] <- (A[,,i] %*% M[, i])[, 1]
}
Can anyone help me out? I really need a speed gain.
Also, (don't know if it'll make a difference but) each A[,,i] matrix is lower triangular.
I'm pretty sure this will give you the results you want, but since there is no reproducible example, I can't be absolutely sure. I had to trace some of the linear algebra logic to see what you are trying to accomplish.
library(plyr)  # needed to split the array into a list of 9 matrices
B <- lapply(alply(A, 3), function(x) x %*% M)  # perform the 9 matrix multiplications
sapply(1:9, function(i) B[[i]][, i])           # extract the 9 columns you actually want
I used the following test data:
A <- array(rnorm(225), dim = c(5, 5, 9))
M <- matrix(rnorm(45), nrow = 5, ncol = 9)
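A fully vectorized sketch is also possible (not from the original answer; it assumes dim(A) = c(L, L, Y) and dim(M) = c(L, Y)): broadcast M across the rows of A and sum over the middle index, so that mat[l, i] = sum over k of A[l, k, i] * M[k, i].
L <- dim(A)[1]; Y <- dim(A)[3]
M_big <- aperm(array(M, c(L, Y, L)), c(3, 1, 2))  # M_big[l, k, i] == M[k, i]
mat <- colSums(aperm(A * M_big, c(2, 1, 3)))      # sum over k; result is L x Y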

Invert an apply with rbind

Let's say that I have an array, foo, in R with dimensions c(150, 40, 30).
Now, if I:
bar <- apply(foo, 3, rbind)
dim(bar) is now c(6000, 30).
What is the most elegant and generic way to invert this procedure and go from bar to foo so that they are identical?
The trouble isn't getting the dimensions right, but getting the data back in the same order within its respective original dimensions.
Thank you for taking the time; I look forward to your responses.
P.S. For those thinking that this is part of a larger problem, it is, and no, I cannot use plyr, quite yet.
I think you can just call array again and specify the original dimensions:
m <- array(1:210,dim = c(5,6,7))
m1 <- apply(m, 3, rbind)
m2 <- array(as.vector(m1),dim = c(5,6,7))
all.equal(m,m2)
[1] TRUE
I'm wondering about your initial transformation. You call rbind from apply, but that won't do anything - you could just as well have called identity!
foo <- array(seq(150*40*30), c(150, 40, 30))
bar <- apply(foo, 3, rbind)
bar2 <- apply(foo, 3, identity)
identical(bar, bar2) # TRUE
So, what is it you really wanted to accomplish? I was under the assumption that you had a number of matrix slices (30 of them) and wanted to stack them and then unstack them again. If so, the code would be more involved than what #joran suggested: you need some calls to aperm (as #Patrick Burns suggested):
# Make a sample 3 dimensional array (two 4x3 matrix slices):
m <- array(1:24, 4:2)
# Stack the matrix slices on top of each other
m2 <- matrix(aperm(m, c(1,3,2)), ncol=ncol(m))
# Reverse the process
m3 <- aperm(array(m2, c(nrow(m),dim(m)[[3]],ncol(m))), c(1,3,2))
identical(m3,m) # TRUE
In any case, aperm is really powerful (and somewhat confusing). Well worth learning...
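As a tiny illustration of that point (not from the original answer): aperm() is the n-dimensional generalization of t(), so the 2D case is an ordinary transpose and higher-order permutations follow the same indexing rule.
x <- array(1:24, c(2, 3, 4))
y <- aperm(x, c(3, 1, 2))  # dim(y) is c(4, 2, 3) and y[k, i, j] == x[i, j, k]
identical(aperm(matrix(1:6, 2, 3), c(2, 1)), t(matrix(1:6, 2, 3)))  # TRUE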

Resources