R - Indexing Array using Array - arrays

Let's assume I have an array x with dim(x) equal to c(3,3,3). I also have a df or matrix with two columns containing the index combinations that I need.
When I pass x[df[[1]],df[[2]],] I get a VERY large array, which I then need to go through and take the diagonal of using the apply function. This is very memory- and time-inefficient. Is there some sort of shortcut (without using for loops) to index an array so that it returns the vector of values that the df asks for?
Trivial Example:
`a <- array(1:27,dim = c(3,3,3))
df <- data.frame(c(1,2,2,1,3,2),c(2,3,2,1,3,2))`
In this example, I would want to pass something like "a[df[[1]],df[[2]],]"
and get something like this (or transposed):
. [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 4 8 5 1 9 5
[2,] 13 17 14 10 18 14
[3,] 22 26 23 19 27 23
When I pass that expression now, I get a 3-d array of dim = c(6,6,3) as opposed to the more helpful dim = c(6,3). I can easily take apply(result, 3, diag) to get what I want, but when the df has many more than 6 rows it takes up a lot of space (around 750GB; R throws warnings and errors and stops before execution even begins).

This works
temp <- array(1:27, dim=c(3,3,3))
df <- data.frame(a=c(1,2,3), b=c(1,2,3), c=c(1,2,3))
temp[cbind(df[[1]], df[[2]], df[[3]])]
[1] 1 14 27
This is sometimes referred to as matrix indexing.
To query by two of the dimensions and leave the third one open, you might just use regular matrix subsetting. For example, to select the first and second rows and the second column for each of the "z" dimension matrices, you could use something like temp[1:2, 2,], or from your dataset:
temp[1:2, 2,]
[,1] [,2] [,3]
[1,] 4 13 22
[2,] 5 14 23
temp[df[[1]][1:2], df[[2]][2], ]
[,1] [,2] [,3]
[1,] 4 13 22
[2,] 5 14 23
Which are of course identical.
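For the original question (two index columns, third dimension left open), one possible sketch is to expand the two columns into a full three-column index matrix and apply matrix indexing once. The column names i and j and the helper variables n, k, idx and res are made up for illustration; this is a sketch under those assumptions rather than the only way to do it.

```r
a <- array(1:27, dim = c(3, 3, 3))
df <- data.frame(i = c(1, 2, 2, 1, 3, 2), j = c(2, 3, 2, 1, 3, 2))

n <- nrow(df)    # number of (row, column) pairs requested
k <- dim(a)[3]   # length of the dimension that is left open

# One index row per (pair, sheet) combination: the pairs are repeated
# once for every sheet, and each sheet number is repeated n times.
idx <- cbind(rep(df$i, times = k),
             rep(df$j, times = k),
             rep(seq_len(k), each = n))

# a[idx] is a plain vector of length n * k; reshape it to n x k.
res <- matrix(a[idx], nrow = n, ncol = k)

# t(res) has the 3 x 6 layout shown in the question.
t(res)
```

The index matrix has n * k rows, so memory grows linearly with the number of pairs instead of quadratically as with a[df[[1]], df[[2]], ].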

Related

R subsetting and assigning in a multidimensional array

I am working in R with a 3D array. I am trying to use it like a set of 2D matrices for different time instants.
I have found a behavior that I really don't understand, and I would like to know why it is happening. I have tried to find an explanation here and in other places, but I still have doubts.
I have my 3D array like this:
array3D=array(1:45,c(5,3,3))
And, as I expected, I can access an individual 2D matrix:
array3D[1,,]
[,1] [,2] [,3]
[1,] 1 16 31
[2,] 6 21 36
[3,] 11 26 41
However, when I try to access two 2D matrices I don't get what I expect:
array3D[1:2,,]
, , 1
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
, , 2
[,1] [,2] [,3]
[1,] 16 21 26
[2,] 17 22 27
, , 3
[,1] [,2] [,3]
[1,] 31 36 41
[2,] 32 37 42
I have found that I can solve this using aperm(array3D[1:2,,]), but I don't understand what it is doing.
The other problem comes up when I try to do an assignment; I don't understand why this doesn't work:
array3D[1:2,,]=matrix(9:1,3,3)
array3D[1,,]
[,1] [,2] [,3]
[1,] 9 3 6
[2,] 7 1 4
[3,] 5 8 2
I think that I can solve this with a loop or maybe with aaply as I read here, but I think that if I want to work with 3D arrays it is really important to understand what is happening. If someone can point me in the right direction I will be really happy.
I have tried to find the answer here and by reading http://adv-r.had.co.nz/ but so far no luck.
Update
I have found that everything works if instead of using the first index I use the last one, but I still don't understand why.
Is it something inherent to R?
Is it possible to use the first one in some other way?
array3D=array(1:45,c(3,3,5))
array3D[,,1:2]=matrix(9:1,3,3)
array3D[,,2]
[,1] [,2] [,3]
[1,] 9 6 3
[2,] 8 5 2
[3,] 7 4 1
I think it's not quite clear what you want to achieve, but here are some examples:
On your first point, you can select two of the three three-by-three matrices in the z-direction by doing:
array3D[,,1:2]
And accordingly, you can replace with an array of appropriate size:
array3D[,,1:2] <- array(18:1,c(3,3,2))
About your question on why you have to use the third index: think about it like the z-direction in a 3D coordinate system. The rows would be the x-direction (vertical) and the columns the y-direction (horizontal). When indexing array3D[1:2,,] you selected the first two rows, while keeping everything in the y and z directions.
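The assignment in the question gives a scrambled result because the nine values of the matrix are recycled over the 18 selected cells in column-major order, with the first index varying fastest. If the first index really has to be the "time" index, a possible sketch is to build a replacement value whose shape matches the selection. The names m and value below are illustrative only.

```r
array3D <- array(1:45, c(5, 3, 3))
m <- matrix(9:1, 3, 3)

# array(m, c(3, 3, 2)) stacks two copies of m along the third dimension.
# aperm(..., c(3, 1, 2)) moves that stacking dimension to the front, so
# value has dim c(2, 3, 3) and value[i, , ] equals m for i = 1, 2.
value <- aperm(array(m, c(3, 3, 2)), c(3, 1, 2))

# The value now has the same shape as the selection, so nothing is
# recycled in an unexpected order.
array3D[1:2, , ] <- value

array3D[1, , ]  # same layout as m
```

A plain loop (for (i in 1:2) array3D[i, , ] <- m) gives the same result and may be easier to read when only a few slices are involved.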

Write and read 3D arrays in R

I have a 3D array in R constructed as (although the names don’t seem to show up):
v.arr <- array(1:18, c(2,3,3), dimnames = c("A", "B", "X",
"Y","Z","P","Q","R"))
and it shows up like this when printed to the screen:
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
, , 3
[,1] [,2] [,3]
[1,] 13 15 17
[2,] 14 16 18
I write it out to a file using:
write.table(v.arr, file = "Test Data")
I then read it back in with:
test.data <- read.table("Test Data")
and I get this:
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 1 3 5 7 9 11 13 15 17
2 2 4 6 8 10 12 14 16 18
Obviously, I need to do something to either structure the file before writing or restructure it on the read-back to get back the 3D array. I can always restructure the data that I get from reading. Is that the best approach? Thanks in advance.
Your issue is that you are using write.table to do this, so it is (I believe) coercing your array to a data frame, which flattens it to two dimensions. If you are looking to save it and don't mind that it would be in an R-specific format, you can easily use the save and load functions.
save(v.arr,file = "~/Desktop/v.arr.RData")
rm(list=ls())
load("~/Desktop/v.arr.RData")
v.arr
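Two related options, sketched here with temporary files so nothing is written to a fixed location: saveRDS/readRDS keep the array intact and let you choose the name on read-back, and, if a text file is required, the flattened table can be reshaped with array() as long as the original dimensions are known. The variable names rds_path, txt_path, restored, flat and rebuilt are illustrative.

```r
v.arr <- array(1:18, c(2, 3, 3))

# Option 1: R-specific binary format, one object per file.
rds_path <- tempfile(fileext = ".rds")
saveRDS(v.arr, rds_path)
restored <- readRDS(rds_path)
dim(restored)  # 2 3 3

# Option 2: plain text. write.table flattens the array to 2 x 9, so the
# dimensions have to be supplied again when reading back.
txt_path <- tempfile(fileext = ".txt")
write.table(v.arr, file = txt_path)
flat <- as.matrix(read.table(txt_path))
rebuilt <- array(flat, dim = c(2, 3, 3))
all(rebuilt == v.arr)  # TRUE
```

The text route relies on the flattened columns still being in column-major order, which is the case for the 2 x 9 layout shown in the question.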

How do I smartly subset an array in R with dynamic dimensions?

I'm crafting a simulation model, and I think this problem has an easy fix, but I'm just not that used to working with arrays. Let's say I have an array, Array1, that has 3 dimensions. The first two are of constant and equal length, L, but the third dimension can be of length from 1 to X at any given time.
I want to be able to periodically subset Array1 to create a second array, Array2, that is composed of up to the last Y "sheets" of Array1. In other words, if the length of the third dimension of Array1 is greater than Y, then I want just the last Y sheets of Array1 but, if it's less than Y, I want all sheets of Array1.
I know that I can crudely pull this off using the tail function and a little finagling:
tmp1 = tail(Array1, (L*L*Y))
Array2 = array(tmp1, dim = c(L, L, length(tmp1)/(L*L)))
But it seems like there could be a more elegant way of doing this. Is there an equivalent of tail for arrays in R? Or is there a way that Array2 could be produced via simple logical indexing of Array1? Or perhaps the apply function could be used somehow?
Were you after something like this?
a <- array(1:(3*4*5), dim=c(3,4,5))
x <- dim(a)[3]
y <- 2
a[, , seq(to=x, len=min(x, y))]
, , 1
[,1] [,2] [,3] [,4]
[1,] 37 40 43 46
[2,] 38 41 44 47
[3,] 39 42 45 48
, , 2
[,1] [,2] [,3] [,4]
[1,] 49 52 55 58
[2,] 50 53 56 59
[3,] 51 54 57 60
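The same idea can be wrapped in a small helper. The function name last_sheets is hypothetical; the sketch adds drop = FALSE so that the result stays three-dimensional even when only one sheet is kept.

```r
# Return up to the last y sheets (third-dimension slices) of a 3D array.
last_sheets <- function(arr, y) {
  n <- dim(arr)[3]
  keep <- seq(to = n, length.out = min(n, y))
  # drop = FALSE keeps all three dimensions even if length(keep) == 1.
  arr[, , keep, drop = FALSE]
}

a <- array(1:60, dim = c(3, 4, 5))
dim(last_sheets(a, 2))   # 3 4 2
dim(last_sheets(a, 10))  # 3 4 5 (fewer than 10 sheets available)
```

This sketch assumes y is at least 1.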

R filling array by rows

I would like to do some matrix operations and it would be best to utilize 3 (or higher) dimensional arrays. If I want to fill matrices by row there is an argument (byrow = TRUE) however there is no such option for creating/filling a multidimensional array. The only way I've been able to accomplish it is by using aperm to transpose an array that was filled by column. For example:
arr.1 <- array(1:12, c(3,2,2))
arr.1
arr.2 <- aperm(arr.1, c(2,1,3))
arr.2
produces the correct result, a dimension 2,3,2 array that is filled by row. It seems a bit counterintuitive to have to work backward from a Column x Row x Range array to get to a Row x Column x Range array. Is this a bad habit from previous f77 coding, or have I overlooked something simple?
Arrays in R are filled in column-major order: the first dimension is traversed first, then the second dimension, and then the third dimension if there is one.
In case of a matrix:
array(c(1,2,3), dim = c(3,3))
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
Or with assignment:
M <- array(dim = c(3,3))
M[,] <- c(1,2,3)
M
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
Assigning to the second dimension is easy:
M <- array(dim = c(3,3))
M[,2:3] <- c(1,2,3)
M
[,1] [,2] [,3]
[1,] NA 1 1
[2,] NA 2 2
[3,] NA 3 3
But assigning to the first dimension is trickier. The following doesn't give the expected result:
M <- array(dim = c(3,3))
M[2:3,] <- c(1,2,3)
M
[,1] [,2] [,3]
[1,] NA NA NA
[2,] 1 3 2
[3,] 2 1 3
Data is filled by first traversing the first dimension, then second. What we want is to first traverse the second dimension, then first. So we have to aperm the array (or transpose in case of matrix).
M <- array(dim = c(3,3))
Mt <- aperm(M)
Mt[,2:3] <- c(1,2,3)
M <- aperm(Mt)
M
[,1] [,2] [,3]
[1,] NA NA NA
[2,] 1 2 3
[3,] 1 2 3
There may be more elegant ways to do the last step that I am not aware of.
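One alternative to the aperm round trip, offered as a sketch: repeat each value once per selected row, so that the column-major fill puts the same value in every row of a column.

```r
M <- array(dim = c(3, 3))

# The selection M[2:3, ] is filled column by column, two cells at a time,
# so repeating each value twice gives rows 2 and 3 the values 1, 2, 3.
M[2:3, ] <- rep(c(1, 2, 3), each = 2)
M
```

Row 1 stays NA, and rows 2 and 3 both hold 1 2 3, matching the result of the aperm approach above.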
If you want to interpret a vector of values as an n-dimensional array in "row-major order" (as in C, or more precisely, last dimension varying fastest), you can use function array2() from package listarrays. This is the logical extension of matrix(..., byrow = TRUE) to multiple dimensions.
Compare base R
> (a <- array(1:24, c(2,3,4)))
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
, , 3
[,1] [,2] [,3]
[1,] 13 15 17
[2,] 14 16 18
, , 4
[,1] [,2] [,3]
[1,] 19 21 23
[2,] 20 22 24
with
> library(listarrays)
> (a <- array2(1:24, c(2,3,4)))
, , 1
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 13 17 21
, , 2
[,1] [,2] [,3]
[1,] 2 6 10
[2,] 14 18 22
, , 3
[,1] [,2] [,3]
[1,] 3 7 11
[2,] 15 19 23
, , 4
[,1] [,2] [,3]
[1,] 4 8 12
[2,] 16 20 24
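If adding a package is not an option, the same row-major fill can be sketched in base R by filling an array with the dimensions reversed and then reversing the dimension order with aperm. The helper name array_byrow is made up for this example.

```r
# Fill an array so that the LAST dimension varies fastest (row-major).
array_byrow <- function(x, dims) {
  # Fill column-major into the reversed shape, then reverse the order of
  # the dimensions; aperm() without a permutation does exactly that.
  aperm(array(x, dim = rev(dims)))
}

a <- array_byrow(1:24, c(2, 3, 4))
a[1, 1, ]  # 1 2 3 4
a[, , 1]   # rows: 1 5 9 and 13 17 21
```

For two dimensions this reduces to matrix(x, nrow, ncol, byrow = TRUE).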
My recommendation would be to "teach" yourself the default order by running
foo <- array(1:60, c(3,4,5))
Then you can fill an arbitrary array either by rearranging your source vector or by creating matrices and loading them into the z-layers of the array in the order desired.

Extract the anti-diagonals from an array

I want to extract the anti-diagonals of an array
m=array(1:18,c(3,3,2))
My best shot
k=dim(m)[3]
mn=matrix(nrow = k, ncol = 3)
for (i in 1:k){
mn=diag(m[,,i][3:1,1:3])
}
This returns 12 14 16, the anti-diagonal of the second matrix in the array. I want to achieve this
[1] 3 5 7
[2] 12 14 16
I want the “anti-diags” as arrays
Manually diag(m[,,1][3:1,1:3]) and diag(m[,,2][3:1,1:3]) works fine, but the array I’m working with is dim(c(3,3,22)), so I thought "loop!"
MQ: How to extract the anti-diagonals from an array using the loop? (better and elegant solutions are more than welcome)
This should work:
mn <- array(NA, dim=dim(m))
for (i in 1:dim(m)[3]){
mn[,,i]=diag(m[,,i][cbind(3:1,1:3)])
}
It was unclear whether you want the "anti-diag" to become the new diag, but that is what your code suggested as the intent. The form matrix[cbind(vec1,vec2)] pulls the (R,C) referenced elements from the matrix.
If you do not want them as arrays then this is an alternate result:
mn <- array(NA, dim=c(2,3))
for (i in 1:dim(m)[3]){
mn[i,]=m[,,i][cbind(3:1,1:3)]
}
mn
[,1] [,2] [,3]
[1,] 3 5 7
[2,] 12 14 16
This is a loopless way of getting the same values:
m[cbind( rep(3:1,2), rep(1:3,2), rep(1:2,each=3)) ]
[1] 3 5 7 12 14 16
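The loopless form can be generalised to any n x n x k array, for example the 3 x 3 x 22 case mentioned in the question. The function name anti_diags is hypothetical, and the sketch assumes the first two dimensions are equal.

```r
# Anti-diagonal of every sheet of an n x n x k array, one row per sheet.
anti_diags <- function(m) {
  n <- dim(m)[1]
  k <- dim(m)[3]
  # Rows run n..1 while columns run 1..n, repeated for each sheet.
  idx <- cbind(rep(n:1, times = k),
               rep(seq_len(n), times = k),
               rep(seq_len(k), each = n))
  matrix(m[idx], nrow = k, byrow = TRUE)
}

m <- array(1:18, c(3, 3, 2))
anti_diags(m)
```

For the example array this returns a 2 x 3 matrix with rows 3 5 7 and 12 14 16.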
You could use lapply across the third dimension and extract the anti-diagonal by first rotating the matrix (see this great answer), that is, reversing each column and transposing, and then taking the diagonal of that. Basically like this...
out <- lapply( 1:dim(m)[3] , function(x) diag( t( apply( m[,,x] , 2 , rev ) ) ) )
[[1]]
[1] 3 5 7
[[2]]
[1] 12 14 16
If you need them glued together as an array then use do.call...
do.call( rbind , out )
[,1] [,2] [,3]
[1,] 3 5 7
[2,] 12 14 16
In this particular case, a for loop will be much quicker (benchmark it) and you should use @DWin's answer.
It occurs to me that we can simplify this a bit and avoid using lists and a bad use of lapply (one that assumes m is available outside the scope of lapply), because we can also simply apply across the third dimension of your matrices. So we can apply once to rotate the matrices, then take the diag of each rotated matrix like so...
rotM <- apply( m , 2:3 , rev )
out <- t( apply( rotM , 3 , diag ) )
[,1] [,2] [,3]
[1,] 3 5 7
[2,] 12 14 16
