Let A and B be arrays, of dimension [3,4,5] and [4,5], respectively.
E.g.,
A <- array(100,c(3, 4,5))
B <- array(80, c(4,5))
My desired answer is an array C of dimension [3,4,5] such that
C[i,j,k] = A[i,j,k] - B[j,k]
for all i,j,k
Edit: Which answer is the fastest?
To compare the three answers, I timed each of them with the code shown further down in this edit.
The result is as follows:
> mb
Unit: microseconds
expr min lq mean median uq max neval
f1 28.4 33.00 37.329 34.75 37.00 213.5 100
f2 32.5 37.65 40.069 38.95 40.55 103.0 100
f3 33.8 40.25 42.397 41.65 43.30 64.5 100
Thus f1 is the fastest, so I accept the answer of @user10488504 as the answer to this question.
Thank you to all three: @user10488504, @Stéphane Laurent and @Lyngbakr. I will use the code you suggested in my package. It helps me very much.
The code used to measure the running times:
f1 <- function(){
  A <- array(1:100, c(3, 4, 5))
  B <- array(1:80, c(4, 5))
  C <- array(aperm(sapply(1:dim(A)[1], function(i) A[i,,] - B)), dim(A))
}
f2 <- function(){
  A <- array(1:100, c(3, 4, 5))
  B <- array(1:80, c(4, 5))
  sweep(A, c(2, 3), B)
}
f3 <- function(){
  A <- array(1:100, c(3, 4, 5))
  B <- array(1:80, c(4, 5))
  # Perform calculation
  res <- array(t(apply(A, MARGIN = 1, function(x) x - B)), c(3, 4, 5))
}
library(microbenchmark)
library(ggplot2)
mb <- microbenchmark(
f1 = f1(),
f2 = f2(),
f3 = f3()
)
mb
autoplot(mb)
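Before reading too much into the timings, it may be worth confirming that the three candidates return the same array. The following is only a minimal sketch and not part of the original answers: the triple loop and the names C_loop, C_sapply, C_sweep and C_apply are introduced here for illustration, and the inputs use 1:60 and 1:20 so that the data lengths match the array sizes exactly.

```r
A <- array(1:60, c(3, 4, 5))
B <- array(1:20, c(4, 5))

# Reference: the definition C[i,j,k] = A[i,j,k] - B[j,k], written as explicit loops
C_loop <- array(NA_integer_, dim(A))
for (i in seq_len(dim(A)[1])) {
  for (j in seq_len(dim(A)[2])) {
    for (k in seq_len(dim(A)[3])) {
      C_loop[i, j, k] <- A[i, j, k] - B[j, k]
    }
  }
}

# The three benchmarked approaches
C_sapply <- array(aperm(sapply(seq_len(dim(A)[1]), function(i) A[i, , ] - B)), dim(A))
C_sweep  <- sweep(A, c(2, 3), B)
C_apply  <- array(t(apply(A, MARGIN = 1, function(x) x - B)), dim(A))

# TRUE for every approach that agrees with the loop reference
same <- c(
  sapply = isTRUE(all.equal(C_loop, C_sapply)),
  sweep  = isTRUE(all.equal(C_loop, C_sweep)),
  apply  = isTRUE(all.equal(C_loop, C_apply))
)
print(same)
```

With arrays this small the measured differences are a few microseconds, so a check like this says more about correctness than about which approach to prefer.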
With sweep (its default FUN is "-", so B is subtracted from every slice A[i, , ]):
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
> sweep(A, c(2,3), B)
, , 1
[,1] [,2] [,3] [,4]
[1,] 0 2 4 6
[2,] 1 3 5 7
[3,] 2 4 6 8
, , 2
[,1] [,2] [,3] [,4]
[1,] 8 10 12 14
[2,] 9 11 13 15
[3,] 10 12 14 16
, , 3
[,1] [,2] [,3] [,4]
[1,] 16 18 20 22
[2,] 17 19 21 23
[3,] 18 20 22 24
, , 4
[,1] [,2] [,3] [,4]
[1,] 24 26 28 30
[2,] 25 27 29 31
[3,] 26 28 30 32
, , 5
[,1] [,2] [,3] [,4]
[1,] 32 34 36 38
[2,] 33 35 37 39
[3,] 34 36 38 40
You can do this with sapply and aperm:
C <- array(aperm(sapply(1:dim(A)[1], function(i) A[i,,] - B)), dim(A))
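To see why the one-liner above gives the right layout, it can help to look at the intermediate shapes. This is a sketch under the same example dimensions; the name S is introduced here only for illustration.

```r
A <- array(1:60, c(3, 4, 5))
B <- array(1:20, c(4, 5))

# Each A[i, , ] - B is a 4 x 5 matrix; sapply() flattens it into one column of length 20
S <- sapply(seq_len(dim(A)[1]), function(i) A[i, , ] - B)
dim(S)         # 20 3

# For a matrix, aperm() with the default permutation is a transpose,
# so row i now holds the flattened slice for index i
dim(aperm(S))  # 3 20

# Refilling in column-major order with dim(A) puts index i back in the first dimension
C <- array(aperm(S), dim(A))
```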
Here's an attempt that uses apply.
# Define arrays
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
# Perform calculation
res <- array(t(apply(A, MARGIN = 1, function(x)x-B)), c(3, 4, 5))
# Check result
res[1,,]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 8 16 24 32
#> [2,] 2 10 18 26 34
#> [3,] 4 12 20 28 36
#> [4,] 6 14 22 30 38
A[1,,] - B
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 8 16 24 32
#> [2,] 2 10 18 26 34
#> [3,] 4 12 20 28 36
#> [4,] 6 14 22 30 38
Created on 2019-06-19 by the reprex package (v0.3.0)
Related
I have the array A
A
,,A
[,1] [,2] [,3]
[1,] 3 7 8
[2,] 4 11 9
[3,] 2 12 4.3
,,B
[,1] [,2] [,3]
[1,] 31 7 8
[2,] 4.2 4 9.5
[3,] 1 1 7
,,C
[,1] [,2] [,3]
[1,] 4 71 8.3
[2,] 4 41 9
[3,] 11 0 73
,,D
[,1] [,2] [,3]
[1,] 7 7 8.3
[2,] 3 4.1 9
[3,] 1 0.5 73
dim(A)
3 3 4
dimnames(A)[3]
A B C D
and I have the data.frame df
df
X Y Z
2 1 A
3 2 D
I would like to add a new column to df holding the values of array A selected by the index values in df: X (the row of the array), Y (the column of the array) and Z (the name along the third dimension). My expected result is:
df
X Y Z Res
2 1 A 4 # Res is the value of array A in A[2,1,"A"]
3 2 D 0.5 # Res is the value of array A in A[3,2,"D"]
I tried this code:
df$Res <- NA
if (df$Z == dimnames(A)[3]){
for (i in 1:nrow(df)){
df[i,4] <- A[df[i,1],df[i,2],df[i,3]]
}
}
But it doesn't work well...
Any idea how to match the dimnames of the array's third dimension with the index values in the data frame?
P.S. This is a simple example. My true array is:
dim(A)
137 93 227
and
dim(df)
6080 3
P.S.2 I would prefer not to use merge or similar code for this allocation problem.
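One possible approach, sketched here without a loop or merge, is matrix indexing: an array can be subset with a numeric matrix that has one column per dimension, so each row of that matrix picks one element. The sketch below rebuilds the small example from the question; the helper names vals and k are introduced here for illustration, and match() is used to turn the labels in Z into positions along the third dimension.

```r
# Values entered column by column, one line per slice
vals <- c(3, 4, 2, 7, 11, 12, 8, 9, 4.3,      # slice "A"
          31, 4.2, 1, 7, 4, 1, 8, 9.5, 7,     # slice "B"
          4, 4, 11, 71, 41, 0, 8.3, 9, 73,    # slice "C"
          7, 3, 1, 7, 4.1, 0.5, 8.3, 9, 73)   # slice "D"
A <- array(vals, dim = c(3, 3, 4),
           dimnames = list(NULL, NULL, c("A", "B", "C", "D")))
df <- data.frame(X = c(2, 3), Y = c(1, 2), Z = c("A", "D"))

# Position of each label along the third dimension (NA if a label is missing)
k <- match(as.character(df$Z), dimnames(A)[[3]])

# One row of the index matrix per row of df: (row, column, layer)
df$Res <- A[cbind(df$X, df$Y, k)]
df
```

Because the lookup is a single vectorised subset, the same lines should carry over to the larger array and data frame mentioned in the question, as long as the Z values match the third-dimension names.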
For each of my N variables, I have a (T * M) feature matrix, i.e., M observations for each t = 1, ..., T. The problem is how to convert these into a (T * N * M) array. In the following example N=2, T=3, M=4:
x1 <- matrix(1:12, 3, 4)
> x1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
x2 <- matrix(25:36, 3, 4)
x2
[,1] [,2] [,3] [,4]
[1,] 25 28 31 34
[2,] 26 29 32 35
[3,] 27 30 33 36
I need to build a 3-dimensional array indexed by the row number, such that the first element is
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 25 28 31 34
and the second is:
[,1] [,2] [,3] [,4]
[1,] 2 5 8 11
[2,] 26 29 32 35
and third:
[,1] [,2] [,3] [,4]
[1,] 3 6 9 12
[2,] 27 30 33 36
and so on. For this example, the output's dimensions should be (3,2,4).
I need to do this for relatively large N and T, so I would appreciate implementations that scale!
Here is a base R option.
out <- `dim<-`(rbind(c(t(x1)), c(t(x2))), c(2, 4, 3))
out
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 1 4 7 10
#[2,] 25 28 31 34
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 2 5 8 11
#[2,] 26 29 32 35
#
#, , 3
#
# [,1] [,2] [,3] [,4]
#[1,] 3 6 9 12
#[2,] 27 30 33 36
When we call x <- rbind(c(t(x1)), c(t(x2))) we get the following matrix as a result
x
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,] 1 4 7 10 2 5 8 11 3 6 9 12
#[2,] 25 28 31 34 26 29 32 35 27 30 33 36
We then need to change the dimensions of this object, which we can do with
dim(x) <- c(2, 4, 3)
Another way to get the same result is to call the replacement function of dim in its functional form, i.e.
`dim<-`(...)
which allows us to do everything in one line.
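The answer above returns an array of dimension (2, 4, 3), with the row index of x1 and x2 in the last position, whereas the question asks for (3, 2, 4). If that exact layout is needed, one possible sketch is to stack the matrices and permute the dimensions with aperm. The names xs, stacked and arr are introduced here for illustration, and the list xs stands in for however many matrices there are.

```r
x1 <- matrix(1:12, 3, 4)
x2 <- matrix(25:36, 3, 4)
xs <- list(x1, x2)   # N matrices, each T x M

# Stack into a T x M x N array, then swap the last two dimensions to get T x N x M
stacked <- array(unlist(xs), dim = c(dim(xs[[1]]), length(xs)))
arr <- aperm(stacked, c(1, 3, 2))

dim(arr)     # 3 2 4
arr[1, , ]   # first row of x1 on top of first row of x2
```

Here arr[t, , ] plays the role of out[, , t] in the answer above.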
Let's assume I have a 3d array of dimensions (x,y,z) and would like to restructure my data as a matrix of dimensions (x*y,z), something like:
my_array <- array(1:600, dim=c(10,5,12))
my_matrix<-data.frame()
for (j in 1:5) {
for (i in 1:10) {
my_matrix <- rbind (my_matrix, my_array[i,j,1:12])
}
}
Could you suggest a faster and more elegant way?
thanks
Change the dimension of the array:
dim(my_array) <- c(10 * 5 , 12)
We can convert to a matrix by calling matrix() and specifying the dimensions:
res <- matrix(my_array, prod(dim(my_array)[1:2]), dim(my_array)[3])
all.equal(as.matrix(my_matrix), res, check.attributes=FALSE)
#[1] TRUE
NOTE: This will not change the original `my_array`. In fact, the code can be simplified to
matrix(my_array, 10*5, 12)
which is also more compact:
nchar("matrix(my_array, 10*5, 12)")
#[1] 26
nchar("dim(my_array) <- c(10 * 5 , 12)")
#[1] 31
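The practical difference between the two answers is whether the original object is modified. The sketch below illustrates this and also checks which array element ends up in which row; the names res, flat, i and j are used here only for illustration.

```r
my_array <- array(1:600, dim = c(10, 5, 12))

# matrix() returns a new object; my_array keeps its three dimensions
res <- matrix(my_array, prod(dim(my_array)[1:2]), dim(my_array)[3])
dim(res)        # 50 12
dim(my_array)   # 10  5 12

# Row (j - 1) * 10 + i of res holds my_array[i, j, ]
i <- 3; j <- 4
all(res[(j - 1) * 10 + i, ] == my_array[i, j, ])   # TRUE

# Assigning to dim() reshapes the object itself, so work on a copy
# if the 3-d version is still needed afterwards
flat <- my_array
dim(flat) <- c(10 * 5, 12)
all(flat == res)   # TRUE
```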
Both @akrun's and @Lars Arne Jordanger's solutions work and generate the same result.
Treating each slice along the first dimension as one matrix, the two solutions work by:
(1) taking the first row of every matrix and stacking these rows at the top of the combined matrix; and then
(2) taking the second row of every matrix and stacking these rows below the first group, and so on.
The following example illustrates the idea:
> threeDimArray <- array( NA, dim=c(3,3,4) )
> dims <- dim( threeDimArray )
>
> constants <- c(1, 10, 100)
> for( id in 1:length(constants) ){
const <- constants[id]
threeDimArray[id,,] <- matrix( (1:prod(dims[2:3]))*const, dims[2], dims[3] )
}
> threeDimArray[1,,]
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
> threeDimArray[2,,]
[,1] [,2] [,3] [,4]
[1,] 10 40 70 100
[2,] 20 50 80 110
[3,] 30 60 90 120
> threeDimArray[3,,]
[,1] [,2] [,3] [,4]
[1,] 100 400 700 1000
[2,] 200 500 800 1100
[3,] 300 600 900 1200
> # solution 1:
> twoDimMat <- matrix(threeDimArray, prod(dims[1:2]), dims[3])
> twoDimMat
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 10 40 70 100
[3,] 100 400 700 1000
[4,] 2 5 8 11
[5,] 20 50 80 110
[6,] 200 500 800 1100
[7,] 3 6 9 12
[8,] 30 60 90 120
[9,] 300 600 900 1200
>
> # solution 2:
> threeDArray <- threeDimArray
> dim(threeDArray) <- c(prod( dims[1:2] ), dims[3])
> threeDArray
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 10 40 70 100
[3,] 100 400 700 1000
[4,] 2 5 8 11
[5,] 20 50 80 110
[6,] 200 500 800 1100
[7,] 3 6 9 12
[8,] 30 60 90 120
[9,] 300 600 900 1200
>
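If the rows should instead be grouped by matrix (all rows of the first matrix, then all rows of the second, and so on), one possible sketch is to swap the first two dimensions with aperm before reshaping. This is an addition to the two solutions above, not part of them, and the name byMatrix is introduced here for illustration.

```r
threeDimArray <- array(NA, dim = c(3, 3, 4))
dims <- dim(threeDimArray)
constants <- c(1, 10, 100)
for (id in seq_along(constants)) {
  threeDimArray[id, , ] <- matrix(seq_len(prod(dims[2:3])) * constants[id],
                                  dims[2], dims[3])
}

# After the swap the old second index varies fastest, so the rows of
# threeDimArray[1, , ] come first, then those of threeDimArray[2, , ], and so on
byMatrix <- matrix(aperm(threeDimArray, c(2, 1, 3)), prod(dims[1:2]), dims[3])
byMatrix[1:3, ]   # same as threeDimArray[1, , ]
```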
I've got an array like this:
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18
, , 3
[,1] [,2] [,3]
[1,] 19 22 25
[2,] 20 23 26
[3,] 21 24 27
I would like to add a column filled with 0 to each component, to end up with this:
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 0
[2,] 2 5 8 0
[3,] 3 6 9 0
, , 2
[,1] [,2] [,3] [,4]
[1,] 10 13 16 0
[2,] 11 14 17 0
[3,] 12 15 18 0
, , 3
[,1] [,2] [,3] [,4]
[1,] 19 22 25 0
[2,] 20 23 26 0
[3,] 21 24 27 0
Is there a simple way to do this using R?
Here is a way:
library(abind)
abind(x, array(0, replace(dim(x), 2, 1)), along = 2)
And another one:
aperm(apply(x, c(1, 3), c, 0), c(2, 1, 3))
You could also try something like the following (it's a bit manual, but should be faster than the other base R solution). Here df is the array itself, as defined in the benchmarks below:
indx <- dim(df) + c(0, 1, 0)
array(sapply(1:indx[3], function(x) cbind(df[,,x], 0)), indx)
Some benchmarks
n <- 1e5
df <- array(1:27, c(3, 3, n))
library(abind)
library(microbenchmark)
flodel1 <- function(x) abind(x, array(0, replace(dim(x), 2, 1)), along = 2)
flodel2 <- function(x) aperm(apply(x, c(1, 3), c, 0), c(2, 1, 3))
David <- function(x) {indx <- dim(x) + c(0, 1, 0) ; array(sapply(seq_len(indx[3]), function(y) cbind(x[,,y], 0)), indx)}
Res <- microbenchmark(flodel1(df),
flodel2(df),
David(df))
# Unit: milliseconds
# expr min lq mean median uq max neval
# flodel1(df) 45.8943 65.37496 90.68902 90.24751 107.5991 159.9881 100
# flodel2(df) 553.4831 634.73127 673.95636 679.79709 710.0540 808.6248 100
# David(df) 434.9524 531.85597 576.77011 555.46865 626.3344 757.9358 100
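Another base R possibility, which was not part of the benchmark above and whose speed is therefore untested here, is to preallocate the enlarged array with the fill value and copy the original into its first columns. The sketch uses the 3 x 3 x 3 array from the question; the names out and ref are introduced for illustration.

```r
x <- array(1:27, c(3, 3, 3))

# Preallocate with one extra column filled with 0, then copy x into the first columns
out <- array(0, dim(x) + c(0, 1, 0))
out[, seq_len(ncol(x)), ] <- x

# Cross-check against the apply()-based answer
ref <- aperm(apply(x, c(1, 3), c, 0), c(2, 1, 3))
isTRUE(all.equal(out, ref))

dim(out)   # 3 4 3
```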
Just for the challenge, another idea (with some extra sauce) that seems valid unless I've missed something:
add_col_or_row = function(x, n = 1, add_col = TRUE, fill = 0)
{
    m1 = matrix(x, ncol = if(add_col) nrow(x) * ncol(x) else nrow(x), byrow = TRUE)
    m2 = matrix(fill, nrow = if(add_col) dim(x)[3] else prod(dim(x)[-1]),
                ncol = if(add_col) nrow(x) * n else n)
    array(t(cbind(m1, m2)),
          c(nrow(x) + ((!add_col) * n), ncol(x) + (add_col * n), dim(x)[3]))
}
aa = array(1:24, c(3, 4, 2))
aa
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 1 4 7 10
#[2,] 2 5 8 11
#[3,] 3 6 9 12
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 13 16 19 22
#[2,] 14 17 20 23
#[3,] 15 18 21 24
add_col_or_row(aa, 2, T)
#, , 1
#
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 4 7 10 0 0
#[2,] 2 5 8 11 0 0
#[3,] 3 6 9 12 0 0
#
#, , 2
#
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 13 16 19 22 0 0
#[2,] 14 17 20 23 0 0
#[3,] 15 18 21 24 0 0
#
add_col_or_row(aa, 2, F)
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 1 4 7 10
#[2,] 2 5 8 11
#[3,] 3 6 9 12
#[4,] 0 0 0 0
#[5,] 0 0 0 0
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 13 16 19 22
#[2,] 14 17 20 23
#[3,] 15 18 21 24
#[4,] 0 0 0 0
#[5,] 0 0 0 0
And a benchmark using David Arenburg's data:
microbenchmark(flodel1(df), add_col_or_row(df), times = 20)
#Unit: milliseconds
# expr min lq median uq max neval
# flodel1(df) 35.69158 54.88014 55.58363 56.40300 58.31250 20
# add_col_or_row(df) 19.87134 38.57792 39.11297 39.58347 44.59873 20
identical("dimnames<-"(flodel1(df), NULL), add_col_or_row(df))
#[1] TRUE
In R, I would like to transform an array (dimensions: i, j, k) into a matrix in which each row corresponds to one (i, j) position of the array and holds that position's values across all k "layers". In other words, the columns of the new matrix correspond to the k dimension of the array, so the new matrix should have i*j rows and k columns.
Please let me know if I can clarify or provide an example of input / output!
Thanks!
Edit:
This code works (but is not optimized):
m = array(1:27,dim = c(3,3,3))
m
dim = dim(m)
mparam = dim[3]
listm = list()
for (i in 1:mparam){
listm[[i]] = as.vector(m[,,i])
}
untran = do.call(rbind,listm)
transposed = t(untran)
transposed
Like this?
m <- array(1:27,dim = c(3,3,3))
> m
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18
, , 3
[,1] [,2] [,3]
[1,] 19 22 25
[2,] 20 23 26
[3,] 21 24 27
> matrix(m,9,3)
[,1] [,2] [,3]
[1,] 1 10 19
[2,] 2 11 20
[3,] 3 12 21
[4,] 4 13 22
[5,] 5 14 23
[6,] 6 15 24
[7,] 7 16 25
[8,] 8 17 26
[9,] 9 18 27
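To check that this reshape matches the loop from the question, and to see which (i, j) position a given row comes from, something like the following sketch may help. The sizes are taken from dim(m) rather than hard-coded, and the names reshaped and ij are introduced here for illustration.

```r
m <- array(1:27, dim = c(3, 3, 3))

# Loop version from the question: one vector per layer, bound and transposed
listm <- list()
for (i in seq_len(dim(m)[3])) {
  listm[[i]] <- as.vector(m[, , i])
}
transposed <- t(do.call(rbind, listm))

# Reshape version with the sizes taken from dim(m)
reshaped <- matrix(m, nrow = prod(dim(m)[1:2]), ncol = dim(m)[3])
all(transposed == reshaped)   # TRUE

# arrayInd() maps a row number of the matrix back to its (i, j) position
ij <- arrayInd(5, dim(m)[1:2])
ij                            # row 5 corresponds to i = 2, j = 2
all(reshaped[5, ] == m[ij[1], ij[2], ])   # TRUE
```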