I've got an array like this:
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18
, , 3
[,1] [,2] [,3]
[1,] 19 22 25
[2,] 20 23 26
[3,] 21 24 27
And I would like to add a column filled with 0 to each slice, to finally have this:
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 0
[2,] 2 5 8 0
[3,] 3 6 9 0
, , 2
[,1] [,2] [,3] [,4]
[1,] 10 13 16 0
[2,] 11 14 17 0
[3,] 12 15 18 0
, , 3
[,1] [,2] [,3] [,4]
[1,] 19 22 25 0
[2,] 20 23 26 0
[3,] 21 24 27 0
Is there a simple way to do this using R?
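For reference, the example array above can be reproduced with (the values 1 to 27 in a 3 x 3 x 3 array):
x <- array(1:27, dim = c(3, 3, 3))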
Here is a way:
library(abind)
abind(x, array(0, replace(dim(x), 2, 1)), along = 2)
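The block being bound on is a 3 x 1 x 3 array of zeros: replace(dim(x), 2, 1) keeps the row and slice counts of x but sets the column count to 1, and along = 2 glues it onto every slice as an extra column. A quick dimension check (with x as constructed above):
dim(abind(x, array(0, replace(dim(x), 2, 1)), along = 2))
# [1] 3 4 3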
And another one:
aperm(apply(x, c(1, 3), c, 0), c(2, 1, 3))
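Here apply(x, c(1, 3), c, 0) appends a 0 to each row-by-slice fibre, but the function's result (length 4) becomes the first dimension of apply's output, so the intermediate array is 4 x 3 x 3; aperm(..., c(2, 1, 3)) swaps that dimension back into the column position (again with x as above):
dim(apply(x, c(1, 3), c, 0))                      # 4 3 3
dim(aperm(apply(x, c(1, 3), c, 0), c(2, 1, 3)))   # 3 4 3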
You could also try something like this (though it's a bit manual, it should be faster than the other base R solution):
indx <- dim(df) + c(0, 1, 0)
array(sapply(1:indx[3], function(x) cbind(df[,,x], 0)), indx)
Some benchmarks
n <- 1e5
df <- array(1:27, c(3, 3, n))
library(abind)
library(microbenchmark)
flodel1 <- function(x) abind(x, array(0, replace(dim(x), 2, 1)), along = 2)
flodel2 <- function(x) aperm(apply(x, c(1, 3), c, 0), c(2, 1, 3))
David <- function(x) {indx <- dim(x) + c(0, 1, 0) ; array(sapply(seq_len(indx[3]), function(y) cbind(x[,,y], 0)), indx)}
Res <- microbenchmark(flodel1(df),
flodel2(df),
David(df))
# Unit: milliseconds
# expr min lq mean median uq max neval
# flodel1(df) 45.8943 65.37496 90.68902 90.24751 107.5991 159.9881 100
# flodel2(df) 553.4831 634.73127 673.95636 679.79709 710.0540 808.6248 100
# David(df) 434.9524 531.85597 576.77011 555.46865 626.3344 757.9358 100
Just for the challenge, another idea (with some extra sauce) that seems valid unless I've missed something:
add_col_or_row = function(x, n = 1, add_col = T, fill = 0)
{
  # m1: with add_col, each row holds one slice of x (in column-major order);
  # otherwise each row holds one column of x
  m1 = matrix(x, ncol = if(add_col) nrow(x) * ncol(x) else nrow(x), byrow = T)
  # m2: the block of fill values that becomes the new columns (or rows)
  m2 = matrix(fill, nrow = if(add_col) dim(x)[3] else prod(dim(x)[-1]),
              ncol = if(add_col) nrow(x) * n else n)
  # bind the fill block on, transpose back and restore the array dimensions
  array(t(cbind(m1, m2)),
        c(nrow(x) + ((!add_col) * n), ncol(x) + (add_col * n), dim(x)[3]))
}
aa = array(1:24, c(3, 4, 2))
aa
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 1 4 7 10
#[2,] 2 5 8 11
#[3,] 3 6 9 12
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 13 16 19 22
#[2,] 14 17 20 23
#[3,] 15 18 21 24
add_col_or_row(aa, 2, T)
#, , 1
#
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 4 7 10 0 0
#[2,] 2 5 8 11 0 0
#[3,] 3 6 9 12 0 0
#
#, , 2
#
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 13 16 19 22 0 0
#[2,] 14 17 20 23 0 0
#[3,] 15 18 21 24 0 0
#
add_col_or_row(aa, 2, F)
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 1 4 7 10
#[2,] 2 5 8 11
#[3,] 3 6 9 12
#[4,] 0 0 0 0
#[5,] 0 0 0 0
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 13 16 19 22
#[2,] 14 17 20 23
#[3,] 15 18 21 24
#[4,] 0 0 0 0
#[5,] 0 0 0 0
And a benchmark using David Arenburg's data:
microbenchmark(flodel1(df), add_col_or_row(df), times = 20)
#Unit: milliseconds
# expr min lq median uq max neval
# flodel1(df) 35.69158 54.88014 55.58363 56.40300 58.31250 20
# add_col_or_row(df) 19.87134 38.57792 39.11297 39.58347 44.59873 20
identical("dimnames<-"(flodel1(df), NULL), add_col_or_row(df))
#[1] TRUE
Let A and B be arrays, of dimension [3,4,5] and [4,5], respectively.
E.g.,
A <- array(100,c(3, 4,5))
B <- array(80, c(4,5))
My desired answer is an array C of dimension [3,4,5] such that
C[i,j,k] = A[i,j,k] - B[j,k]
for all i,j,k
Edit: Which answer has the fastest code?
To evaluate the three answers below, I executed the following code to measure the running time of each.
The result is the following:
> mb
Unit: microseconds
expr min lq mean median uq max neval
f1 28.4 33.00 37.329 34.75 37.00 213.5 100
f2 32.5 37.65 40.069 38.95 40.55 103.0 100
f3 33.8 40.25 42.397 41.65 43.30 64.5 100
Thus f1 is the fastest, so I chose the answer of #user10488504 as the answer to this question.
Thank you #user10488504, #Stéphane Laurent and #Lyngbakr. I will use your suggested code in my package; it helps me very much.
The code that measures the running times:
f1 <- function(){
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
C <- array(aperm(sapply(1:dim(A)[1], function(i) A[i,,] - B)), dim(A))
}
f2<-function(){
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
sweep(A, c(2,3), B)
}
f3 <- function(){
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
# Perform calculation
res <- array(t(apply(A, MARGIN = 1, function(x)x-B)), c(3, 4, 5))
}
library(microbenchmark)
library(ggplot2)
mb <- microbenchmark(
f1 = f1(),
f2 = f2(),
f3 = f3()
)
mb
autoplot(mb)
With sweep:
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
> sweep(A, c(2,3), B)
, , 1
[,1] [,2] [,3] [,4]
[1,] 0 2 4 6
[2,] 1 3 5 7
[3,] 2 4 6 8
, , 2
[,1] [,2] [,3] [,4]
[1,] 8 10 12 14
[2,] 9 11 13 15
[3,] 10 12 14 16
, , 3
[,1] [,2] [,3] [,4]
[1,] 16 18 20 22
[2,] 17 19 21 23
[3,] 18 20 22 24
, , 4
[,1] [,2] [,3] [,4]
[1,] 24 26 28 30
[2,] 25 27 29 31
[3,] 26 28 30 32
, , 5
[,1] [,2] [,3] [,4]
[1,] 32 34 36 38
[2,] 33 35 37 39
[3,] 34 36 38 40
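sweep(A, c(2, 3), B) subtracts B[j, k] from every A[i, j, k]. A couple of quick checks against the definition (with the A and B defined above; both should return TRUE):
all(sweep(A, c(2, 3), B)[2, , ] == A[2, , ] - B)
all(sweep(A, c(2, 3), B) == A - rep(B, each = dim(A)[1]))
The second line works because the first dimension of A varies fastest in memory, so repeating each element of B dim(A)[1] times lines it up with A's internal layout.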
You can do this with sapply and aperm like this:
C <- array(aperm(sapply(1:dim(A)[1], function(i) A[i,,] - B)), dim(A))
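Each A[i, , ] - B is a 4 x 5 matrix; sapply stacks these as the columns of a 20 x 3 matrix, aperm (with no permutation given, this is just a transpose for a matrix) turns it into 3 x 20, and array(..., dim(A)) restores the 3 x 4 x 5 shape so that C[i, j, k] = A[i, j, k] - B[j, k]. A quick check (with the A and B from the question):
all(C[3, , ] == A[3, , ] - B)   # should be TRUE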
Here's an attempt that uses apply.
# Define arrays
A <- array(1:100, c(3, 4, 5))
B <- array(1:80, c(4,5))
# Perform calculation
res <- array(t(apply(A, MARGIN = 1, function(x)x-B)), c(3, 4, 5))
# Check result
res[1,,]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 8 16 24 32
#> [2,] 2 10 18 26 34
#> [3,] 4 12 20 28 36
#> [4,] 6 14 22 30 38
A[1,,] - B
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 8 16 24 32
#> [2,] 2 10 18 26 34
#> [3,] 4 12 20 28 36
#> [4,] 6 14 22 30 38
Created on 2019-06-19 by the reprex package (v0.3.0)
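The t() is needed because apply(A, MARGIN = 1, ...) flattens each 4 x 5 result into a 20-element column, producing a 20 x 3 matrix; transposing and refilling with dims c(3, 4, 5) puts the first dimension of A back in front. For instance (with the A and B above):
dim(apply(A, MARGIN = 1, function(x) x - B))   # 20 3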
For each of my N variables, I have a (T * M) feature matrix, i.e., M observations per t \in T. The problem is how to convert this into a (T * N * M) array. In the following example, N=2, T=3, M=4:
x1 <- matrix(1:24, 3,4)
> x1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
x2 <- matrix(25:48, 3,4)
x2
[,1] [,2] [,3] [,4]
[1,] 25 28 31 34
[2,] 26 29 32 35
[3,] 27 30 33 36
And I need to make a 3-dimensional array (one element per row of the original matrices), such that the first element is
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 25 28 31 34
and the second is:
[,1] [,2] [,3] [,4]
[1,] 2 5 8 11
[2,] 26 29 32 35
and third:
[,1] [,2] [,3] [,4]
[1,] 3 6 9 12
[2,] 27 30 33 36
and so on and so forth. For the following example, the output's dimensions should be (3,2,4).
I need to do this for relatively large N and T, so I'd appreciate implementations that scale!
Here is a base R option.
out <- `dim<-`(rbind(c(t(x1)), c(t(x2))), c(2, 4, 3))
out
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 1 4 7 10
#[2,] 25 28 31 34
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 2 5 8 11
#[2,] 26 29 32 35
#
#, , 3
#
# [,1] [,2] [,3] [,4]
#[1,] 3 6 9 12
#[2,] 27 30 33 36
When we call x <- rbind(c(t(x1)), c(t(x2))), we get the following matrix as a result:
x
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,] 1 4 7 10 2 5 8 11 3 6 9 12
#[2,] 25 28 31 34 26 29 32 35 27 30 33 36
We need to change the dimensions of this object, which we can do with
dim(x) <- c(2, 4, 3)
Another way to get the same result is to call the replacement method of dim in its functional form, i.e.
`dim<-`(...)
which allows us to do it all in one line.
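For more than two matrices, a scalable variant (a sketch, assuming the inputs are collected in a list, here called mats, of equally sized T-by-M matrices) is to stack them with simplify2array and then permute the dimensions:
mats <- list(x1, x2)    # in practice, a list of N matrices
out2 <- aperm(simplify2array(mats), c(3, 2, 1))
identical(out2, out)    # should be TRUE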
I'm terribly stuck reshaping a matrix into a multidimensional array to be used in convolutional neural network training.
Here's a sample 2x16 matrix (the actual matrix will be around 11000x1024)
two_samples <- structure(c(257, 17, 258, 18, 65795, 19, 65796, 20, 261, 21,
262, 22, 65799, 23, 65800, 24, 9, 25, 10, 26, 65547, 27, 65548,
28, 13, 29, 14, 30, 65551, 31, 65552, 32), .Dim = c(2L, 16L))
two_samples
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 257 258 65795 65796 261 262 65799 65800 9 10 65547 65548
#> [2,] 17 18 19 20 21 22 23 24 25 26 27 28
#> [,13] [,14] [,15] [,16]
#> [1,] 13 14 65551 65552
#> [2,] 29 30 31 32
Here, each row is one sample, and I would like to represent each sample as if it were a color image. I would like to arrange the data so that it can be used with Keras/TensorFlow for CNN training.
Generating RGB data from the decimal numbers is done via the bitwAnd() and bitwShiftR() functions. I ran the commands below to generate the RGB values from the matrix. Afterwards I used array() with a dim argument and aperm() to reshape the result:
mat_r <- bitwAnd(bitwShiftR(t(two_samples),16), 255)
mat_g <- bitwAnd(bitwShiftR(t(two_samples),8), 255)
mat_b <- bitwAnd(t(two_samples),255)
two_samples_flat <- array(c(mat_r, mat_g, mat_b))
arr <- array(two_samples_flat, dim=c(4,4,3,2))
data <- aperm(arr, c(4,1,2,3))
data[1,,,]
#> , , 1
#>
#> [,1] [,2] [,3] [,4]
#> [1,] 0 0 0 0
#> [2,] 0 0 0 0
#> [3,] 1 1 1 1
#> [4,] 1 1 1 1
#>
#> , , 2
#>
#> [,1] [,2] [,3] [,4]
#> [1,] 0 0 0 0
#> [2,] 0 0 0 0
#> [3,] 0 0 0 0
#> [4,] 0 0 0 0
#>
#> , , 3
#>
#> [,1] [,2] [,3] [,4]
#> [1,] 1 1 0 0
#> [2,] 1 1 0 0
#> [3,] 1 1 0 0
#> [4,] 1 1 0 0
However, the bit functions return flat vectors, and modifying the dimensions afterwards does not allow correct slicing of the multidimensional array. My expected dimensions and output (one sample shown) are shown below:
> dim(data)
2 4 4 3
> data[1,,,]
, , 1
[,1] [,2] [,3] [,4]
[1,] 0 0 1 1
[2,] 0 0 1 1
[3,] 0 0 1 1
[4,] 0 0 1 1
, , 2
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] 1 1 1 1
[3,] 0 0 0 0
[4,] 0 0 0 0
, , 3
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
So, my question is: how can I prepare a multidimensional array from a matrix so that I can use it with a CNN? If I get this right, I can use array_reshape to reshape it for a dense-layer network: array_reshape(data, c(num_of_samples, width*height*3))
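One possible way to build the expected layout (a minimal sketch, assuming each sample's values fill a 4 x 4 image row by row and using the same channel extraction as above):
h <- 4; w <- 4                       # image height and width; h * w must equal ncol(two_samples)
n <- nrow(two_samples)
data <- array(0L, dim = c(n, h, w, 3))
for (i in seq_len(n)) {
  v <- two_samples[i, ]
  data[i, , , 1] <- matrix(bitwAnd(bitwShiftR(v, 16), 255), h, w, byrow = TRUE)  # red
  data[i, , , 2] <- matrix(bitwAnd(bitwShiftR(v, 8), 255), h, w, byrow = TRUE)   # green
  data[i, , , 3] <- matrix(bitwAnd(v, 255), h, w, byrow = TRUE)                  # blue
}
dim(data)        # 2 4 4 3
data[1, , , 3]   # should reproduce the expected blue channel (1 to 16 filled by row)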
Say I had an array with x representing repeat measurements (1-4), y representing treatments (A, B), and z representing timepoints (1-3):
x <- c(2,2,4,15,17,13,3,10,3,4,11,14,1,3,19,6,13,6,12,18,9,13,12,16)
dim(x) <- c(4,2,3)
, , 1
[,1] [,2]
[1,] 2 17
[2,] 2 13
[3,] 4 3
[4,] 15 10
, , 2
[,1] [,2]
[1,] 3 1
[2,] 4 3
[3,] 11 19
[4,] 14 6
, , 3
[,1] [,2]
[1,] 13 9
[2,] 6 13
[3,] 12 12
[4,] 18 16
I want to create a new array that has the number of times each replicate is greater than all other replicates for that treatment and timepoint combination:
, , 1
[,1] [,2]
[1,] 2 0 # both 4 and 15 are bigger than 2, so for 1,1,1 the result is 2
[2,] 2 1
[3,] 1 3 # 15 is the only replicate bigger than 4, so the result for 3,1,1 is 1
[4,] 0 2
, , 2
[,1] [,2]
[1,] 3 3
[2,] 2 2
[3,] 1 0
[4,] 0 1
, , 3
[,1] [,2]
[1,] 1 3
[2,] 3 1
[3,] 2 2
[4,] 0 0
apply can do this, acting within each column (margin 2) and slice (margin 3):
## recreate your data array:
arr <- c(2,2,4,15,17,13,3,10,3,4,11,14,1,3,19,6,13,6,12,18,9,13,12,16)
dim(arr) <- c(4,2,3)
## one liner using apply
apply(arr, 2:3, function(x) sapply(x, function(y) sum(y < x) ) )
#, , 1
#
# [,1] [,2]
#[1,] 2 0
#[2,] 2 1
#[3,] 1 3
#[4,] 0 2
#
#, , 2
#
# [,1] [,2]
#[1,] 3 3
#[2,] 2 2
#[3,] 1 0
#[4,] 0 1
#
#, , 3
#
# [,1] [,2]
#[1,] 1 3
#[2,] 3 1
#[3,] 2 2
#[4,] 0 0
Here you go... If your question is incorrectly phrased (as I suspect above), then you will need to use "<" instead of ">".
a <- array(rnorm(24), dim= c(4,2,3))
cnts <- function(a) {
a2 <- array(NA, dim= dim(a))
for (i in 1:dim(a)[3]) {
for (j in 1:dim(a)[2]) {
for (k in 1:length(a[,j,i])) {
a2[k,j,i] <- sum(a[k,j,i] > a[,j,i])
}
}
}
return(a2)
}
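With the comparison flipped to "<", the loop reproduces the apply one-liner above; a quick check on the question's data (arr as defined in the previous answer):
cnts_lt <- function(a) {
  a2 <- array(NA_integer_, dim = dim(a))
  for (i in 1:dim(a)[3])
    for (j in 1:dim(a)[2])
      for (k in 1:dim(a)[1])
        a2[k, j, i] <- sum(a[k, j, i] < a[, j, i])
  a2
}
all(cnts_lt(arr) == apply(arr, 2:3, function(x) sapply(x, function(y) sum(y < x))))
# should be TRUE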
I'm sorry, but I can't find a way to replace every 0 in a 3-dimensional array with the mean of the column it is in.
So, for example, let's say I have this array j:
j
, , 1
[,1] [,2]
[1,] 0 6
[2,] 1 5
[3,] 2 2
, , 2
[,1] [,2]
[1,] 11 0
[2,] 14 12
[3,] 0 14
, , 3
[,1] [,2]
[1,] 19 22
[2,] 20 23
[3,] 21 24
I would like
j
, , 1
[,1] [,2]
[1,] 1 6
[2,] 1 5
[3,] 2 2
, , 2
[,1] [,2]
[1,] 11 26/3
[2,] 14 12
[3,] 25/3 14
, , 3
[,1] [,2]
[1,] 19 22
[2,] 20 23
[3,] 21 24
You can use apply for this.
Starting with the following data:
arr <- array(0:5,dim=c(4,3,2))
, , 1
[,1] [,2] [,3]
[1,] 0 4 2
[2,] 1 5 3
[3,] 2 0 4
[4,] 3 1 5
, , 2
[,1] [,2] [,3]
[1,] 0 4 2
[2,] 1 5 3
[3,] 2 0 4
[4,] 3 1 5
You can do:
apply(arr, c(2,3),function(v) { v[v==0] <- mean(v); v})
Which gives:
, , 1
[,1] [,2] [,3]
[1,] 1.5 4.0 2
[2,] 1.0 5.0 3
[3,] 2.0 2.5 4
[4,] 3.0 1.0 5
, , 2
[,1] [,2] [,3]
[1,] 1.5 4.0 2
[2,] 1.0 5.0 3
[3,] 2.0 2.5 4
[4,] 3.0 1.0 5
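Applied to the j array from the question (reconstructed here from the printed values), the same call produces the desired result; for example, the 0 in column 1 of the first slice becomes (0 + 1 + 2) / 3 = 1:
j <- array(c(0, 1, 2, 6, 5, 2,
             11, 14, 0, 0, 12, 14,
             19, 20, 21, 22, 23, 24), dim = c(3, 2, 3))
apply(j, c(2, 3), function(v) { v[v == 0] <- mean(v); v })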