I have a 2D array A with 10,000 rows and 2 columns.
First, I want to use only the first 200 rows of A. I did the following:
New_array <- A[1:200, ]
On each iteration I want to increase the number of rows by 50,
i.e. in the second iteration I want to access the first 250 rows of A, in the third iteration the first 300, and so on until I reach the original size of the matrix.
I know that I have to write a for loop, but I am struggling. Any help would be highly appreciated.
The seq function lets you specify the interval between the elements of a sequence, as shown in @d.b's comment.
seq(0, 20, by = 5)
[1] 0 5 10 15 20
The output of seq can then be used to drive the loop. Here i is used as the endpoint for a sequence in each iteration.
for ( i in seq(5, 20, by = 5) ) {
print(1:i)
}
[1] 1 2 3 4 5
[1] 1 2 3 4 5 6 7 8 9 10
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Applied to your example, the sequences can be used to subset the matrix
# Example matrix
m <- 10000
n <- 2
A <- matrix(1:(m*n), ncol = n)
head(A)
[,1] [,2]
[1,] 1 10001
[2,] 2 10002
[3,] 3 10003
[4,] 4 10004
[5,] 5 10005
[6,] 6 10006
# Iterate with a loop
jump <- 5 # I'm using 5 instead of 50
for ( i in seq(jump, m, by = jump) ) {
print(paste("i =", i))
print( A[1:i, ] ) # subset the matrix
if ( i > 15 ) break # limiting the output for readability
}
[1] "i = 5"
[,1] [,2]
[1,] 1 10001
[2,] 2 10002
[3,] 3 10003
[4,] 4 10004
[5,] 5 10005
[1] "i = 10"
[,1] [,2]
[1,] 1 10001
[2,] 2 10002
[3,] 3 10003
[4,] 4 10004
[5,] 5 10005
[6,] 6 10006
[7,] 7 10007
[8,] 8 10008
[9,] 9 10009
[10,] 10 10010
[1] "i = 15"
[,1] [,2]
[1,] 1 10001
[2,] 2 10002
[3,] 3 10003
[4,] 4 10004
[5,] 5 10005
[6,] 6 10006
[7,] 7 10007
[8,] 8 10008
[9,] 9 10009
[10,] 10 10010
[11,] 11 10011
[12,] 12 10012
[13,] 13 10013
[14,] 14 10014
[15,] 15 10015
[1] "i = 20"
[,1] [,2]
[1,] 1 10001
[2,] 2 10002
[3,] 3 10003
[4,] 4 10004
[5,] 5 10005
[6,] 6 10006
[7,] 7 10007
[8,] 8 10008
[9,] 9 10009
[10,] 10 10010
[11,] 11 10011
[12,] 12 10012
[13,] 13 10013
[14,] 14 10014
[15,] 15 10015
[16,] 16 10016
[17,] 17 10017
[18,] 18 10018
[19,] 19 10019
[20,] 20 10020
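For the numbers in the question (start at 200 rows, grow by 50 until all 10,000 rows are used), the same pattern could look like the sketch below. It assumes A is built as in the example above. The names sizes and row_counts are only illustrative, and row_counts stands in for whatever you actually compute on each subset.

```r
# Example matrix with 10,000 rows and 2 columns (same construction as above)
A <- matrix(1:20000, ncol = 2)

# Subset sizes: 200, 250, 300, ..., 10000
sizes <- seq(200, nrow(A), by = 50)

# Placeholder result: here we only record how many rows each subset has
row_counts <- integer(length(sizes))

for (k in seq_along(sizes)) {
  # drop = FALSE keeps the result a matrix in every case
  New_array <- A[1:sizes[k], , drop = FALSE]
  row_counts[k] <- nrow(New_array)
}
```

Looping over seq_along(sizes) rather than over the sizes themselves makes it easy to store one result per iteration.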
I need to create a 3D array filled row by row: from left to right, then top to bottom.
x <- 100
I have tried with this:
b <- array(1:96, dim= c(8,4,3))
but it fills down the columns first. Using aperm(b) doesn't work either.
The result I want is this:
, , 1
1 2 3 4 5
6 7 8 9 10
11 12 13 14
15 16 17 18
19 20 21 22
By default, array fills values along the 1st dimension, then the 2nd, then the 3rd. What you are looking for is a fill in the order (2nd, 1st, 3rd). You can create the array with the 1st and 2nd dimensions switched and then use aperm on it:
b <- aperm(array(1:96, dim= c(4,8,3)), c(2,1,3))
# the first two dimensions are switched twice: once in dim= c(4,8,3) and once in the permutation c(2,1,3)
b
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
Edit: This was my first try, but @Psidom's answer is the right way to do this.
You need to build it from 3 matrices and then combine them into an array. In the code below I wrote the endpoints as 96*i/3 so that the pattern extends to more than 3 matrices.
b <- array(c(aperm(array(1:(96*1/3), dim = c(4,8))),
             aperm(array(33:(96*2/3), dim = c(4,8))),
             aperm(array(65:(96*3/3), dim = c(4,8)))),
           dim = c(8, 4, 3))
This will be the output:
b[, , 1]
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 5 6 7 8
# [3,] 9 10 11 12
# [4,] 13 14 15 16
# [5,] 17 18 19 20
# [6,] 21 22 23 24
# [7,] 25 26 27 28
# [8,] 29 30 31 32
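As a cross-check, the row-wise fill can also be built slice by slice with matrix(..., byrow = TRUE) and compared against the aperm version. This is only a sketch: the names n_row, n_col, n_slice, b_aperm and b_byrow are made up for illustration.

```r
n_row <- 8
n_col <- 4
n_slice <- 3

# aperm approach: create with rows and columns switched, then swap them back
b_aperm <- aperm(array(1:96, dim = c(n_col, n_row, n_slice)), c(2, 1, 3))

# Slice-by-slice approach: fill each 8 x 4 matrix by row
b_byrow <- array(NA_integer_, dim = c(n_row, n_col, n_slice))
for (k in seq_len(n_slice)) {
  # values belonging to slice k: 1:32, 33:64, 65:96
  vals <- ((k - 1) * n_row * n_col + 1):(k * n_row * n_col)
  b_byrow[, , k] <- matrix(vals, nrow = n_row, ncol = n_col, byrow = TRUE)
}

# TRUE if both constructions agree element by element
same <- all(b_aperm == b_byrow)
```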
I know there have been similar topics, but they don't seem to answer my problem. I have a 3-dimensional array composed of 10 x 5 matrices.
I want to fill each matrix by row with, let's say, 1:5.
Previous topics talk about aperm, but here is the problem: since my matrices are not square, when I first fill them by column as follows:
kappa=array(0,dim=c(10,5,2))
kappa[1:10,,1]=1:5
kappa[,,1]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 2 2 2
[3,] 3 3 3 3 3
[4,] 4 4 4 4 4
[5,] 5 5 5 5 5
[6,] 1 1 1 1 1
[7,] 2 2 2 2 2
[8,] 3 3 3 3 3
[9,] 4 4 4 4 4
[10,] 5 5 5 5 5
Since the column dimension is smaller than the row dimension, the sequence is recycled. So when I aperm the result, it gives:
aperm(kappa[,,1])
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 2 3 4 5 1 2 3 4 5
[2,] 1 2 3 4 5 1 2 3 4 5
[3,] 1 2 3 4 5 1 2 3 4 5
[4,] 1 2 3 4 5 1 2 3 4 5
[5,] 1 2 3 4 5 1 2 3 4 5
But what I want instead is
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 1 2 3 4 5
[3,] 1 2 3 4 5
[4,] 1 2 3 4 5
[5,] 1 2 3 4 5
[6,] 1 2 3 4 5
[7,] 1 2 3 4 5
[8,] 1 2 3 4 5
[9,] 1 2 3 4 5
[10,] 1 2 3 4 5
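One way to get this layout without aperm is to hand each slice values that are already arranged by row. The sketch below shows two options that should give the same result. It reuses the kappa array from the question; nothing else is assumed.

```r
kappa <- array(0, dim = c(10, 5, 2))

# Option 1: build a 10 x 5 matrix filled by row and assign it to the slice
kappa[, , 1] <- matrix(1:5, nrow = 10, ncol = 5, byrow = TRUE)

# Option 2: repeat each value once per row, so that the usual
# column-major fill produces constant columns 1, 2, 3, 4, 5
kappa[, , 2] <- rep(1:5, each = nrow(kappa))

kappa[1:3, , 1]
```

Both slices end up with every row equal to 1:5.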
First, I have a matrix like this:
x <- matrix(rnorm(1e3),260)
and then an array:
lst <- lapply(seq(1,length(x[,1]), by=52), function(i) x[i:(i+51),])
Data_array <- array(unlist(lst), dim=c(52,length(x[1,]),(length(x[,1])/52)))
This array splits the data frame into consecutive blocks of 52 rows (weeks).
It's a weekly temporal analysis.
I would like to compute an ecdf on this array.
, , 1
[,1] [,2] [,3]
[1,] **0.66319631** 0.01004290 0.02133477
[2,] -1.64273648 0.23105503 1.02862145
[3,] 1.17083363 -0.49700717 -0.01119745
, , 2
[,1] [,2] [,3]
[1,] **-0.79365987** 1.28394049 -0.547763434
[2,] -0.09221301 1.07676841 0.570294731
[3,] 0.20293308 1.00182888 0.247373981
, , 3
[,1] [,2] [,3]
[1,] **1.03862172** -0.961678683 1.25334651
[2,] 0.58476540 0.745250484 -0.06183788
[3,] 0.24057690 1.226575038 0.23363005
I want to compute the ecdf for each cell. It's for a weekly seasonal analysis,
i.e. compute the quantiles for this time series (marked with **): 0.66319631; -0.79365987; 1.03862172
For the mean it works:
array_lag_sum<-apply(Data_array,c(1,2),FUN=function(x){mean(x,na.rm=TRUE)})
I tried a similar call with ecdf, but it doesn't work:
percent_array<-apply(Data_array,c(1,2),FUN=function(u){ecdf(u)(u)})
Then, since that is not the end of it, I would like to reshape this array back into the original format of the data frame (x), like an rbind but on an array.
Thank you so much for your help.
Edit:
Sorry, I don't know if I was clear enough. Arrays are certainly complicated for me,
but with your method, if I have this simple data frame:
B <- matrix(seq(1,20), 20, 3)
> B
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
[10,] 10 10 10
[11,] 11 11 11
[12,] 12 12 12
[13,] 13 13 13
[14,] 14 14 14
[15,] 15 15 15
[16,] 16 16 16
[17,] 17 17 17
[18,] 18 18 18
[19,] 19 19 19
[20,] 20 20 20
Your function gives :
Data_array <- array( B, dim=c(10,3,5))
, , 1
[,1] [,2] [,3]
[1,] 1 11 1
[2,] 2 12 2
[3,] 3 13 3
[4,] 4 14 4
[5,] 5 15 5
[6,] 6 16 6
[7,] 7 17 7
[8,] 8 18 8
[9,] 9 19 9
[10,] 10 20 10
, , 2
[,1] [,2] [,3]
[1,] 11 1 11
[2,] 12 2 12
[3,] 13 3 13
[4,] 14 4 14
[5,] 15 5 15
[6,] 16 6 16
[7,] 17 7 17
[8,] 18 8 18
[9,] 19 9 19
[10,] 20 10 20
whereas I would like something more like this:
,,1
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
[10,] 10 10 10
,,2
[,1] [,2] [,3]
[1,] 11 11 11
[2,] 12 12 12
[3,] 13 13 13
[4,] 14 14 14
[5,] 15 15 15
[6,] 16 16 16
[7,] 17 17 17
[8,] 18 18 18
[9,] 19 19 19
[10,] 20 20 20
and get as a result a table holding the percentile values of the time series:
the percentile values of 1 and 11, of 2 and 12, and so on, for each column and each row (I know it's not meaningful, it's just an example).
Sorry if my last question was hard to understand.
The answer is:
ecdf_mat <- apply( Data_array, 1:2, ecdf)
This passes the values from each combination of the first two indices to the function ecdf. Each of those calls returns a function into a matrix location. You are getting something most people will not be able to use without a bit of coaching: one 52 x 4 matrix of functions. The functions are contained in lists, which are valid matrix or array elements:
> dim(apply( Data_array, 1:2, ecdf) )
[1] 52 4
To access them you need to first pull them out of the matrix with standard "[" indexing but then pull them out of the list container with a call to "[[1]]":
> str(apply( Data_array, 1:2, ecdf)[1,1] )
List of 1
$ :function (v)
..- attr(*, "class")= chr [1:3] "ecdf" "stepfun" "function"
..- attr(*, "call")= language FUN(newX[, i], ...)
> apply( Data_array, 1:2, ecdf)[1,1][[1]]
Empirical CDF
Call: FUN(newX[, i], ...)
x[1:5] = -0.92217, -0.37471, 0.058284, 0.28502, 0.44391
> apply( Data_array, 1:2, ecdf)[1,1][[1]](0)
[1] 0.4
Edit:------
It appears you don't want the ecdf's themselves (despite getting no response to my efforts at getting you to recognize the distinction), but rather want an identically shaped array with the percentile values for the i-j positions considered as individual length-k sequences. I can think of two ways to do this. The first one would use that matrix of ecdf functions I built and demonstrated, but I believe that is the more baroque method and it would be easier to give you a more direct route. I've taken the liberty of making this more manageable by making the long first dimension only 10 long.
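x <- matrix(rnorm(1e3),260)
# split into blocks of 10 rows so that the first dimension is only 10 long
lst <- lapply(seq(1,length(x[,1]), by=10), function(i) x[i:(i+9),])
# array() uses only the first prod(dim) values, i.e. the first 5 blocks
Data_array <- array(unlist(lst), dim=c(10,length(x[1,]),(length(x[,1])/52)))
pctiles2 <- apply( Data_array, 1:2, function(x) ecdf(x)(x) )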
> str(pctiles2)
num [1:5, 1:10, 1:4] 0.8 0.4 0.6 0.2 1 0.4 1 0.2 0.6 0.8 ...
They aren't actually percentiles, but that could be easily remedied by slipping a 100* in front of the ecdf call (or multiplying the result by 100). You will notice that the structure has been permuted so that the quantile/percentile sequences run along the first dimension. That is because apply always delivers its result in column-major order. There is a function aperm which allows you to re-arrange these in the original order:
re_pctiles <- aperm(pctiles2, c(2,3,1) )
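To tie this back to the small B example from the question, here is a sketch of the whole round trip: split the rows into blocks, compute the ecdf value per cell across blocks, restore the dimension order, and stack the blocks back into the original 20 x 3 layout. The names block, n_blocks, pct, pct_array and pct_matrix are illustrative, and the block length of 10 stands in for the 52 weeks of the real data.

```r
B <- matrix(seq(1, 20), 20, 3)
block <- 10                        # rows per period (52 in the real data)
n_blocks <- nrow(B) / block

# Put consecutive row blocks into slices:
# Data_array[i, j, k] is B[(k - 1) * block + i, j]
Data_array <- aperm(array(t(B), dim = c(ncol(B), block, n_blocks)), c(2, 1, 3))

# ecdf value of each cell relative to the same cell in the other slices;
# apply() returns the slice dimension first: n_blocks x block x ncol(B)
pct <- apply(Data_array, 1:2, function(v) ecdf(v)(v))

# Back to (row, column, slice) order
pct_array <- aperm(pct, c(2, 3, 1))

# Stack the slices on top of each other, like rbind on an array
pct_matrix <- do.call(rbind,
                      lapply(seq_len(n_blocks), function(k) pct_array[, , k]))
```

With this B, each cell is compared only with its counterpart 10 rows away, so the first block gets 0.5 everywhere and the second block gets 1.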
Consider this array:
the.seq <- 1:4
sol<- outer(outer(the.seq, the.seq, `+`), the.seq, `+`)
I want to find all elements that equal 6. That is pretty easy to do with which:
indices <- which(sol == 6)
indices
[1] 4 7 10 13 19 22 25 34 37 49
Now I want a matrix with the dimension indices of these elements. The answer would be:
[,1] [,2] [,3]
[1,] 4 1 1
[2,] 3 2 1
[3,] 2 3 1
[4,] 1 4 1
[5,] 3 1 2
[6,] 2 2 2
[7,] 1 3 2
[8,] 2 1 3
[9,] 1 2 3
[10,] 1 1 4
How would you do this?
You can use the arr.ind argument in which. When set to TRUE, which will return the array indices for which its first argument is TRUE.
which(sol == 6, arr.ind = TRUE)
# dim1 dim2 dim3
# [1,] 4 1 1
# [2,] 3 2 1
# [3,] 2 3 1
# [4,] 1 4 1
# [5,] 3 1 2
# [6,] 2 2 2
# [7,] 1 3 2
# [8,] 2 1 3
# [9,] 1 2 3
#[10,] 1 1 4
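If you already have the linear indices (as in the indices vector from the question), base R's arrayInd can convert them after the fact. A small sketch, where pos, same_as_which and sums_ok are just illustrative names:

```r
the.seq <- 1:4
sol <- outer(outer(the.seq, the.seq, `+`), the.seq, `+`)

# Linear (column-major) positions of the matching elements
indices <- which(sol == 6)

# Convert linear positions to one row of (dim1, dim2, dim3) per element
pos <- arrayInd(indices, dim(sol))

# Both routes should agree, and every index triple should sum to 6
same_as_which <- all(pos == which(sol == 6, arr.ind = TRUE))
sums_ok <- all(rowSums(pos) == 6)
```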
Within R I would like to transform an array (dimensions: i, j, k) into a matrix such that the observations (i.e. rows) of the new matrix are each element from the array pulled k "layers" at a time. Essentially, again, the rows of the new matrix will be composed of each element of the previous array with the columns of the matrix being equivalent to the k dimension of the array. Thus, the new matrix should be composed of i*j rows with k columns.
Please let me know if I can clarify or provide an example of input / output!
Thanks!
Edit:
This code works (but is not optimized):
m = array(1:27,dim = c(3,3,3))
m
dim = dim(m)
mparam = dim[3]
listm = list()
for (i in 1:mparam){
listm[[i]] = as.vector(m[,,i])
}
untran = do.call(rbind,listm)
transposed = t(untran)
transposed
Like this?
m <- array(1:27,dim = c(3,3,3))
> m
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18
, , 3
[,1] [,2] [,3]
[1,] 19 22 25
[2,] 20 23 26
[3,] 21 24 27
> matrix(m,9,3)
[,1] [,2] [,3]
[1,] 1 10 19
[2,] 2 11 20
[3,] 3 12 21
[4,] 4 13 22
[5,] 5 14 23
[6,] 6 15 24
[7,] 7 16 25
[8,] 8 17 26
[9,] 9 18 27
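For an array of any size, the same reshape can be written without hard-coding 9 and 3. The sketch below also compares it with the loop-based version from the question; d, flat and loop_version are illustrative names.

```r
m <- array(1:27, dim = c(3, 3, 3))
d <- dim(m)

# i*j rows, k columns; this works because arrays are stored column-major,
# so each "layer" m[, , k] is one contiguous run of i*j values
flat <- matrix(m, nrow = d[1] * d[2], ncol = d[3])

# Equivalent to the loop in the question: one vector per layer, then transpose
loop_version <- t(do.call(rbind,
                          lapply(seq_len(d[3]), function(k) as.vector(m[, , k]))))

agree <- all(flat == loop_version)
```

Row r of flat holds the element at array position (i, j) with r = i + (j - 1) * d[1], read across all k layers.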