R how to store jagged arrays - arrays

I need to build a jagged array to store data. Here is my code:
a<-list()
b=1
for(i in 1:5){
b+1
a[i]<-array(0,c(b,1))
}
As you can see, what I want to do is save arrays of different dimensions into a (the zeros here are only placeholders).
Could you please help me create a jagged array/list/matrix that stores arrays of different dimensions?
Thanks

To create jagged arrays (also known as ragged arrays), you can use a list. Here is an example built with list() and indexed with x[1]:
x = list(a=c(1,2,3,4),b=c(-1,-3,-4),c=c(23,45,23,45,25,48),d=c(2,1))
stk = stack(x)
ustk= unstack(stk)
identical(ustk,x)
[1] TRUE
> x
$a
[1] 1 2 3 4
$b
[1] -1 -3 -4
$c
[1] 23 45 23 45 25 48
$d
[1] 2 1
> x[1]
$a
[1] 1 2 3 4
> x[2]
$b
[1] -1 -3 -4
Another method:
# empty list to start with
X <- list()
# we get a vector
v1 <- c(1, 2, 3, 4, 5)
# add it to the ragged array
X <- c(X, list(v1))
# get another couple of vectors and add them as well
v2 <- c(9, 8, 7, 6)
v3 <- c(2, 4, 6, 8)
X <- c(X, list(v2, v3))
# append some more elements directly to the
# first vector stored in the list
X[[1]] <- c(X[[1]], 4, 3, 2, 1)
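For the loop in the question, two details get in the way: b+1 never assigns its result back to b, and a[i] <- ... uses single brackets, so the array is not stored whole in slot i. Below is a minimal sketch of a corrected loop, assuming the goal is a list whose i-th element is a column array with i+1 rows. Preallocating with vector("list", 5) is one option among several, not something taken from the original post.

```r
# Preallocate a list with five empty slots (assumed size, matching the loop).
a <- vector("list", 5)
b <- 1
for (i in 1:5) {
  b <- b + 1                    # assign the incremented value back to b
  a[[i]] <- array(0, c(b, 1))   # [[ ]] stores the whole array in slot i
}

# Each element keeps its own dimensions.
rows <- sapply(a, nrow)
print(rows)
print(dim(a[[3]]))
```

Each element is retrieved with a[[i]], while a[i] returns a one-element list.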
To create jagged matrices, here is an example:
x = c(3,2,5)
struct = lapply(x, function(i){
sapply(1:i, function(k){
c(rep(NA, k-1),exp( - (0:(i-k))))
})
})
> x
[1] 3 2 5
> struct
[[1]]
          [,1]      [,2] [,3]
[1,] 1.0000000        NA   NA
[2,] 0.3678794 1.0000000   NA
[3,] 0.1353353 0.3678794    1

[[2]]
          [,1] [,2]
[1,] 1.0000000   NA
[2,] 0.3678794    1

[[3]]
           [,1]       [,2]      [,3]      [,4] [,5]
[1,] 1.00000000         NA        NA        NA   NA
[2,] 0.36787944 1.00000000        NA        NA   NA
[3,] 0.13533528 0.36787944 1.0000000        NA   NA
[4,] 0.04978707 0.13533528 0.3678794 1.0000000   NA
[5,] 0.01831564 0.04978707 0.1353353 0.3678794    1

Related

R - Find unique minimum across array dimension

In a 4D array, I would like to find the unique minimum value across the 4th dimension. I want to get a matrix of the array indices for the minimum.
I have tried solving the issue with the following code block. I would have liked using which.min, but I haven't found a good way to return the array indices.
dims =c(3,3,3,4)
# create sample data with multiple mins in [,,,1]
mat_rep = array(c(rep(0,3),sample(1:prod(dims))), dim = dims)
pos_rep = apply(mat_rep, 4, function(x) which(x == min(x), arr.ind = T)) # get position of unique minimum
# create sample data with unique min
mat_norep = array(sample(1:prod(dims)), dim = dims)
pos_norep = apply(mat_norep, 4, function(x) which(x == min(x), arr.ind = T))
# formatting depending on the class of the pos_ object
format_pos = function(x, dims){
if(class(x) == "matrix") x = t(x)
if(class(x) == "list") x = do.call(rbind, lapply(x, head, 1))
x = cbind(x, 1:dims[4]) # add 4th dimension
return(x)
}
format_pos(pos_norep, dims = dims)
format_pos(pos_rep, dims = dims)
The solution above works for these examples, but it is not general, and in my opinion the if(class()) checks and cbind(x, 1:dims[4]) are error-prone.
Does someone have a cleaner way of solving this issue?
To create uniform outputs you can call arrayInd explicitly on the output of apply(..., which.min), instead of implicitly as in which(..., arr.ind = TRUE). The fourth dimension indices still need to be added manually though:
## add 4D indices using 1D values
start.ind <- (seq_len(dims[4]) - 1) * prod(head(dims, 3))
arrayInd(apply(mat_rep, 4, which.min) + start.ind, .dim = dims)
#> [,1] [,2] [,3] [,4]
#> [1,] 1 1 1 1
#> [2,] 1 3 3 2
#> [3,] 1 3 2 3
#> [4,] 3 1 1 4
arrayInd(apply(mat_norep, 4, which.min) + start.ind, .dim = dims)
#> [,1] [,2] [,3] [,4]
#> [1,] 2 1 3 1
#> [2,] 1 2 1 2
#> [3,] 1 2 1 3
#> [4,] 2 2 3 4
## add 4D indices using cbind
cbind(arrayInd(apply(mat_rep, 4, which.min), .dim = head(dims, 3)), seq_len(dims[4]))
#> [,1] [,2] [,3] [,4]
#> [1,] 1 1 1 1
#> [2,] 1 3 3 2
#> [3,] 1 3 2 3
#> [4,] 3 1 1 4
cbind(arrayInd(apply(mat_norep, 4, which.min), .dim = head(dims, 3)), seq_len(dims[4]))
#> [,1] [,2] [,3] [,4]
#> [1,] 2 1 3 1
#> [2,] 1 2 1 2
#> [3,] 1 2 1 3
#> [4,] 2 2 3 4
Data
dims <- c(3,3,3,4)
mat_rep <- array(c(rep(0,3),sample(1:prod(dims))), dim = dims)
mat_norep <- array(sample(1:prod(dims)), dim = dims)
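To check the offset arithmetic without random data, here is a small sketch of the same arrayInd approach on a deterministic array. The array size and the two planted minima are assumptions made for this illustration only; the names arr, slab, lin and pos do not appear in the original answer.

```r
# Small 4D array with known minima (illustrative data, not from the post).
dims <- c(2, 2, 2, 3)
arr <- array(seq_len(prod(dims)), dim = dims)  # increasing values: 1, 2, ..., 24
arr[2, 1, 2, 2] <- 0L   # plant the minimum of slab 2
arr[1, 2, 1, 3] <- 0L   # plant the minimum of slab 3

# which.min gives a position inside each 3D slab; adding the slab offset
# turns it into a position in the full 4D array.
slab <- prod(dims[1:3])
lin <- apply(arr, 4, which.min) + (seq_len(dims[4]) - 1) * slab

# Convert the linear positions to one row of array indices per slab.
pos <- arrayInd(lin, .dim = dims)
print(pos)
print(arr[pos])   # the minimum value of each slab
```

Because which.min returns only the first minimum, the result always has exactly one row per slab, even when a slab contains ties.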

How to associate dimnames of an array with data frame index values in R?

I have the array A
A
,,A
[,1] [,2] [,3]
[1,] 3 7 8
[2,] 4 11 9
[3,] 2 12 4.3
,,B
[,1] [,2] [,3]
[1,] 31 7 8
[2,] 4.2 4 9.5
[3,] 1 1 7
,,C
[,1] [,2] [,3]
[1,] 4 71 8.3
[2,] 4 41 9
[3,] 11 0 73
,,D
[,1] [,2] [,3]
[1,] 7 7 8.3
[2,] 3 4.1 9
[3,] 1 0.5 73
dim(A)
3 3 4
dimnames(A)[3]
A B C D
and I have the data.frame df
df
X Y Z
2 1 A
3 2 D
I would like to add a new column to df containing the values of array A, looked up by the df values X (row of the array), Y (column of the array) and Z (third dimension). My expected result is:
df
X Y Z Res
2 1 A 4 # Res is the value of array A in A[2,1,"A"]
3 2 D 0.5 # Res is the value of array A in A[3,2,"D"]
I tried this code:
df$Res <- NA
if (df$Z == dimnames(A)[3]){
for (i in 1:nrow(df)){
df[i,4] <- A[df[i,1],df[i,2],df[i,3]]
}
}
But it doesn't work...
Any idea how to associate the dimnames of the array's third dimension with the data frame values?
P.S. This is a simple example. My true array is:
dim(A)
137 93 227
and
dim(df)
6080 3
P.S.2 I would prefer not to use merge or similar functions, because of allocation problems.
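One possible approach, sketched here under the assumption that every value of Z appears in the names of the third dimension, is to translate Z into a position with match() and then index A with a three-column matrix, one row per lookup. The array below is rebuilt from the printed slices in the question; the helper name k is made up for this example.

```r
# Rebuild the example array from the question (values entered column by column).
A <- array(
  c(3, 4, 2, 7, 11, 12, 8, 9, 4.3,       # slice "A"
    31, 4.2, 1, 7, 4, 1, 8, 9.5, 7,      # slice "B"
    4, 4, 11, 71, 41, 0, 8.3, 9, 73,     # slice "C"
    7, 3, 1, 7, 4.1, 0.5, 8.3, 9, 73),   # slice "D"
  dim = c(3, 3, 4),
  dimnames = list(NULL, NULL, c("A", "B", "C", "D"))
)
df <- data.frame(X = c(2, 3), Y = c(1, 2), Z = c("A", "D"))

# Position of each Z label along the third dimension.
k <- match(df$Z, dimnames(A)[[3]])

# A three-column index matrix picks one element of A per row of df.
df$Res <- A[cbind(df$X, df$Y, k)]
print(df)
```

This avoids both the loop and merge, since the lookup is a single vectorised indexing call.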

Randomly populate R arrays

I would like to be able to populate an array with different values at each array level.
Consider the array
x <- array(dim = c(2, 5, 2)) # 2 levels, 5 columns, 2 rows
x
, , 1
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] NA NA NA NA NA
, , 2
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] NA NA NA NA NA
I now generate random values to populate the levels of the array
y <- replicate(2, sample(1:10, 5, replace = FALSE))
y
[,1] [,2]
[1,] 6 5
[2,] 8 6
[3,] 9 3
[4,] 3 7
[5,] 2 9
What is the best way to go about randomly populating the first level of x (x[,,1]) with the first column of y (i.e., values 6, 8, 9, 3, 2) and likewise the second level of x (x[,, 2]) with the second column of y (i.e., 5, 6, 3, 7, 9)?
That is, the final result might be
, , 1
[,1] [,2] [,3] [,4] [,5]
[1,] 6 8 3 3 2
[2,] 9 6 3 2 8
, , 2
[,1] [,2] [,3] [,4] [,5]
[1,] 6 6 9 3 5
[2,] 7 5 7 9 3
This would be a straightforward task if each level consisted of the same value (e.g., both levels having values 6, 8, 9, 3, 2 at random), but for this purpose, each level of the array needs to contain a different subset of values.
Any simple solutions? I realize the 'abind' R package may work here, but I think there is an easier way forward.
You seem to have all the pieces in hand. You just need to put them together. Decide the size of the third dimension for your array, construct it so you can populate rather than grow, put together the set of values you'll use to populate the array, then populate it.
set.seed(123)
d3 <- 2
x <- array(dim = c(2, 5, d3))
y <- replicate(d3, sample(1:10, 5, replace = FALSE))
for (i in seq_len(d3)) x[,,i] <- sample(y[,i], 10, replace = TRUE)
x
# , , 1
#
# [,1] [,2] [,3] [,4] [,5]
# [1,] 6 7 3 8 8
# [2,] 4 4 6 3 6
#
# , , 2
#
# [,1] [,2] [,3] [,4] [,5]
# [1,] 3 4 4 8 5
# [2,] 4 3 4 8 1
y
# [,1] [,2]
# [1,] 3 1
# [2,] 8 5
# [3,] 4 8
# [4,] 7 4
# [5,] 6 3
You can simply recast a 10x2 matrix into a 2x5x2 array:
set.seed(2017)
y <- replicate(2, sample(1:10, 5, replace = FALSE))
array(apply(y, 2, function(x) sample(x, 10, replace = TRUE)), dim = c(2, 5, 2))
#, , 1
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 3 10 4 5 9
#[2,] 10 4 5 3 9
#
#, , 2
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 10 2 1 10 8
#[2,] 1 10 2 1 1
#y;
# [,1] [,2]
#[1,] 10 8
#[2,] 5 1
#[3,] 4 4
#[4,] 3 10
#[5,] 9 2
Explanation: apply(y, 2, ...) creates a 10x2 matrix where entries in column 1 are sampled from column 1 of y, and entries in column 2 are sampled from column 2 of y. We then simply recast the 10x2 matrix into an array of dimensions c(2, 5, 2).
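The recasting step can be checked with a short sketch. The seed and the intermediate name m are arbitrary choices for this illustration; the checks only confirm that each level of the array draws from the matching column of y.

```r
set.seed(1)  # arbitrary seed, used only to make the run repeatable
y <- replicate(2, sample(1:10, 5, replace = FALSE))

# 10 x 2 matrix: column j is resampled from column j of y.
m <- apply(y, 2, function(col) sample(col, 10, replace = TRUE))

# Recast: the first 10 values fill level 1, the next 10 fill level 2.
x <- array(m, dim = c(2, 5, 2))

print(dim(m))
print(all(x[, , 1] %in% y[, 1]))   # level 1 uses only column 1 of y
print(all(x[, , 2] %in% y[, 2]))   # level 2 uses only column 2 of y
```

The recast works because array() fills in column-major order, so each column of m lands in exactly one level.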

Apply column vector to each row of each matrix in array in R

If I have the array
> (arr = array(c(1,1,2,2), c(2,2,2)))
, , 1
[,1] [,2]
[1,] 1 2
[2,] 1 2
, , 2
[,1] [,2]
[1,] 1 2
[2,] 1 2
then how could I apply a column vector, say c(3,3), to each row of each matrix and sum these up? So essentially, I need to do 4 * c(1,2) %*% c(3,3). Could an apply function be used here?
Thanks everyone for the help! I believe the correct method is
sum(apply(arr, c(1,3), function(x) x %*% c(1,2,3)))
Here we take the dot product of the vector c(1,2,3) with each row of each matrix in the array arr and sum the results. Note that I changed the array to be
arr = array(c(1,2,3,4,5,6,7,8,9,10,11,12), c(2,3,2))
arr
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
and the vector we take the dot product with is now c(1,2,3) instead of c(3,3) as in the original post.
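As a quick check of the apply() call, the sketch below runs it on both arrays from this thread and compares it with an equivalent shortcut: since the final result is a single sum, the column totals can be weighted by the vector directly. The helper name row_dot_sum is made up for this example.

```r
# Sum of the dot products of v with every row of every matrix in arr.
row_dot_sum <- function(arr, v) {
  sum(apply(arr, c(1, 3), function(r) sum(r * v)))
}

# Array and vector from the original question.
arr1 <- array(c(1, 1, 2, 2), c(2, 2, 2))
res1 <- row_dot_sum(arr1, c(3, 3))

# Array and vector from the self-answer.
arr2 <- array(1:12, c(2, 3, 2))
res2 <- row_dot_sum(arr2, c(1, 2, 3))

# Shortcut: total each column over rows and levels, then weight by v.
alt2 <- sum(apply(arr2, 2, sum) * c(1, 2, 3))

print(c(res1, res2, alt2))
```

The first result matches the 4 * c(1,2) %*% c(3,3) calculation described in the question.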
EDITED
This should give you the answer:
l <- list(matrix(c(1,1,2,2), ncol = 2),
          matrix(c(1,1,2,2), ncol = 2))
l
#[[1]]
# [,1] [,2]
# [1,] 1 2
# [2,] 1 2
#
# [[2]]
# [,1] [,2]
# [1,] 1 2
# [2,] 1 2
ivector <- c(3, 3) # the vector that is multiplied with the rows of each list element
# apply over all list elements
res <- lapply(l, function(x, ivector){
  # apply to all rows of the matrices
  apply(x, 1, function(rowel, ivector){
    return(sum(rowel %*% ivector))
  }, ivector = ivector)
}, ivector = ivector)
res
#[[1]]
#[1] 9 9
#
#[[2]]
#[1] 9 9
# And finally sum up the results:
sum(Reduce("+", res))
#[1] 36
Does that help?

equalizing matrices for combining in abind R

I believe that a similar question was asked here, but I can't seem to find it anymore.
I have two matrices with different dimensions, and I want to equalise them so that I can combine them in an array.
For example, I have the following two matrices:
a <- matrix(1:6, 3, 2)
b <- matrix(1:12, 4, 3)
a
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
b
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
Because I am working with time series data, I would like the added rows/columns to contain NAs. In my example, matrix a would get an extra column and an extra row containing only NAs, like this:
[,1] [,2] [,3]
[1,] 1 4 NA
[2,] 2 5 NA
[3,] 3 6 NA
[4,] NA NA NA
In my dataset I will have 79 matrices with unequal dimensions, and I need to make them as big as the matrix with the largest dimensions.
If b is the largest matrix, you can create a matrix with the same dimensions as b, filled with NA, and replace the rows and columns corresponding to the smaller matrix a with the values of a:
a2 <- "[<-"(x = matrix(NA, nrow = nrow(b), ncol = ncol(b)),
i = 1:nrow(a), j = 1:ncol(a),
value = a)
a2
# [,1] [,2] [,3]
# [1,] 1 4 NA
# [2,] 2 5 NA
# [3,] 3 6 NA
# [4,] NA NA NA
Example with several matrices, where we find the largest matrix and pad all matrices with NA to match the dimension of the largest.
# create some matrices of different size
a <- matrix(1:6, nrow = 3, ncol = 2)
b <- matrix(1:12, nrow = 4, ncol = 3)
c <- matrix(1:4, nrow = 2, ncol = 2)
# put them in a list
l <- list(a, b, c)
# index of largest (here, max number of rows) matrix in the list
id <- which.max(unlist((lapply(l, nrow))))
# pad matrices with NA
l2 <- lapply(l, function(x){
x <- "[<-"(x = matrix(NA, nrow = nrow(l[[id]]), ncol = ncol(l[[id]])),
i = 1:nrow(x), j = 1:ncol(x),
value = x)
})
l2
# [[1]]
# [,1] [,2] [,3]
# [1,] 1 4 NA
# [2,] 2 5 NA
# [3,] 3 6 NA
# [4,] NA NA NA
#
# [[2]]
# [,1] [,2] [,3]
# [1,] 1 5 9
# [2,] 2 6 10
# [3,] 3 7 11
# [4,] 4 8 12
#
# [[3]]
# [,1] [,2] [,3]
# [1,] 1 3 NA
# [2,] 2 4 NA
# [3,] NA NA NA
# [4,] NA NA NA
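The example above picks the target size from the matrix with the most rows, which works when that matrix is also the widest. A minimal sketch of a variant that takes the maximum number of rows and columns separately, and then stacks the padded matrices into one array with base R, might look like this. The helper pad_to and the sample matrices are assumptions for illustration, not part of the original answer.

```r
# Pad a matrix with NA up to nr rows and nc columns (hypothetical helper).
pad_to <- function(m, nr, nc) {
  out <- matrix(NA, nrow = nr, ncol = nc)
  out[seq_len(nrow(m)), seq_len(ncol(m))] <- m
  out
}

# Sample matrices: the tallest one (4 x 3) is not the widest one (2 x 5).
mats <- list(matrix(1:6, 3, 2), matrix(1:12, 4, 3), matrix(1:10, 2, 5))

# Target size: largest row count and largest column count over all matrices.
nr <- max(vapply(mats, nrow, integer(1)))
nc <- max(vapply(mats, ncol, integer(1)))

padded <- lapply(mats, pad_to, nr = nr, nc = nc)

# Stack the equally sized matrices along a third dimension.
stacked <- array(unlist(padded), dim = c(nr, nc, length(padded)))
print(dim(stacked))
print(stacked[, , 1])
```

Once all matrices share the same dimensions, they can equally be passed to abind if that package is preferred.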
As you only want to extend the small matrix with NA, we can use a simple approach such as:
create a matrix as big as b, with only NA. Code:
extended.a = matrix(NA,nrow(b),ncol(b))
fill this matrix with the values from a. Code:
extended.a[cbind(rep(1:nrow(a),ncol(a)), rep(1:ncol(a),each=nrow(a)))] = a
edit:
As per Roland's suggestion, you can also get the vector of indices with which(..., arr.ind=TRUE).
For example, which(TRUE | a, arr.ind=TRUE)
Or even: which(matrix(TRUE,nrow(a),ncol(a)), arr.ind=TRUE)
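Or far better, using the expand.grid function: as.matrix(expand.grid(1:nrow(a), 1:ncol(a))) (as.matrix() is needed because expand.grid returns a data frame rather than an index matrix)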
Maybe we can try the following base R option
> replace(NA + b, cbind(c(row(a)), c(col(a))), a)
[,1] [,2] [,3]
[1,] 1 4 NA
[2,] 2 5 NA
[3,] 3 6 NA
[4,] NA NA NA
