How to read multiple files into a multi-dimensional array


I want to create a three-dimensional array.
Here is what I tried:
z<-c(160,720,420)
first_data_set <-array(dim = length(file_1), dimnames = z)
Each file that I read holds a single layer of data (only x and y).
There are other files in the same format, and I need to put them into the same array as the first one, so that once I have finished reading, all the data sit in one array and nothing has been overwritten.
So I think the array has to be three-dimensional; otherwise I cannot keep all the data that I read in the loop.

Say that you have two matrices of size 3x4:
m1 <- matrix(rnorm(12), nrow = 3, ncol = 4)
m2 <- matrix(rnorm(12), nrow = 3, ncol = 4)
If you want to place them in an array, first make an array of NA's:
A <- array(as.numeric(NA), dim = c(3,4,2))
Then populate the layers with data:
A[,,1] <- m1
A[,,2] <- m2
As suggested by @Justin, you could also just put the matrices together in a list:
A2 <- list()
A2[['m1']] <- m1
A2[['m2']] <- m2
To read matrices from files: using a list makes it easier to load them from a directory without having to specify the dimensions in advance. Assume you want all files with the extension csv:
# pattern is a regular expression, so escape the dot and anchor it at the end
myfiles <- dir(pattern = "\\.csv$")
for (i in seq_along(myfiles)) {
  # read.table returns a data frame; wrap it in as.matrix() if you need a matrix
  A2[[myfiles[i]]] <- read.table(myfiles[i], sep = ',')
}
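If the files all have the same dimensions, the list can afterwards be stacked into the 3D array the question asks for. The following is only a sketch: the temporary directory, the file names file_1.csv and file_2.csv, and the 3x4 size are made up for illustration.

```r
# Hypothetical setup: write two small 3x4 CSV files to a temporary directory
tmp <- tempfile("csv_demo_")
dir.create(tmp)
set.seed(1)
for (k in 1:2) {
  write.table(matrix(rnorm(12), nrow = 3),
              file.path(tmp, paste0("file_", k, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Read every CSV file into a list of numeric matrices
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
mats <- lapply(files, function(f) as.matrix(read.table(f, sep = ",")))

# Stack the matrices along a third dimension, one layer per file
A3 <- array(unlist(mats),
            dim = c(dim(mats[[1]]), length(mats)),
            dimnames = list(NULL, NULL, basename(files)))
dim(A3)

unlink(tmp, recursive = TRUE)  # remove the temporary files
```

Each file ends up in its own layer, so A3[, , "file_1.csv"] returns the first matrix and nothing is overwritten.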

Related

Is there any function that calculate correlation between a set of matrices included in an array in R?

I have a list that contains 20 matrices. I want to calculate Pearson's correlation between all pairs of matrices, but I cannot find any code or function for this. Could you please give me some tips?
something like:
a=matrix(1:8100, ncol = 90)
b=matrix(8100:16199, ncol = 90)
c=matrix(sample(16200:24299),ncol = 90)
z=list(a,b,c)
I found this:
https://rdrr.io/cran/lineup/man/corbetw2mat.html and tried it:
library(lineup)
corbetw2mat(z[a], z[b], what = "all")
I got the following error:
Error in corbetw2mat(z[a], z[b], what = "all") :
(list) object cannot be coerced to type 'double'
I want a list like this for the result:
a & b
correlations
a & c
correlations
b & c
correlations
Thanks

I will create a smaller data set to illustrate the solution below.
To get pairwise combinations, the best option is to compute a matrix of combinations with combn and then loop over its columns, in this case with lapply.
set.seed(1234) # Make the results reproducible
a <- matrix(1:9, ncol = 3)
b <- matrix(rnorm(9), ncol = 3)
c <- matrix(sample(1:9), ncol = 3)
sample_list <- list(a, b, c)
cmb <- combn(3, 2)
res <- lapply(seq.int(ncol(cmb)), function(i) {
  cor(sample_list[[ cmb[1, i] ]], sample_list[[ cmb[2, i] ]])
})
The results are in the list res.
Note that sample is a base R function, so I changed the name to sample_list.
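To get the "a & b" style labels the question asks for, one option is to run combn over the names of a named list. This is a sketch only; the names mats, pairs and res_named are chosen here for illustration. Keep in mind that cor() on two matrices returns the matrix of correlations between their columns.

```r
set.seed(1234)  # make the results reproducible
mats <- list(a = matrix(1:9, ncol = 3),
             b = matrix(rnorm(9), ncol = 3),
             c = matrix(sample(1:9), ncol = 3))

# All unordered pairs of list names, as a list of length-2 character vectors
pairs <- combn(names(mats), 2, simplify = FALSE)

# Column-wise Pearson correlations for each pair of matrices
res_named <- lapply(pairs, function(p) cor(mats[[p[1]]], mats[[p[2]]]))

# Label each result with the pair it belongs to
names(res_named) <- vapply(pairs, paste, character(1), collapse = " & ")
names(res_named)
```

A single result can then be pulled out by label, for example res_named[["a & c"]].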

R Accessing vector inside list inside Array

I have a very large array (1955x2417x1) in R where each position stores a list of two vectors (named "max" and "min") of length 5.
I would like to find a simple way to create a multidimensional array (dim 1955x2417x5) where each position holds a single value from the vector "max".
I have looked at answers such as "array of lists in r", but so far without success.
I know I can access the list in each position of the array using
myarray[posX, PosY][[1]][["max"]]
but how do I apply that to the whole array?
So far I have tried
newArray <- array( unlist(myarray[][[1]][["max"]]), c(1955, 2417, 5))
and
NewArray <-parApply(cl, myarray, c(1:2), function(x) {
a=x[[1]][["max"]]
} )
but the results are not right.
Do you have any suggestion?

Let
e <- list(min = 1:3, max = 4:6)
arr <- array(list(e)[rep(1, 8)], c(2, 4))
dim(arr)
# [1] 2 4
Then one option is
res <- apply(arr, 1:2, function(x) x[[1]][["max"]])
dim(res)
# [1] 3 2 4
and, if the order of dimensions matters,
dim(aperm(res, c(2, 3, 1)))
# [1] 2 4 3
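To see that the values end up where expected, here is a small check in which every cell holds a different "max" vector. The cell contents are invented for the test: cell k stores max = 10 * k + 0:2.

```r
# Hypothetical data: 8 cells, each a list with distinct "min" and "max" vectors
cells <- lapply(1:8, function(k) list(min = k + 0:2, max = 10 * k + 0:2))
arr <- array(cells, c(2, 4))  # list-array; cell [i, j] is cells[[i + 2 * (j - 1)]]

# apply() puts the extracted vector in the first dimension: 3 x 2 x 4
res <- apply(arr, 1:2, function(x) x[[1]][["max"]])

# Move that dimension to the end so the layout is row x column x value
out <- aperm(res, c(2, 3, 1))
dim(out)

# The values at position [2, 3] match the "max" vector stored in that cell
all.equal(out[2, 3, ], arr[[2, 3]]$max)
```

After the permutation, out[i, j, ] is the "max" vector that was stored at arr[[i, j]].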

Optimizing function speed on 3D array

I am applying a user-defined function to individual cells of a 3D array. The contents of each cell are one of the following possibilities, all of which are character vectors because of prior formatting:
"N"
"A"
""
"1"
"0"
I want to create a new 3D array of the same dimensions, where cells contain either NA or a numeric vector containing 1 or 0. Thus, I wrote a function named Numericize and used aaply to apply it to the entire array. However, it takes forever to apply it.
Numericize <- function(x){
if(!is.na(x)){
x[x=="N"] <- NA; x
x[x=="A"] <- NA; x
x[x==""] <- NA; x
x <- as.integer(x)
}
return(x)
}
The dimensions of the original array are 480x866x366. The function takes forever to apply using the following code:
Final.Daily.Array <- aaply(.data = Complete.Daily.Array,
.margins = c(1,2,3),
.fun = Numericize,
.progress = "text")
I am unsure if the speed issue comes from an inefficient Numericize, an inefficient aaply, or something else entirely. I considered trying to set up parallel computing using the plyr package but I wouldn't think that such a simple command would require parallel processing.
On the one hand I am concerned that I created a stack overflow for myself, but on the other hand I have applied other functions to similar arrays without problems.
ex.array <- array(dim = c(3,3,3))
ex.array[,,1] <- c("N","A","","1","0","N","A","","1")
ex.array[,,2] <- c("0","N","A","","1","0","N","A","")
ex.array[,,3] <- c("1","0","N","A","","1","0","N","A")
desired.array <- array(dim = c(3,3,3))
desired.array[,,1] <- c(NA,NA,NA,1,0,NA,NA,NA,1)
desired.array[,,2] <- c(0,NA,NA,NA,1,0,NA,NA,NA)
desired.array[,,3] <- c(1,0,NA,NA,NA,1,0,NA,NA)
ex.array
desired.array
Any suggestions?

You can just use a vectorized approach:
ex.array[ex.array %in% c("", "N", "A")] <- NA
storage.mode(ex.array) <- "integer"
In fact the second line alone is enough: any string that cannot be converted to an integer becomes NA by coercion (R will typically emit a warning about this).
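As a self-contained check, the vectorized approach can be run on a small character array and compared against the input. This is a sketch: the array ex is a made-up 3x3x3 example that simply recycles the five possible cell values.

```r
# Hypothetical 3x3x3 character array built by recycling the five possible values
ex <- array(c("N", "A", "", "1", "0"), dim = c(3, 3, 3))

out <- ex
out[out %in% c("", "N", "A")] <- NA   # one vectorized replacement, no per-cell calls
storage.mode(out) <- "integer"        # convert in place; dimensions are preserved

dim(out)
table(out, useNA = "ifany")
```

Both steps operate on the whole array at once, which is why this avoids the per-cell function call overhead of aaply with .margins = c(1,2,3).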

R Multiply second dimension of 3D Array by a Vector for each of the 3rd dimension

I am trying to multiply each slice along the second dimension of a 3D array by the corresponding element of a vector, but my array gets converted to a matrix and things get squirrelly. I can only do the proper multiplication longhand.
What a mouthful...
It's easier to explain with code...
Arr <- array(runif(10*5*3), dim = c(10,5,3))
dim(Arr)
Vect <- c(1:5)
Arr[,1,1] <- Arr[,1,1]*Vect[1]
Arr[,1,2] <- Arr[,1,2]*Vect[1]
Arr[,1,3] <- Arr[,1,3]*Vect[1]
Arr[,2,1] <- Arr[,2,1]*Vect[2]
Arr[,2,2] <- Arr[,2,2]*Vect[2]
Arr[,2,3] <- Arr[,2,3]*Vect[2]
Arr[,3,1] <- Arr[,3,1]*Vect[3]
Arr[,3,2] <- Arr[,3,2]*Vect[3]
Arr[,3,3] <- Arr[,3,3]*Vect[3]
Arr[,4,1] <- Arr[,4,1]*Vect[4]
Arr[,4,2] <- Arr[,4,2]*Vect[4]
Arr[,4,3] <- Arr[,4,3]*Vect[4]
Arr[,5,1] <- Arr[,5,1]*Vect[5]
Arr[,5,2] <- Arr[,5,2]*Vect[5]
Arr[,5,3] <- Arr[,5,3]*Vect[5]
How do I clean this up to be one command?

Try:
sweep(Arr,2,Vect,FUN="*")
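A quick way to convince yourself that sweep does the same thing as the longhand version is to compare it with an explicit loop over the second dimension. This is a sketch; the names swept and looped are chosen here for illustration.

```r
set.seed(42)
Arr <- array(runif(10 * 5 * 3), dim = c(10, 5, 3))
Vect <- 1:5

# One call: multiply slice j of the second dimension by Vect[j]
swept <- sweep(Arr, 2, Vect, FUN = "*")

# Longhand equivalent, written as a loop instead of fifteen assignments
looped <- Arr
for (j in seq_along(Vect)) {
  looped[, j, ] <- Arr[, j, ] * Vect[j]
}

all.equal(swept, looped)
```

The second argument of sweep is the margin, so 2 means that Vect is matched against the second dimension of Arr.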

Cast Vect into an array first, then element multiply:
varr <- aperm(array(Vect, dim = c(5L, 10L, 3L)), perm = c(2L, 1L, 3L))
Arr <- varr * Arr
(of course we don't need to store varr if you want this in one command)
(also, turns out this is basically what sweep does under the hood...)

The aaply() function from the plyr package can also do this. It operates on arrays of any dimension and splits them along the margins you choose. Here we split by rows, so each piece is a 5x3 matrix whose columns are multiplied elementwise by Vect:
library(plyr)
Arr2 <- aaply(Arr, 1, function(x,y){x*y}, Vect)

We can also replicate 'Vect' and multiply it with 'Arr'. col() is a convenient function that gives the column index of each element of a matrix.
res1 <- Arr * Vect[col(Arr[,,1])]
Or we can do the rep explicitly:
res2 <- Arr * rep(Vect, each = dim(Arr)[1])
identical(res1, res2)
#[1] TRUE

Transform two arrays in to one data frame in R

I have two arrays coming from a PostgreSQL database, as follows.
iarray
{9.467182035,9.252423958,9.179368178,9.142931845,9.118895803,9.098669713,9.093398102,9.092035392,9.091328028,9.090594437,9.090000456,9.089253543......keeps going on}
varray
{-1.025945126,-0.791203874,-0.506481774,-0.255416444,-0.028424464,0.188855034,0.390787963,0.579327969,0.761521769 ...keeps going on}
Both arrays have an equal number of entries. I want to convert them to a data frame so that I can draw a graph of i over v.
How should I proceed?
I tried n<-gsub("^\\{+(.+)\\}+$", '\\1', iarray) to remove the {} and
n2 <- strsplit(n, ",") to split on the commas.

Assuming you are getting iarray and varray as strings:
iarray <- "{9.467182035,9.252423958,9.179368178,9.142931845}"
varray <- "{-1.025945126,-0.791203874,-0.506481774,-0.255416444}"
# Strip the surrounding braces, split on commas, and convert to numeric
# (without as.numeric the columns would hold strings, which cannot be plotted directly)
n <- gsub("^\\{+(.+)\\}+$", '\\1', iarray)
n1 <- as.numeric(unlist(strsplit(n, ",")))
df <- data.frame(n1)
n <- gsub("^\\{+(.+)\\}+$", '\\1', varray)
n2 <- as.numeric(unlist(strsplit(n, ",")))
df <- cbind(df, n2)

This seems to be one of the few occasions where eval(parse()) is appropriate:
df <- list(iarray, varray)
df <- data.frame(lapply(df,
  function(x) eval(parse(text = sub("\\}$", ")", sub("^\\{", "c(", x))))
))
names(df) <- c("iarray", "varray")
We just replace the opening { with c( and the closing } with ), so that iarray and varray become R expressions that create vectors, which we then parse and eval.
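If you would rather not evaluate strings coming from a database, base R's scan() can read the numbers directly through its text argument. This is a sketch under the assumption that the arrays are flat, contain only numbers, and have no NULL entries; parse_pg_array is a helper name made up for this example.

```r
iarray <- "{9.467182035,9.252423958,9.179368178,9.142931845}"
varray <- "{-1.025945126,-0.791203874,-0.506481774,-0.255416444}"

# Hypothetical helper: drop the braces, then let scan() read comma-separated doubles
parse_pg_array <- function(x) {
  scan(text = gsub("[{}]", "", x), sep = ",", quiet = TRUE)
}

df <- data.frame(i = parse_pg_array(iarray), v = parse_pg_array(varray))
str(df)
```

Both columns are numeric, so plot(i ~ v, data = df) can be used to draw i over v.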
