Non-conformable arrays error in R - arrays

numberofusers=75000
numberofitems=65000
number.of.factors=10
# N is a numberofusers*numberofitems sparse Matrix (loaded from a dataset).
#X,Y matrices are already available and have dimensions
# (numberofusers,number.of.factors) and
#(numberofitems,number.of.factors) respectively
ptempuser<-rep(0,numberofitems)
tempuser<-rep(0,numberofitems)
Y.big<-t(Y)%*%Y
for (i in 1:numberofusers) {
matrixproduct1 <- matrix(0,numberofitems,number.of.factors)
nonzerolistforthatuser <- which(N[i,]!=0)
tempuser[nonzerolistforthatuser] <- alpha*N[i,nonzerolistforthatuser]
ptempuser[nonzerolistforthatuser] <- 1
matrixproduct1[nonzerolistforthatuser,] <-tempuser[nonzerolistforthatuser]*Y[nonzerolistforthatuser,]
finalproductmatrix1 <- matrix(0,number.of.factors,number.of.factors)
finalproductmatrix1 <- t(Y)[,nonzerolistforthatuser] %*% matrixproduct1[nonzerolistforthatuser,]
tempuser <- 1+tempuser
matrixproduct2 <- t(Y)
matrixproduct2[,nonzerolistforthatuser] <- t(Y)[,nonzerolistforthatuser]*tempuser[nonzerolistforthatuser]
Agen<-Y.big + finalproductmatrix1
dim1<-dim(Y.big)
dim2<-dim(finalproductmatrix1)
if(dim1[1]!=dim2[1]){
print(i)
print(dim1[1])
print(dim2[1])
}
if(dim1[2]!=dim2[2]){
print(i)
print(dim1[2])
print(dim2[2])
}
finalproductmatrix2 <- matrixproduct2[,nonzerolistforthatuser] %*% cbind(ptempuser[nonzerolistforthatuser])
X[i,] <- (ginv(Y.big+finalproductmatrix1+diag(rep(lambda,number.of.factors))))%*%(finalproductmatrix2)
}
I get the error 'Error in Y.big + finalproductmatrix1 : non-conformable arrays'. However, I also tried running Agen <- Y.big + finalproductmatrix1 inside the function, and that causes no problem, so I assumed the dimensions were not the issue. Still, I get the non-conformable error.
Please tell me what to do. I have been stuck on this for hours. I have also added the dimension checks, and they print nothing, so I am confused.
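One possible cause, offered as a guess rather than a confirmed diagnosis: when a user has exactly one nonzero item, single-index subsetting such as t(Y)[, idx] drops the result to a plain vector, and %*% on two equal-length vectors returns a 1 x 1 inner product instead of a factors-by-factors matrix. Adding that to Y.big would then fail on the Agen line, before the dimension checks are reached. The small sizes and the name idx below are made up for illustration.

```r
set.seed(1)
n_items <- 5
n_factors <- 2
Y <- matrix(rnorm(n_items * n_factors), n_items, n_factors)
Y.big <- t(Y) %*% Y                      # n_factors x n_factors

idx <- 3L                                # assumption: a user with a single nonzero item

# Without drop = FALSE both operands collapse to vectors,
# so %*% computes an inner product.
collapsed <- t(Y)[, idx] %*% Y[idx, ]
dim(collapsed)

# With drop = FALSE the matrix shapes survive.
kept <- t(Y)[, idx, drop = FALSE] %*% Y[idx, , drop = FALSE]
dim(kept)

# Adding the collapsed product to Y.big raises an error.
result <- try(Y.big + collapsed, silent = TRUE)
inherits(result, "try-error")
```

If this is what happens in the loop, adding drop = FALSE to every subset that feeds %*% should keep the shapes consistent for users with any number of nonzero items.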

Related

R: Array changing type in a loop

I have created a 3D array and want to fill it with data from two other data.frames.
Those data.frames have different colnames and rownames, so sometimes a NULL pops out when I address a non-existent cell. Both data.frames hold lists of 'lm' output in their cells.
But the problem is I keep getting this error:
Error in diff_models[i, j, "cont"] <- cont :
incorrect number of subscripts
I have also noticed that upon creation "diff_models" has a logical type (also strange, by the way), but by the time the error appears it has become a list. So I guess the problem is that a list has no [i,j,'cont'] element. But why does the loop change the type of "diff_models"?
cont_col <- colnames(temp1)
cont_row <- rownames(temp1)
dis_col <- colnames(temp2)
dis_row <- rownames(temp2)
cols <- unique(c(cont_col,dis_col))
rows <- unique(c(cont_row,dis_row))
diff_models <- array(NA, c(length(rows),length(cols),2), dimnames =
list('predictor'=rows,'response'=cols, 'condition'=c('dis','cont')))
for (j in cols) {
for (i in rows) {
cont <- cont_models[i,j]
dis <- dis_models[i,j]
diff_models[i,j,"dis"] <- ifelse(is.null(dis),NA,dis)
diff_models[i,j,"cont"] <- ifelse(is.null(cont),NA,cont)
}
}
Using
diff_models[i][j]["dis"] <- ifelse(is.null(dis),NA,dis)
diff_models[i][j]["cont"] <- ifelse(is.null(cont),NA,cont)
does not produce an error but turns "diff_models" into an empty list.
Saving numerics into the array, however, works perfectly well.
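A minimal sketch of one way to keep model objects in a 3D structure, assuming the goal is one fitted model per cell: create the array as a list from the start and assign single cells with [[ ]]. The toy models and dimnames below are invented and are not the poster's data.

```r
# Two toy models standing in for the 'lm' objects in the question.
fit_a <- lm(mpg ~ wt, data = mtcars)
fit_b <- lm(mpg ~ hp, data = mtcars)

rows  <- c("wt", "hp")
cols  <- c("mpg")
conds <- c("dis", "cont")

# A list with a dim attribute: every cell can hold an arbitrary object.
models <- array(
  vector("list", length(rows) * length(cols) * length(conds)),
  dim = c(length(rows), length(cols), length(conds)),
  dimnames = list(predictor = rows, response = cols, condition = conds)
)

# [[ ]] stores one object in one cell; untouched cells stay NULL.
models[["wt", "mpg", "dis"]]  <- fit_a
models[["hp", "mpg", "cont"]] <- fit_b

dim(models)
class(models[["wt", "mpg", "dis"]])
is.null(models[["hp", "mpg", "dis"]])
```

Cells with no model can simply be left unassigned and detected later with is.null().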

retain array class when operation results in 2-dimensional matrix

I have an array that can have one or more pages or sheets (my names for the third dimension). I am attempting to perform operations on the array. When there is only one sheet or page the result of the operation is a matrix. I would like the result to be an array. Is there a way to retain the class array even when the result of the operation has only 1 sheet or page?
Here is an example. I would like my.var.2 and my.var.3 to be arrays. The variable my.pages is set to 1 here, which seems to be causing the problem. However, my.pages can be >1. If my.pages <- 2 then my.var.2 and my.var.3 are arrays.
set.seed(1234)
my.rows <- 10
my.columns <- 4
my.pages <- 1
my.var.1 <- array( rnorm((my.rows*my.columns*my.pages), 10, 2),
c(my.rows,my.columns,my.pages))
my.var.1
my.var.2 <- 2 * my.var.1[,-my.columns,]
my.var.3 <- 10 * my.var.1[,-1,]
class(my.var.2)
class(my.var.3)
my.var.2 <- as.array(my.var.2)
my.var.3 <- as.array(my.var.3)
class(my.var.2)
class(my.var.3)
my.var.2 <- as.array( 2 * my.var.1[,-my.columns,])
my.var.3 <- as.array(10 * my.var.1[,-1,] )
class(my.var.2)
class(my.var.3)
The switch to matrix causes problems when I try to use my.var.1 and my.var.2 in nested for-loops.
The following if statement seems to solve the problem, but also seems a little clunky. Is there a more elegant solution?
if(my.pages == 1) {my.var.2 <- array(my.var.2, c(my.rows,(my.columns-1),my.pages))}
From help("["):
Usage:
x[i, j, ... , drop = TRUE]
...
drop: For matrices and arrays. If 'TRUE' the result is coerced to
the lowest possible dimension (see the examples). This only
works for extracting elements, not for the replacement. See
'drop' for further details.
Your code, revisited:
set.seed(1234)
my.rows <- 10
my.columns <- 4
my.pages <- 1
my.var.1 <- array( rnorm((my.rows*my.columns*my.pages), 10, 2),
c(my.rows,my.columns,my.pages))
my.var.2 <- 2 * my.var.1[,-my.columns,,drop=FALSE]
my.var.3 <- 10 * my.var.1[,-1,,drop=FALSE]
class(my.var.2)
## [1] "array"
class(my.var.3)
## [1] "array"

creating an array where every element is a list of varying lengths in R

I wish to create an array. Each element will be assigned to be a list, and each list will be of a different length (unknown before the script is executed). A simple example would be to let a[1] be the list q and a[2] be the list w. Is there a construct that I can use, perhaps different from array, that would allow for such assignments?
q <- c(1,2,3,4,5)
w <- c(6,7,8)
a <- array(2)
a[1] <- q
Warning message:
In a[1] <- q : number of items to replace is not a multiple of replacement length
Since you want an array of lists, try:
a[1] <- list(q)
As has been pointed out in the comments, you are likely looking for list and not array (the latter being more akin to a multi-dimensional matrix or mathematical vector.)
However, in addition to that there is the issue of indexing:
In R there is a major difference between a[1] <- q and a[[1]] <- q
Try the following to spot the difference:
a <- list()
a[[1]] <- q
a[[2]] <- w
a
Compare with
a <- list()
a[1] <- q
a[2] <- w
a
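To make the contrast concrete, here is a small sketch that compares the resulting element lengths. The names b and c_single are introduced only for this illustration.

```r
q <- c(1, 2, 3, 4, 5)
w <- c(6, 7, 8)

# [[ ]] stores each whole vector as one list element.
a <- list()
a[[1]] <- q
a[[2]] <- w
lengths(a)

# Wrapping the vector in list() makes single-bracket assignment equivalent.
b <- list()
b[1] <- list(q)
identical(a[[1]], b[[1]])

# Single-bracket assignment of the bare vector keeps only its first value
# (R warns about the length mismatch, silenced here).
c_single <- list()
suppressWarnings(c_single[1] <- q)
length(c_single[[1]])
```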
I think what you want is a list of vectors.
q <- c(1,2,3,4,5)
w <- c(6,7,8)
a <- list()
a[[1]] <- q
list works, thanks! It allows me to partition an array of positions into a list of vectors according to some separation cutoff.
delta <- 200
pcls <- list(nrow=pctot)
v <- posvec[1]
pcind <- 0
jtest <- 0
for (j in 2:nr) {
dist <- posvec[j]-posvec[j-1]
if (dist <= delta) {
v <- c(v,posvec[j])
jtest <- 1
}
if (dist > delta) {
if (jtest > 0) {
pcind <- pcind + 1
pcls[[pcind]] <- v
v <- posvec[j]
}
jtest <- 0
}
}
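The same partition can be sketched without an explicit loop, assuming posvec is sorted in increasing order. Unlike the loop above, this version also keeps groups that contain a single position, as well as the final group. The posvec values here are made up.

```r
posvec <- c(100, 150, 260, 900, 1000, 1500, 3000, 3100, 3150)
delta <- 200

# A new group starts wherever the gap to the previous position exceeds delta.
gaps <- diff(posvec)
group_id <- cumsum(c(TRUE, gaps > delta))

# One list element per group, each of whatever length the data dictate.
pcls <- unname(split(posvec, group_id))
lengths(pcls)
```

To mimic the loop's behaviour of skipping isolated positions, the result can be filtered with pcls[lengths(pcls) > 1].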

lapply and rbind not properly appending the results

SimNo <- 10
for (i in 1:SimNo){
z1<-rnorm(1000,0,1)
z2<-rnorm(1000,0,1)
z3<-rnorm(1000,0,1)
z4<-rnorm(1000,0,1)
z5<-rnorm(1000,0,1)
z6<-rnorm(1000,0,1)
X<-cbind(z1,z2,z3,z4,z5,z6)
sx<-scale(X)/sqrt(999)
det1<-det(t(sx)%*%sx)
detans<-do.call(rbind,lapply(1:SimNo, function(x) ifelse(det1<1,det1,0)))
}
When I run all the commands within the loop except the last one, I get different values of the determinant, but when I run the whole loop at once, I get the last value of the determinant repeated for every iteration.
Please help me understand how to handle situations like this.
Is there a shorter, more efficient way to write this code, so that each individual variable can still be accessed?
Whenever you are repeating the same operation multiple times, and without inputs, think about using replicate. Here you can use it twice:
SimNo <- 10
det1 <- replicate(SimNo, {
X <- replicate(6, rnorm(1000, 0, 1))
sx <- scale(X) / sqrt(999)
det(t(sx) %*% sx)
})
detans <- ifelse(det1 < 1, det1, 0)
Otherwise, this is what your code should have looked like with your for loop. You needed to create a vector for storing your outputs at each loop iteration:
SimNo <- 10
detans <- numeric(SimNo)
for (i in 1:SimNo) {
z1<-rnorm(1000,0,1)
z2<-rnorm(1000,0,1)
z3<-rnorm(1000,0,1)
z4<-rnorm(1000,0,1)
z5<-rnorm(1000,0,1)
z6<-rnorm(1000,0,1)
X<-cbind(z1,z2,z3,z4,z5,z6)
sx<-scale(X)/sqrt(999)
det1<-det(t(sx)%*%sx)
detans[i] <- ifelse(det1<1,det1,0)
}
Edit: you asked in the comments how to access X using replicate. You would have to make replicate create and store all your X matrices in a list. Then use the *apply family of functions to loop through that list to finish the computations:
X <- replicate(SimNo, replicate(6, rnorm(1000, 0, 1)), simplify = FALSE)
det1 <- sapply(X, function(x) {
sx <- scale(x) / sqrt(999)
det(t(sx) %*% sx)
})
detans <- ifelse(det1 < 1, det1, 0)
Here, X is now a list of matrices, so you can get e.g. the matrix for the second simulation by doing X[[2]].
SimNo <- 10
matdet <- matrix(data=NA, nrow=SimNo, ncol=1, byrow=TRUE)
for (i in 1:SimNo){
z1<-rnorm(1000,0,1)
z2<-rnorm(1000,0,1)
z3<-rnorm(1000,0,1)
z4<-rnorm(1000,0,1)
z5<-rnorm(1000,0,1)
z6<-rnorm(1000,0,1)
X<-cbind(z1,z2,z3,z4,z5,z6)
sx<-scale(X)/sqrt(999)
det1<-det(t(sx)%*%sx)
matdet[i] <- ifelse(det1<1,det1,0)
}
matdet

R error occurs when assigning a vector to an array

I have written a function in R as follows:
matching_score=function(nitems, tot.score) {
nInterval <- 4*nitems+1
tot <- array(0, dim=c(nInterval,2,nGroup.all) )
minimum <- nitems
maximum <- nitems*5
tot[,1,] <- c(minimum: maximum)
for (nGcut in 1:nGroup.all)
{
...
But R gave this error message:
Error in tot[, 1, ] <- c(minimum:maximum) :
incorrect number of subscripts
How can I solve this issue? When minimum and maximum were actual numbers, the error did not occur.
Thanks in advance for your advice.
The error probably occurs when you try to cbind the tot object. The error message is complaining about dimensions. You are using "[" as if this object is an array with three dimensions, and 'cbind' will not work with arrays. If it's really a three-dimensional object, then install the package 'abind' and use the function abind.
require(abind)
arr <- array(1:(2*3*4), c(4,3,2) )
abind(arr, arr[,,1], along=3)
The dimensions of this line:
tempo[nRw,] <- cbind(tot[nRw,1,1], sum(tot[nRw,2,]))
... seem all wrong. The LHS has two dimensions, the 'tot' object has three and the return from sum will be a scalar.
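As a rough check, and assuming nothing about the parts of the function that are not shown: the assignment itself works when tot really has three dimensions, and "incorrect number of subscripts" is what R reports when the number of indices does not match length(dim(tot)). One hypothetical way for that to happen is nGroup.all holding more than one value. The numbers and the names nGroup.vec and tot4 below are invented.

```r
nitems <- 2
nGroup.all <- 3                      # scalar, as the function seems to expect
nInterval <- 4 * nitems + 1

tot <- array(0, dim = c(nInterval, 2, nGroup.all))
tot[, 1, ] <- nitems:(nitems * 5)    # 9 values, recycled across the 3 groups
length(dim(tot))
tot[, 1, 3]

# Hypothetical failure mode: a length-2 value makes the array 4-dimensional,
# so three subscripts no longer match.
nGroup.vec <- c(2, 3)
tot4 <- array(0, dim = c(nInterval, 2, nGroup.vec))
length(dim(tot4))
res <- try(tot4[, 1, ] <- nitems:(nitems * 5), silent = TRUE)
inherits(res, "try-error")
```

Printing dim(tot) just before the failing line would show whether the array has the three dimensions the subscripts assume.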

Resources