Vectorized sum over slices of an array - arrays

Suppose I have an array of three dimensions:
set.seed(1)
foo <- array(rnorm(250),dim=c(5,10,5))
And I want to create a matrix of each row and layer summed over columns 4, 5 and 6. I can do this like this:
apply(foo[,4:6,],c(1,3),sum)
But this splits the array per row and layer and is pretty slow since it is not vectorized. I could also just add the slices:
foo[,4,]+foo[,5,]+foo[,6,]
Which is faster but gets a bit tedious to do manually for multiple slices. Is there a function that does the above expression without manually specifying each slice?

I think you are looking for rowSums / colSums (fast implementations of apply)
colSums(aperm(foo[,4:6,], c(2,1,3)))
> all.equal(colSums(aperm(foo[,4:6,], c(2,1,3))), foo[,4,]+foo[,5,]+foo[,6,])
[1] TRUE

How about this:
eval(parse(text=paste(sprintf('foo[,%i,]',4:6),collapse='+')))
I am aware that there are reasons to avoid parse, but I am not sure how to avoid it in this case.

Related

Choose conditionally column-wise or row-wise iteration

I would like to iterate over an array but I want to pick whether the iteration is column wise or row wise. In other words, I want to define each time at runtime, whether rows or cols goes to the outer loop, against a condition. The dummy implementation of course would be:
if cond:
    for rows:
        for cols:
            ar[rows][cols]
else:
    for cols:
        for rows:
            ar[rows][cols]
Now, is there a compressed way to express the above implementation?
Unfortunately, going over all cases (my array is 4-dimensional, so I have 16 cases) is not the best way to go.
So, is there any algorithmic approach that compresses these loops into one loop?
Statically-allocated arrays of the same type can be treated as being 1-dimensional if you're careful - in this case, you can calculate the Nth-offset within it (perhaps with a new function) and iterate over those indices rather than by the I,J,K,L-th member.
Alternatively, if you have a language which doesn't allocate static arrays or you can't use one for some reason (fully dynamic type system?), you may instead be able to put the starting pointers into a new array and iterate over that array (which is cheap as it'll only have 4 members in your case), using each as the starting point!
You may find you need to create a logical view into the first array from the second with a different ordering.
In both cases, you likely want to wrap with a modulo % operator, which will allow you to keep adding your sub-indices together while calculating offsets that wrap around the length of your array.
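A minimal sketch of the flat-offset idea above, written in Python because the question's pseudocode is Python-like. The helper names iter_in_order and flat_offset are hypothetical, and row-major storage of the flat buffer is an assumption. The loop nesting is chosen at runtime by passing a different axis order, so a single loop covers every case.

```python
from itertools import product

def iter_in_order(shape, order):
    """Yield full index tuples; order[0] is the outermost loop axis."""
    ranges = [range(shape[axis]) for axis in order]
    for combo in product(*ranges):
        index = [0] * len(shape)
        # Put each loop counter back into the slot of the axis it belongs to.
        for axis, value in zip(order, combo):
            index[axis] = value
        yield tuple(index)

def flat_offset(index, shape):
    """Row-major offset of an index tuple in a flat buffer (assumed layout)."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset

shape = (2, 3)
flat = list(range(6))  # a 2x3 array stored row by row

# Rows in the outer loop, then columns in the outer loop.
row_wise = [flat[flat_offset(idx, shape)] for idx in iter_in_order(shape, (0, 1))]
col_wise = [flat[flat_offset(idx, shape)] for idx in iter_in_order(shape, (1, 0))]
print(row_wise)  # [0, 1, 2, 3, 4, 5]
print(col_wise)  # [0, 3, 1, 4, 2, 5]
```

For a 4-dimensional array the same call works with a 4-element shape and any permutation of (0, 1, 2, 3) as the order.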

Matlab: multiply subset of three dimensional array with two dimensional array

I have an AxBxC array where AxB indexes the individual grid cells of a field that I sampled (like coordinates) and C corresponds to the layers underneath. Now I want to calculate the impact of certain activities on these individual points by multiplying it with a 2D matrix.
E.g.
x=5; %x-Dimensions of the sampled area
y=5; %y-Dimensions of the sampled area
z=3; %z-number of layers sampled
Area= zeros(x,y,z);
AreaN= zeros(x,y,z);
Now I want to multiply every layer of a given point in X*Y with:
AppA=[0.4,0.4,0.2;0.4,0.5,0.1;0.1,0.2,0.7];
I tried:
for i=1:x
    for j=1:y
        AreaN(i,j,:)= AppA*Area(i,j,:);
    end
end
Unfortunately I get the error:
Error using *
Inputs must be 2-D, or at least one input must be scalar.
To compute elementwise TIMES, use TIMES (.*) instead.
Any help with this is appreciated since I am not yet really familiar with MATLAB.
Correct Approach
I think, to correct your code, you need to convert that Area(i,j,:) to a column vector, which you can do with squeeze. Thus, the correct loop-based code would look something like this -
AreaN= zeros(x,y,z);
for i=1:x
    for j=1:y
        AreaN(i,j,:)= AppA*squeeze(Area(i,j,:));
    end
end
Now, there are efficient no-loop/vectorized approaches that can be suggested here to get to the output.
Vectorized Approach #1
The first approach uses matrix multiplication and should be quite efficient -
AreaN = reshape(reshape(Area,x*y,z)*AppA.',x,y,z)
Vectorized Approach #2
Second one with bsxfun -
AreaN = squeeze(sum(bsxfun(@times,Area,permute(AppA,[3 4 2 1])),3))
Vectorized Approach #2 Rev 1
If you would like to get rid of the squeeze in the bsxfun code, you need to use an extra permute in there -
AreaN = sum(bsxfun(@times,permute(Area,[1 2 4 3]),permute(AppA,[4 3 1 2])),4)
This would solve the matrix multiplication problem:
AreaN(i,j,:)= AppA*reshape(Area(i,j,:),3,[]);
You might want to consider using bsxfun to avoid loops.
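As a cross-check on what the loop and the vectorized expressions in this thread compute, here is a rough sketch in plain Python rather than MATLAB. The function name apply_layers is hypothetical and nested lists stand in for the arrays. For each grid point (i, j) the output layer m is the sum over k of AppA[m][k] * Area[i][j][k], i.e. AppA times the column of layers at that point.

```python
def apply_layers(area, app):
    """Multiply the layer vector at every grid point by the matrix app."""
    return [
        [
            [sum(app[m][k] * layers[k] for k in range(len(layers)))
             for m in range(len(app))]
            for layers in row
        ]
        for row in area
    ]

app_a = [[0.4, 0.4, 0.2],
         [0.4, 0.5, 0.1],
         [0.1, 0.2, 0.7]]

# A 1x2x3 field: each grid point carries a unit vector over the layers,
# so the result picks out the matching column of app_a.
area = [[[1, 0, 0], [0, 1, 0]]]
area_n = apply_layers(area, app_a)
print(area_n[0][0])  # [0.4, 0.4, 0.1]
print(area_n[0][1])  # [0.4, 0.5, 0.2]
```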

Mean of a 4D array across selected dimensions

I am using the mean function in MATLAB on a 4D matrix.
The matrix is a 32x2x20x7 array and I wish to find the mean of each row, of all columns and elements of 3rd dimension, for each 4th dimension.
So basically mean(data(b,:,:,c)) [pseudo-code] for each b, c.
However, when I do this I get a separate mean for each index of the 3rd dimension. How can I get a single mean for each (b, c) pair instead, so that the result has (32x7=)224 means?
You could do it without loops:
data = rand(32,2,20,7); %// example data
squeeze(mean(mean(data,3),2))
The key is to use a second argument to mean, which specifies across which dimension the mean is taken (in your case: dimensions 2 and 3). squeeze just removes singleton dimensions.
This should work:
a=rand(32,2,20,7);
for i=1:32
    for j=1:7
        c=a(i,:,:,j);
        mean(c(:))
    end
end
Note that with two calls to mean, there will be small numerical differences in the result depending on the order of operations. As such, I suggest doing this with one call to mean to avoid such concerns:
squeeze(mean(reshape(data,size(data,1),[],size(data,4)),2))
Or if you dislike squeeze (some people do!):
mean(permute(reshape(data,size(data,1),[],size(data,4)),[1 3 2]),3)
Both commands use reshape to combine the second and third dimensions of data, so that a single call to mean on the new larger second dimension will perform all of the required computations.
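The idea of pooling the second and third dimensions before taking a single mean can be sketched outside MATLAB as well. The following uses plain Python with nested lists; the function name mean_over_middle_axes and the toy data are hypothetical, chosen so the expected means are easy to verify by hand.

```python
from statistics import fmean

def mean_over_middle_axes(data):
    """For data[b][i][j][c], return means[b][c] averaged over all i and j."""
    n_c = len(data[0][0][0])
    return [
        # One mean per (b, c) over every (i, j) pair at once.
        [fmean(cell[c] for plane in block for cell in plane) for c in range(n_c)]
        for block in data
    ]

# Toy 2x2x3x2 data: value = 100*b + 10*i + j + 1000*c.
data = [[[[100 * b + 10 * i + j + 1000 * c for c in range(2)]
          for j in range(3)]
         for i in range(2)]
        for b in range(2)]

means = mean_over_middle_axes(data)
print(means)  # [[6.0, 1006.0], [106.0, 1106.0]]
```

The mean of 10*i + j over i in {0, 1} and j in {0, 1, 2} is 6, which is why each entry is 100*b + 1000*c + 6.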

Irregular array subsetting in R

Let's say I have the array
TestArray=array(1:(3*3*4),c(3,3,4))
In the following I will refer to TestArray[i,,], TestArray[,j,] and TestArray[,,k] as the x=i, y=j and z=k subsets, respectively. In this specific example, the indices i and j can go from 1 to 3 and k from 1 to 4.
Now, I want to subset this 3 dimensional array so that I get the x=y subset. The output should be
do.call("cbind",
list(TestArray[1,1,,drop=FALSE],
TestArray[2,2,,drop=FALSE],
TestArray[3,3,,drop=FALSE]
)
)
I (naively) thought that such an operation should be possible by
library(Matrix)
TestArray[as.array(Diagonal(3,TRUE)),]
This works in 2 dimensions
matrix(1:9,3,3)[as.matrix(Diagonal(3,TRUE))]
However, in 3 dimensions it gives an error.
I know that I could produce an index array
IndexArray=outer(diag(1,3,3),c(1,1,1,1),"*")
mode(IndexArray)="logical"
and access the elements by
matrix(TestArray[IndexArray],nrow=4,ncol=3,byrow=TRUE)
But the first method would be much nicer and would need less memory as well. Do you know how I could fix TestArray[as.array(Diagonal(3,TRUE)),] so that it works as desired? Maybe I am just missing some syntactic sugar...
I don't know if abind::asub will do what I (you) want. This uses a more efficient form of matrix indexing than you have above, but I still have to coerce the results into the right shape ...
indmat <- cbind(1:3,as.matrix(expand.grid(1:3,1:4)))
matrix(TestArray[indmat],nrow=4,ncol=3,byrow=TRUE)
Slightly more generally:
d <- dim(TestArray)[1]
d2 <- dim(TestArray)[3]
indmat <- cbind(1:d,as.matrix(expand.grid(1:d,1:d2)))
matrix(TestArray[indmat],nrow=d2,ncol=d,byrow=TRUE)
In addition to Ben's answer, here is a surprisingly simple modification to my original line of code that does the job.
matrix(TestArray[as.matrix(Diagonal(3,TRUE))],ncol=3,nrow=4,byrow=TRUE)
This works because as.matrix(Diagonal(3,TRUE)) gets recycled.

How can I use any() on a multidimensional array?

I'm testing an arbitrarily-large, arbitrarily-dimensioned array of logicals, and I'd like to find out if any one or more of them are true. any() only works on a single dimension at a time, as does sum(). I know that I could test the number of dimensions and repeat any() until I get a single answer, but I'd like a quicker, and frankly, more-elegant, approach.
Ideas?
I'm running 2009a (R17, in the old parlance, I think).
If your data is in a matrix A, try this:
anyAreTrue = any(A(:));
EDIT: To explain a bit more for anyone not familiar with the syntax, A(:) uses the colon operator to take the entire contents of the array A, no matter what the dimensions, and reshape them into a single column vector (of size numel(A)-by-1). Only one call to ANY is needed to operate on the resulting column vector.
As pointed out, the correct solution is to reshape the array into a vector. Then any will give the desired result. Thus,
any(A(:))
gives the global result, true if any of numel(A) elements were true. You could also have used
any(reshape(A,[],1))
which uses the reshape operator explicitly. If you don't wish to do the extra step of converting your matrices into vectors to apply any, then another approach is to write a function of your own. For example, here is a function that would do it for you:
======================
function result = myany(A)
% determines if any element at all in A was non-zero
result = any(A(:));
======================
Save this as an m-file on your search path. The beauty of MATLAB (true for any programming language) is it is fully extensible. If there is some capability that you wish it had, just write a little idiom that does it. If you do this often enough, you will have customized the environment to fit your needs.
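The flatten-then-test idea carries over to other languages. A rough sketch in plain Python, with nested lists standing in for a multidimensional logical array (the names flatten and any_true are hypothetical):

```python
def flatten(nested):
    """Yield the scalar leaves of an arbitrarily nested list, depth first."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def any_true(nested):
    """True if at least one leaf is truthy, regardless of nesting depth."""
    return any(flatten(nested))

print(any_true([[[False, False], [False, True]]]))  # True
print(any_true([[False], [False, [False]]]))        # False
```

The traversal order here differs from the column-major order of A(:) in MATLAB, but that does not matter for any, which only asks whether some element is true.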
