Using drop [R] selectively - only remove specified length-1 dimensions - arrays

The drop function and the drop argument in the [ subsetting function both work completely or not at all. If I have, for example, a four-dimensional array whose 2nd and 4th dimensions are length 1 only, then drop will return a 2-dimensional array with both these dimensions removed. How can I drop the 4th dimension but not the 2nd?
e.g.
arr <- array(1:4, c(2, 1, 2, 1))
drop(arr)
arr[,,, 1, drop = TRUE]
The reason for doing this in my case is so that I can set up the array correctly for binding to other arrays in abind. In the example given, one might wish to bind arr[,,, 1] to a 3-dimensional array along dimension 2.

I've thought about it some more and come up with a solution. The function below builds a logical vector marking the length-1 dimensions that are not listed in omit, and passes it to adrop from the abind package so that only those dimensions are dropped. omit is a vector of dimension numbers never to drop.
library("abind")
drop.sel <- function(x, omit){
  ds <- dim(x)
  # TRUE for each length-1 dimension that is not protected by omit
  dv <- ds == 1 & !(seq_along(ds) %in% omit)
  adrop(x, dv)
}
If anyone thinks they have a more elegant solution, I'd be happy to see it.
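The selection logic in drop.sel does not depend on R. As a language-neutral sketch in Python (the name selective_drop_shape is hypothetical, and dimension numbers are 1-based to match the R code), the target shape can be computed as follows. Dropping a length-1 dimension never reorders the data, so only the shape changes:

```python
def selective_drop_shape(shape, omit=()):
    """Return shape with length-1 dimensions removed, except those in omit.

    Dimension numbers in omit are 1-based, mirroring the R function above.
    """
    return tuple(
        n for i, n in enumerate(shape, start=1)
        if n != 1 or i in omit
    )


# The question's example: keep dimension 2, drop dimension 4.
print(selective_drop_shape((2, 1, 2, 1), omit=(2,)))  # (2, 1, 2)
```

With an empty omit the function behaves like a plain drop and returns (2, 2) for the same input.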

Related

Avoid collapsing dimensions when omitting NAs from array

I have an array where I have to omit NA values. I know that it is an array full of matrices where every row has exactly one NA value. My approach works well for >2 columns of the matrices, but apply() drops one dimension when there are only two columns (as after omitting the NA values, one column disappears).
As this step is part of a much larger code, I would like to avoid recoding the rest and make this step robust to the case when the number of columns is two. Here is a simple example:
#create an array
arr1 <- array(rnorm(3000),c(500,2,3))
#randomly distribute 1 NA value per row of the array
for(i in 1:500){
arr1[i,,sample(3,1)] <- NA
}
#omit the NAs from the array
arr1.apply <- apply(arr1, c(1,2),na.omit)
#we lose no dimension as every dimension >1
dim(arr1.apply)
[1] 2 500 2
#now repeat with a 500x2x2 array
#create an array
arr2 <- array(rnorm(2000),c(500,2,2))
#randomly distribute 1 NA value per row of the array
for(i in 1:500){
arr2[i,,sample(2,1)] <- NA
}
#omit the NAs from the array
arr2.apply <- apply(arr2, c(1,2),na.omit)
#we lose one dimension because the last dimension collapses to size 1
dim(arr2.apply)
[1] 500 2
I do not want apply() to drop the last dimension as it breaks the rest of my code.
I am aware that this is a known issue with apply(). However, I am eager to resolve the problem in this very step, so any help would be appreciated. So far I've tried to wrap apply() in an array() call using the dimensions that should result, but I think this mixes up the values in the matrix in an undesirable way.
Thanks for your help.
I propose a stupid solution, but I think you have no choice if you want to keep it this way:
arr1.apply <- if (dim(arr1)[3] > 2) {
  apply(arr1, c(1, 2), na.omit)
} else {
  array(apply(arr1, c(1, 2), na.omit), dim = c(1, dim(arr1)[1:2]))
}

Drop Julia array dimensions of length 1

Say if I have a 5D Array with size 1024x1024x1x1x100. How can I make a new array that is 1024x1024x100?
The following works if you know which dimensions you want to keep ahead of time:
arr = arr[:, :, 1, 1, :]
But I don't know which dimensions are what size ahead of time and I would like to only keep dimensions given a boolean mask; something like this...
arr2 = arr[(size(arr) .> 1)]
The squeeze function was defined specifically for the purpose of removing dimensions of length 1. From the manual:
Base.squeeze — Function.
squeeze(A, dims)
Remove the dimensions specified by dims from array A. Elements of dims must be unique and within the range 1:ndims(A). size(A,i) must equal 1 for all i in dims.
To "squeeze" all the dimensions of size 1 (when they are unknown in advance), we need to find them and make them into a tuple. This is accomplished by (find(size(arr).==1)...). So the result is:
squeeze(arr, (find(size(arr).==1)...))
In Julia 0.7 and later, squeeze was renamed dropdims and find was renamed findall, so the equivalent call is:
dropdims(arr, dims = tuple(findall(size(arr) .== 1)...))

Applying Movmedian Within Cell Array

I have a cell array (2 x 6) called "output", each cell in row #1 {1 -> 6, 2} contains a 1024 x 1024 x 100 matrix. I want to apply movmedian to each cell in row #1. I would like to apply this function in dimension = 3 with window size = 5.
output = cellfun(@movmedian(5,3), output,'uniform', 0);
This is the code that I have come up with so far; however, it produces an "Unbalanced or unexpected parenthesis or bracket" error. I am unsure what is causing this error. I am also somewhat unsure how to instruct MATLAB to perform this operation only on row 1 of the cell array. Please help!
Thank you for your time!!
The function handle passed as the first argument to cellfun will be sequentially passed the contents of each cell (i.e. each 3-D matrix). Since you need to also pass the additional parameters needed by movmedian, you should create an anonymous function like so:
@(m) movmedian(m, 5, 3)
Where the input argument m is the 3-D matrix. If you want to apply this to the first row of output, you just have to index the cell array like so:
output(1, :)
This will return a cell array containing the first row of output, with : indicating "all columns". You can use the same index in the assignment if you'd like to store the modified matrices back in the same cells of output.
Putting it all together, here's the solution:
output(1, :) = cellfun(@(m) movmedian(m, 5, 3), output(1, :),...
                       'UniformOutput', false);
...and a little trick to avoid having to specify 'UniformOutput', false is to encapsulate the results of the anonymous function in a cell array:
output(1, :) = cellfun(@(m) {movmedian(m, 5, 3)}, output(1, :));
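The same pattern, a function with its extra arguments fixed and mapped over only the first row, can be sketched outside MATLAB. The Python below is only an illustration under stated assumptions: short 1-D lists stand in for the 1024 x 1024 x 100 matrices, moving_median is a hypothetical helper rather than a library function, and its windows are centred and truncated at the ends of the list.

```python
from statistics import median


def moving_median(xs, window):
    """Centred moving median with an odd window, truncated at the ends."""
    half = window // 2
    return [median(xs[max(0, i - half): i + half + 1]) for i in range(len(xs))]


# Row 0 holds the data cells, row 1 holds something that must stay untouched.
output = [[[5, 1, 4, 2, 3], [9, 7, 8]], ["label a", "label b"]]

# Apply the function, with its window fixed at 5, to the first row only.
output[0] = [moving_median(cell, 5) for cell in output[0]]

print(output[0])
print(output[1])  # ['label a', 'label b']
```

Assigning back into output[0] plays the role of output(1, :) on the left-hand side of the MATLAB statement: only the first row is replaced.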

Fill a vector in Julia with a repeated list

I would like to create a column vector X by repeating a smaller column vector G of length h a number n of times. The final vector X will be of length h*n. For example
G = [1;2;3;4] #column vector of length h
X = [1;2;3;4;1;2;3;4;1;2;3;4] #ie X = [G;G;G] column vector of length h*n
I can do this in a loop but is there an equivalent to the 'fill' function that can be used without the dimensions going wrong. When I try to use fill for this case, instead of getting one column vector of length h*n I get a column vector of length n where each row is another vector of length h. For example I get the following:
X = [[1,2,3,4];[1,2,3,4];[1,2,3,4];[1,2,3,4]]
This doesn't make sense to me as I know that the ; symbol is used to show elements in a row and the space is used to show elements in a column. Why is there the , symbol used here and what does it even mean? I can access the first row of the final output X by X[1] and then any element of this by X[1][1] for example.
Either I would like to use some 'fill' equivalent or some sort of 'flatten' function if it exists, to flatten all the elements of the X into one column vector with each entry being a single number.
I have also tried the reshape function on the output but I can't get this to work either.
Thanks Dan Getz for the answer:
repeat([1, 2, 3, 4], outer = 4)
Type ?repeat at the REPL to learn about this useful function.
In older versions of Julia, repmat was an alternative, but it has since been deprecated in favor of repeat.
As @DanGetz has pointed out in a comment, repeat is the function you want. From the docs:
repeat(A, inner = Int[], outer = Int[])
Construct an array by repeating the entries of A. The i-th element of inner specifies the number of times that the individual entries of the i-th dimension of A should be repeated. The i-th element of outer specifies the number of times that a slice along the i-th dimension of A should be repeated.
So an example that does what you want is:
X = repeat(G; outer=[k])
where G is the array to be repeated, and k is the number of times to repeat it.
I will also attempt to answer your confusion about the result of fill. Julia (like most languages) makes a distinction between vectors containing numbers and numbers themselves. We know that fill(5, 5) produces [5, 5, 5, 5, 5], which is a one-dimensional array (a vector) where each element is 5.
Note that fill([5], 5), however, produces a one-dimensional array (a vector) where each element is [5], itself a vector. This prints as
5-element Array{Array{Int64,1},1}:
[5]
[5]
[5]
[5]
[5]
and we see from the type that this is indeed a vector of vectors. That of course is not the same thing as the concatenation of vectors. Note that [[5]; [5]; [5]; [5]; [5]] is syntax for concatenation, and will return [5, 5, 5, 5, 5] as you might expect. But although ; syntax (vcat) does concatenation, fill does not do concatenation.
Mathematically (under certain definitions), we may imagine R^(kn) to be distinct from (though isomorphic to) (R^k)^n, for instance, where R^k is the set of k-tuples of real numbers. fill constructs an object of the latter, whereas repeat constructs an object of the former.
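The distinction between a vector of vectors and a concatenated vector is not specific to Julia. As a rough analogy only, using Python lists rather than Julia arrays (the variable names nested and flat are made up for this sketch):

```python
G = [1, 2, 3, 4]
n = 3

# Analogue of fill(G, n): a list whose n elements are each the list G.
nested = [G for _ in range(n)]

# Analogue of repeat(G, outer = n): one flat list of length len(G) * n.
flat = G * n

print(nested)
print(flat)
print(len(nested), len(flat))  # 3 12
```

Indexing shows the difference: nested[0] is itself a list, so a number is reached with nested[0][0], while flat[0] is already a number.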
As long as you are working with 1-dimensional arrays (Vectors)...
X=repmat(G,4) should do it.
--
On another note, Julia makes no distinction between row and column vectors; both are one-dimensional arrays.
[1,2,3]==[1;2;3] returns true, as both are 3-element Array{Int64,1} values, i.e. vectors (Array{Int,1} == Vector{Int} returns true).
This is one of the differences between Matlab and Julia...
If, for some specific reason you want to do it, you can create 2-dimensional Arrays (or Matrices) with one of the dimensions equal to 1.
For example:
C = [1 2 3 4] will create a 1x4 Array{Int64,2}; the 2 there indicates the number of dimensions of the Array.
D = [1 2 3 4]' will create a 4x1 Array{Int64,2}.
In this case, C == D returns false of course. But neither is a Vector for Julia, they are both Matrices (Array{Int,2} == Matrix{Int} returns true).

Dynamic slicing of Matlab array

I have an n-dimensional array A and want to slice it dynamically, i.e., given a list of array dimensions, like [2 4], and a list of values, like [6 8], I want
B = A(:,6,:,8,:,:,:,:,...)
List lengths are unknown. Using eval would work but is not an option. This question is a generalization of a previous post to multiple indices and dimensions without a for-loop.
You can still use the previous post I linked to in order to answer your question, although that post only slices in one dimension. I originally flagged this question as a duplicate and closed it, because all you need to do is replace one line of code in that post's accepted answer to achieve what you want. However, because that isn't obvious, I have decided to reopen the question and answer it for you.
Referring to the previous post, this is what Andrew Janke (the person with the accepted answer on the linked post) did (very clever I might add):
function out = slice(A, ix, dim)
subses = repmat({':'}, [1 ndims(A)]);
subses{dim} = ix;
out = A(subses{:});
Given a matrix A, an index number ix and the dimension you want to access dim, the above function would equivalently perform:
out = A(:, :, ..., ix, :, :, ..., :);
        ^  ^       ^   ^
        1  2       dim dim+1     <-- dimensions
You would access your desired dimension in dim, and place what value you want to use to slice into that dimension. As such, you'd call it like this:
out = slice(A, ix, dim);
How the function works is that subses would generate a cell array of ':' strings (that will eventually be converted into ':' operators) that is as long as the total number of dimensions of A. Next, you would access the element at dim, which corresponds to the dimension you want and you would replace this with ix. You would then unroll this cell array so that we would access A in the manner that you see in the above equivalent statement.
Who would have thought that you can use strings to index into an array!?
Now, to generalize this, all you have to do is make one small but very crucial change. ix would now be a vector of indices, and dim would be a vector of dimensions you want to access. As such, it would look something like this:
function out = slice(A, ix, dim)
subses = repmat({':'}, [1 ndims(A)]);
subses(dim) = num2cell(ix);
out = A(subses{:});
The only difference is the line that assigns into subses. We use num2cell to convert each element of ix into its own cell, and we index into subses with dim to replace the ':' entries at those positions with your desired indices. Note that we are using () parentheses and not {} braces: () indexes a cell array and returns cells, while {} accesses the contents of a cell. Because we are assigning multiple cells to subses at once, () is needed. We then perform our slicing in A accordingly.
As such, given your problem and with the above modifications, you would do:
out = slice(A, [6 8], [2 4]);
Be advised that ix and dim must contain the same number of elements and they must be 1D. Also, ix and dim should be sensible inputs (i.e. positive integers, not fractional or negative values). I don't do this error checking because I'm assuming you know what you're doing and you're smart enough to know how to use this properly.
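The idea of building a list of "take everything" placeholders and overwriting a few positions carries over to other languages. Here is a minimal sketch in Python on nested lists, under stated assumptions: build_index and take are hypothetical helpers written for this illustration, and dims and values are 0-based, unlike the 1-based MATLAB code.

```python
def build_index(ndim, dims, values):
    """Build one index per dimension: slice(None) (like ':') except at dims."""
    if len(dims) != len(values):
        raise ValueError("dims and values must have the same length")
    index = [slice(None)] * ndim
    for d, v in zip(dims, values):
        index[d] = v
    return tuple(index)


def take(nested, index):
    """Apply the index tuple to nested lists, one dimension per level."""
    if not index:
        return nested
    head, rest = index[0], index[1:]
    if isinstance(head, slice):
        return [take(sub, rest) for sub in nested[head]]
    return take(nested[head], rest)


# A 2 x 3 x 2 nested list whose entries encode their own position.
A = [[[100 * i + 10 * j + k for k in range(2)] for j in range(3)]
     for i in range(2)]

index = build_index(3, dims=[1], values=[2])  # like A(:, 3, :) in MATLAB
print(take(A, index))  # [[20, 21], [120, 121]]
```

One difference from MATLAB: in this sketch a dimension indexed by a single number disappears from the result, whereas A(:, 3, :) keeps it as a dimension of length 1.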
Good luck!
