Use apply on a multi-dimensional array

A normal matrix is a 2-dimensional array. But I can also initialise:
a<-array(0,dim=c(2,3,4,5))
Which is a 2*3*4*5 array.
The command
apply(a,c(2,3),sum)
will give a 3*4 array containing the sums over the elements in the 1st and 4th dimensions.
Why is that? As far as I know, in the apply function, 1 indicates rows and 2 indicates columns, but what does 3 mean here?
We need some abstraction here.

The easiest way to understand apply on an array is to try some examples. Here's some data modified from the last example object in the documentation:
> z <- array(1:24, dim = 2:4)
> dim(z)
[1] 2 3 4
> apply(z, 1, function(x) sum(x))
[1] 144 156
> apply(z, 2, function(x) sum(x))
[1] 84 100 116
> apply(z, 3, function(x) sum(x))
[1] 21 57 93 129
What's going on here? Well, we create a three-dimensional array z. If you use apply with MARGIN=1 you get row sums (two values because there are two rows), if you use MARGIN=2 you get column sums (three values because there are three columns), and if you use MARGIN=3 you get sums across the array's third dimension (four values because there are four levels to the third dimension of the array).
If you specify a vector for MARGIN, like c(2,3), you get the sums over the rows (the first dimension) for each combination of column and level of the third dimension. Note that the results above from apply with MARGIN=2 and MARGIN=3 are the row sums and the column sums, respectively, of the matrix seen in the result below:
> apply(z, c(2,3), function(x) sum(x))
     [,1] [,2] [,3] [,4]
[1,]    3   15   27   39
[2,]    7   19   31   43
[3,]   11   23   35   47
If you specify all of the dimensions as MARGIN=c(1,2,3) you simply get the original three-dimensional object:
> all.equal(z, apply(z, c(1,2,3), function(x) sum(x)))
[1] TRUE
Best way to learn here is just to start playing around with some real matrices. Your example data aren't helpful for looking at sums because all of the array entries are zero.
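To connect this back to the four-dimensional array in the question, here is a minimal sketch. It assumes the array is filled with 1:120 instead of zeros (my choice, only so that the sums are distinguishable), and the names a and res are just illustrative.

```r
# Same shape as the array in the question, filled with 1:120 instead of zeros
a <- array(seq_len(120), dim = c(2, 3, 4, 5))

# Keep dimensions 2 and 3; sum over dimensions 1 and 4
res <- apply(a, c(2, 3), sum)
dim(res)   # one entry per combination of dimension 2 and dimension 3

# Each cell of res is the sum of the matching slice of a
res[1, 1] == sum(a[, 1, 1, ])
res[3, 4] == sum(a[, 3, 4, ])
```

In other words, the dimensions listed in MARGIN are the ones that survive in the result; every dimension not listed is handed to the function and collapsed.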

Related

Multi-Dimensional Arrays Julia

I am new to using Julia and have little experience with the language. I am trying to understand how multi-dimensional arrays work in it and how to access the array at the different dimensions. The documentation confuses me, so maybe someone here can explain it better.
I created an array (m = Array{Int64}(6,3)) and am trying to access the different parts of that array. Clearly I am understanding it wrong so any help in general about Arrays/Multi-Dimensional Arrays would help.
Thanks
Edit: I am trying to read in a file that has the contents
58 129 10
58 129 7
25 56 10
24 125 25
24 125 15
13 41 10
0
The purpose of the project is to take these fractions (58/129) and round them using a Farey sequence. The last number in each row is the bound that both numbers need to be below. Currently, I am not looking for help on how to do the problem, just how to create a multidimensional array with all the numbers except the last row (0). My trouble is how to put the numbers into the array after I have created it.
So I want m[0][0] = 58, so on. I'm not sure how syntax works for this and the manual is confusing. Hopefully this is enough information.
Julia's arrays are not lists-of-lists or arrays of pointers. They are a single container, with elements arranged in a rectangular shape. As such, you do not access successive dimensions with repeated indexing calls like m[j][i] — instead you use one indexing call with multiple indices: m[i, j].
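If you trim off that last 0 in your file, you can just use the built-in readdlm to load that file into a matrix (in Julia 1.0 and later it lives in the DelimitedFiles standard library, so run using DelimitedFiles first). I've copied those first six rows into my clipboard to make it a bit easier to follow here: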
julia> str = clipboard()
"58 129 10\n58 129 7\n25 56 10\n24 125 25\n24 125 15\n13 41 10"
julia> readdlm(IOBuffer(str), Int) # or readdlm("path/to/trimmed/file", Int)
6×3 Array{Int64,2}:
58 129 10
58 129 7
25 56 10
24 125 25
24 125 15
13 41 10
That's not very helpful in teaching you how Julia's arrays work, though. Constructing an array like m = Array{Int64}(6,3) (written Array{Int64}(undef, 6, 3) in Julia 1.0 and later) creates an uninitialized matrix with 18 elements arranged in 6 rows and 3 columns. It's a bit easier to see how things work if we fill it with a sensible pattern:
julia> m .= [10,20,30,40,50,60] .+ [1 2 3]
6×3 Array{Int64,2}:
11 12 13
21 22 23
31 32 33
41 42 43
51 52 53
61 62 63
This has set up the values of the array to have the row number in their tens place and the column number in the ones place. Accessing m[r,c] returns the value in m at row r and column c.
julia> m[2,3] # second row, third column
23
Now, r and c don't have to be integers — they can also be vectors of integers to select multiple rows or columns:
julia> m[[2,3,4],[1,2]] # Selects rows 2, 3, and 4 across columns 1 and 2
3×2 Array{Int64,2}:
21 22
31 32
41 42
Of course ranges like 2:4 are just vectors themselves, so you can more easily and efficiently write that example as m[2:4, 1:2]. A : by itself is a shorthand for a vector of all the indices within the dimension it indexes into:
julia> m[1, :] # the first row of all columns
3-element Array{Int64,1}:
11
12
13
julia> m[:, 1] # all rows of the first column
6-element Array{Int64,1}:
11
21
31
41
51
61
Finally, note that Julia's Array is column-major and arranged contiguously in memory. This means that if you just use one index, like m[2], you're just going to walk down that first column. As a special extension, we support what's commonly referred to as "linear indexing", where we allow that single index to span into the higher dimensions. So m[7] accesses the 7th contiguous element, wrapping around into the first row of the second column:
julia> m[5],m[6],m[7],m[8]
(51, 61, 12, 22)

R: If I have two matrices, one with nearly the same columns but mixed up, how can I effectively map the two matrices?

Of the two matrices, one has i) its columns in a different order and ii) entire columns (every element in the column) with the opposite sign. An example would be
A = 1 2
3 4
b = 1.99 -1.02
3.99 -2.99
How can I re-order b such that it looks like:
b = 1.02 1.99
2.99 3.99
Is there a way to do this quickly in R?
You could treat it as an optimization problem -- minimize the absolute difference between the two matrices by reordering the columns in one of the matrices.
Example data
A <- matrix(c(1, 2, 3, 4), nrow = 2)
A
[,1] [,2]
[1,] 1 3
[2,] 2 4
b <- matrix(c(-2.99, 3.99, -1.02, 1.99), nrow = 2)
b
      [,1]  [,2]
[1,] -2.99 -1.02
[2,]  3.99  1.99
Optimization / search
# Data frame with a row for every possible choice of columns (including repeated columns)
ordering <- (expand.grid(rep(list(1:ncol(A)), ncol(A))))
ordering
  Var1 Var2
1    1    1
2    2    1
3    1    2
4    2    2
# Create a function to compute the difference for a particular arrangement
loss <- function(i) {
ord <- unlist(ordering[i, ])
sum(abs(abs(A) - abs(b[, ord])))
}
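# Find the best arrangement by evaluating the loss for every row
# (optimize() searches a continuous interval, so it is not suited to a discrete row index)
losses <- sapply(seq_len(nrow(ordering)), loss)
best <- which.min(losses)
best # row index from the data frame
[1] 2
# Extract the row to get the actual solution
solution <- unname(unlist(ordering[best, ]))
solution
[1] 2 1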
Verify
A
[,1] [,2]
[1,] 1 3
[2,] 2 4
b[, solution]
      [,1]  [,2]
[1,] -1.02 -2.99
[2,]  1.99  3.99
Assuming that your matrices are as small as they are in your examples, you could change the order of the columns the following way:
Your example indicates you want to switch the first column and the second column. We can do that by reordering the column indexes like so:
b <- b[ , c(2, 1)]
c(2, 1) indicates that from now on, column 2 will be displayed as the first column and then column 1 will be displayed as the second column. We specify this in the column portion of the index operator and leave the row portion blank.
If we want to change the sign of an entire column, we can perform operations on specific columns like so:
b[ , 1] <- -1*b[ , 1]
This makes it so that every value in what is now the first column gets multiplied by -1.
If the matrix you're dealing with is much bigger, this is probably an impractical approach.
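For a handful of columns, the two manual steps (reordering and sign flipping) can be automated. The following is only a sketch under assumptions of mine: it uses the A and b from the question, compares columns by absolute value, tries every permutation of the columns, and decides the sign of each column from the sign of its inner product with the matching column of A. The names grid, perms, loss, b_ord and b_fixed are illustrative.

```r
# Matrices from the question (filled column by column)
A <- matrix(c(1, 3, 2, 4), nrow = 2)
b <- matrix(c(1.99, 3.99, -1.02, -2.99), nrow = 2)

# All column permutations: build the full grid, then drop rows with repeated columns
grid  <- expand.grid(rep(list(seq_len(ncol(A))), ncol(A)))
perms <- grid[apply(grid, 1, function(r) anyDuplicated(r) == 0), , drop = FALSE]

# Mismatch between A and the reordered b, ignoring signs
loss <- function(ord) sum(abs(abs(A) - abs(b[, ord, drop = FALSE])))
losses <- apply(perms, 1, loss)

# Best ordering, applied to b
best  <- unname(unlist(perms[which.min(losses), ]))
b_ord <- b[, best, drop = FALSE]

# Flip every column whose inner product with the matching column of A is negative
signs <- sign(colSums(A * b_ord))
signs[signs == 0] <- 1          # leave columns alone if the sign is undecided
b_fixed <- sweep(b_ord, 2, signs, `*`)
b_fixed
```

The number of permutations grows factorially with the number of columns, so this brute-force search is only reasonable for small matrices.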

Is there a dimension function that works for vectors, matrices and arrays in R

As we all know, the function dim returns the dimensions of a multidimensional array or matrix.
n = 2
A = matrix(rnorm(n^2),n,n)
dim(A)
Which yields the answer 2,2 as expected. Now the issue is that you often don't know whether an object will be a vector, a matrix or an array. dim only works on the latter two types. Of course one could write a function as follows
dimVorM = function(x) ifelse( is.vector(x), return(c(1,length(x))), dim(x) )
But is there a better way?
You could write something like this, which would be analogous to NROW and NCOL.
DIM <- function(x) if(is.null(dim(x))) length(x) else dim(x)
I wouldn't return a length-2 vector if something only has one dimension. And don't use ifelse for control flow.
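As a quick illustration of how the helper above behaves on the three kinds of object, here is a small sketch. The test objects v, m and a are my own examples, not from the question.

```r
# Helper from the answer above: fall back to length() when there is no dim attribute
DIM <- function(x) if (is.null(dim(x))) length(x) else dim(x)

v <- 1:10                            # plain vector, no dim attribute
m <- matrix(0, nrow = 2, ncol = 3)   # matrix
a <- array(0, dim = c(2, 3, 4))      # three-dimensional array

DIM(v)   # length of the vector
DIM(m)   # same as dim(m)
DIM(a)   # same as dim(a)

# Base R's NROW and NCOL treat a vector as a single column
NROW(v)
NCOL(v)
```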
Technically, dim() works for vectors. The function dim() extracts a "dim" attribute and returns its values. A vector doesn't have that attribute dim, so the function dim() rightfully returns NULL.
> x <- 1:10
> attr(x, "dim") <- c(2,5)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
> dim(x)
[1] 2 5
> attributes(x)
$dim
[1] 2 5
> dim(x) <- NULL
> x
[1] 1 2 3 4 5 6 7 8 9 10
> dim(x)
NULL
The dim attribute is a vector with one value for each dimension, indicating the number of elements in that dimension. Both NROW and NCOL are constructed in such a way that they consider a vector to be a column vector with 1 column and n rows, and the solution of Hong Ooi is consistent with this.
Also keep in mind that a table is something entirely different. It is not a vector but a one-dimensional array:
> y <- table(iris$Species)
> y
    setosa versicolor  virginica
        50         50         50
> dim(y)
[1] 3
> class(y)
[1] "table"

how to identify positions of max value in an array?

My array is
x <- array(1:24, dim=c(3,4,3))
My task 1 is to find the max value according to the first two dimensions
x.max <- apply(x,c(1,2), function(x) ifelse(all(is.na(x)), NA, max(x, na.rm = TRUE)))
in case there is NA data
my task 2 is to find the max value position on the third dimension.
I tried
x.max.position = apply(x, c(1,2),which.max(x))
But this only gives me the position on the first two dimensions.
Can anyone help me?
It's not totally clear, but if you want to find the max for each matrix of the third dimension (is that even a technically right thing to say?), then you need to use apply across the third dimension. The description of the MARGIN argument under ?apply states that it is:
a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns.
So for this example where you have a 3D array, 3 is the third dimension. So...
t( apply( x , 3 , function(x) which( x == max(x) , arr.ind = TRUE ) ) )
     [,1] [,2]
[1,]    3    4
[2,]    3    4
[3,]    3    4
Which returns a matrix where each row contains the row and then column index of the max value of each 2D array/matrix of the third dimension.
If you want the max across all dimensions you can use which and the arr.ind argument like this:
which( x==max(x,na.rm=T) , arr.ind = T )
     dim1 dim2 dim3
[1,]    3    4    2
Which tells us the max value is the third row, fourth column, second matrix.
EDIT
To find the position along dim 3 where the values on dims 1 and 2 reach their maximum, try:
which.max( apply( x , 3 , max ) )
# [1] 2
Which tells us that position 2 of the third dimension contains the maximal value.
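If the goal is instead one position per (row, column) cell, which is how I read task 2 in the question, a possible sketch is to pass which.max itself (the function, not the result of calling it) to apply over the first two dimensions. The all-NA guard and the names pos_along_3, x.na and pos are my own additions for illustration.

```r
x <- array(1:24, dim = c(3, 4, 3))   # 1:24 is recycled to fill the 36 cells

# For each (row, column) cell, the index along dim 3 holding the largest value;
# cells that are NA everywhere yield NA instead of a zero-length result
pos_along_3 <- function(v) if (all(is.na(v))) NA_integer_ else which.max(v)
x.max.position <- apply(x, c(1, 2), pos_along_3)
x.max.position

# Same call on a copy where one cell is missing in every slice
x.na <- x
x.na[1, 1, ] <- NA
pos <- apply(x.na, c(1, 2), pos_along_3)
pos[1, 1]
```

The result has the same 3-by-4 shape as x.max from task 1, so the two can be compared cell by cell.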

Return a n-by-1 matrix from a multi-dimensional array

I was quite surprised when I found out that for x <- array(0, c(5,3,1)), e.g. x[2,,] returns a vector instead of a two-dimensional array (or matrix).
Why is it that this array is obviously interpreted as 5 vectors of length 3 instead of 5 3-by-1 arrays? attr(array(0, c(5,3,1)), "dim") yields [1] 5 3 1 as expected, so it seems that the last dimension didn't get lost.
How can I make sure that I get a two-dimensional array? I understand that arrays are nothing but vectors with additional attributes, but I don't understand this apparent "inconsistent" behaviour.
Please enlighten me :) I'm using a three-dimensional array in the context of another function in order to store several matrices. In general, these matrices have n-by-m shape where, in particular, m can be 1 (although it is usually higher).
It's a classic, and has been in the R FAQ for over a decade too: use drop=FALSE to prevent the collapsing of a 1-row / col matrix to a vector.
R> M <- matrix(1:4,2,2)
R> M
[,1] [,2]
[1,] 1 3
[2,] 2 4
R> M[,1]
[1] 1 2
R> M[,1,drop=FALSE]
[,1]
[1,] 1
[2,] 2
R>
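Applied to the three-dimensional array from the question, the same idea can be sketched as follows. I fill the array with 1:15 rather than zeros so the values are visible, and the names v, s and m are illustrative. Note that drop=FALSE keeps all three dimensions, so getting a 3-by-1 matrix takes one more step of resetting the dim attribute.

```r
x <- array(seq_len(15), c(5, 3, 1))

# Default indexing drops every dimension of extent 1, leaving a plain vector
v <- x[2, , ]
dim(v)

# drop = FALSE keeps all three dimensions (extent 1 for the first and the third)
s <- x[2, , , drop = FALSE]
dim(s)

# To get the matrix stored at position 2, reset dim to the last two dimensions
m <- s
dim(m) <- dim(x)[-1]
dim(m)
m
```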
