Julia strange answer from indexing an array - arrays

I'm attempting to index an array based on conditions in the array itself and in the corresponding locations of other arrays. I've produced an MWE, but this will hopefully be used in a much larger example, within a loop to automate across scenarios, dare I say... using parallelisation?
MWE:
# create 3d array
a, b, c = [8;8;6;5;5;6], [8;8;7;6;6;6], [8;2;7;7;6;6]
d = transpose(cat(a,b,c, dims = 2))
e, f, g = [3;2;5;1;4;1], [4;3;1;1;1;2], [5;1;2;1;2;3]
h = transpose(cat(e,f,g, dims = 2))
wrkarr = cat(d,h,dims = 3)
# create temp array for indexes
temp = wrkarr[size(wrkarr,1), :, 1]
# calculate indexes
temp[(wrkarr[size(wrkarr,1),:,1] .== 8) .& (wrkarr[size(wrkarr,1),:,2]) .>= 4] .= 22
In my case it doesn't change anything in the temp array, when I would expect the first element to be changed from 8 to 22. Both of the individual conditional tests produce a vector [1,0,0,0,0,0] so why won't the .& test produce the same? Thx. J

First, and as an aside, please note that you can use arr[end,i] to refer to the element of arr at the last index along the 1st axis and index i along the 2nd axis.
Using this notation, your condition can be rewritten as:
julia> (wrkarr[end,:,1] .== 8) .& (wrkarr[end,:,2]) .>=4
6-element BitArray{1}:
0
0
0
0
0
0
It might be a bit easier to see here that there is a parenthesis issue. I think you actually wanted to write:
julia> (wrkarr[end,:,1] .== 8) .& (wrkarr[end,:,2] .>=4)
6-element BitArray{1}:
1
0
0
0
0
0
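To make the grouping explicit: in Julia, .& has the precedence of multiplication, which is higher than that of comparisons such as .>=. The sketch below uses the last rows of the two slices from the question, copied by hand (the names a and b are only for illustration), and shows how the original expression is effectively grouped.

```julia
# Last rows of wrkarr[:, :, 1] and wrkarr[:, :, 2] from the question's MWE
a = [8, 2, 7, 7, 6, 6]
b = [5, 1, 2, 1, 2, 3]

# `.&` binds tighter than `.>=`, so `(a .== 8) .& (b) .>= 4` groups like this:
# the bitwise AND of Bool and Int yields only 0s and 1s, none of which is >= 4.
buggy = ((a .== 8) .& b) .>= 4

# Parenthesizing the comparison gives the intended mask.
fixed = (a .== 8) .& (b .>= 4)

println(buggy)
println(fixed)
```

Here buggy contains no true entries, while fixed is true only in the first position.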
Making this change does what (I think) you want. (Also note that I added @view calls below in order to avoid allocations and speed things up a little.)
julia> idx = (@view(wrkarr[end,:,1]) .== 8) .& (@view(wrkarr[end,:,2]) .>=4);
julia> temp[idx] .= 22
1-element view(::Array{Int64,1}, [1]) with eltype Int64:
22
julia> temp
6-element Array{Int64,1}:
22
2
7
7
6
6
EDIT: as mentioned in comments, other solutions could be considered:
using findall to generate a vector of indices matching the condition
using a simple for loop
Here is a benchmark of all three solutions, in order to see how they compare in terms of readability and performance.
TLDR: the for loop seems to be much more efficient in this case, and allocates less.
# create 3d array
d = [8 8 6 5 5 6;
8 8 7 6 6 6;
8 2 7 7 6 6]
h = [3 2 5 1 4 1;
4 3 1 1 1 2;
5 1 2 1 2 3]
wrkarr = cat(d,h,dims = 3)
using BenchmarkTools
Option 1 logical indexing
julia> function version1!(temp, wrkarr)
           idx = (@view(wrkarr[end,:,1]) .== 8) .& (@view(wrkarr[end,:,2]) .>= 4)
           temp[idx] .= 22
       end
version1! (generic function with 1 method)
julia> temp = wrkarr[end, :, 1]; @btime version1!($temp, $wrkarr); temp
182.646 ns (3 allocations: 224 bytes)
6-element Array{Int64,1}:
22
2
7
7
6
6
Option 2 vector of indices
julia> function version2!(temp, wrkarr)
           idx = findall(i -> (wrkarr[end,i,1] == 8) & (wrkarr[end,i,2] >= 4), axes(wrkarr,2))
           temp[idx] .= 22
       end
version2! (generic function with 1 method)
julia> temp = wrkarr[end, :, 1]; @btime version2!($temp, $wrkarr); temp
134.395 ns (3 allocations: 208 bytes)
6-element Array{Int64,1}:
22
2
7
7
6
6
Option 3 for loop
julia> function version3!(temp, wrkarr)
           @inbounds for i in axes(wrkarr, 2)
               if wrkarr[end, i, 1] == 8 && wrkarr[end, i, 2] >= 4
                   temp[i] = 22
               end
           end
       end
version3! (generic function with 1 method)
julia> temp = wrkarr[end, :, 1]; @btime version3!($temp, $wrkarr); temp
21.820 ns (0 allocations: 0 bytes)
6-element Array{Int64,1}:
22
2
7
7
6
6

Based on all the clever stuff given to me above, I pushed it a bit further. This solution appears to work and dispenses with the need for an index array: it simply does the substitution right back into the original array. Is there anything wrong with that? Looking at @François Févotte's answer, the loop function is the fastest, and this could be very important when scaled up to proper size. Now for my next challenge: I want to use the same function to loop through sets of numbers to replace the 8, 4, 22, for example for the same wrkarr:
== 8 &>= 4 -> 22
== 7 &>= 2 -> 8
== 6 &>= 3 -> 7
Any takers or suggestions? When I get that to work I want to see if it can be parallelized across these individual sets of numbers, i.e. each substitution can be done independently, not dependent on any other. Thx a bunch!
function version4(wrkarr)
    @inbounds for i in axes(wrkarr, 2)
        if wrkarr[end, i, 1] == 8 && wrkarr[end, i, 2] >= 4
            wrkarr[end, i, 1] = 22
        end
    end
end
wrkarr[end, :, 1]; @btime version4($wrkarr); wrkarr
2.237 ns (0 allocations: 0 bytes)
3×6×2 Array{Int64,3}:
[:, :, 1] =
8 8 6 5 5 6
8 8 7 6 6 6
22 2 7 7 6 6
[:, :, 2] =
3 2 5 1 4 1
4 3 1 1 1 2
5 1 2 1 2 3
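One possible way to handle several substitution sets in a single pass, sketched under assumptions: the helper name replace_rules! and the (match, threshold, replacement) tuple layout are hypothetical and not taken from the posts above. Each element is tested against its original value and at most one rule is applied, so the substitutions stay independent of each other regardless of rule order (a 7 that becomes 8 is never picked up by the 8 rule).

```julia
# Hypothetical helper: `rules` is a list of (match, threshold, replacement) tuples.
function replace_rules!(wrkarr, rules)
    for i in axes(wrkarr, 2)
        # Read the original values once, before any substitution for this column.
        x, y = wrkarr[end, i, 1], wrkarr[end, i, 2]
        for (m, t, r) in rules
            if x == m && y >= t
                wrkarr[end, i, 1] = r
                break  # at most one rule per element, so rules do not chain
            end
        end
    end
    return wrkarr
end

# Same data as in the benchmark above
d = [8 8 6 5 5 6; 8 8 7 6 6 6; 8 2 7 7 6 6]
h = [3 2 5 1 4 1; 4 3 1 1 1 2; 5 1 2 1 2 3]
wrkarr = cat(d, h, dims = 3)

replace_rules!(wrkarr, [(8, 4, 22), (7, 2, 8), (6, 3, 7)])
println(wrkarr[end, :, 1])  # [22, 2, 8, 7, 6, 7]
```

Because each column is handled independently, the outer loop looks like a candidate for Threads.@threads, although for arrays this small the threading overhead would probably outweigh any gain.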

Related

The inverse operation of tensor circular unfolding in Julia

I implemented Tensor Circular Unfolding (TCU) defined in this document (See Definition 2).
The TCU reshapes a tensor (or multidimensional array) X into a matrix (or two-dimensional array). By using TensorToolbox, I implemented that as follows:
using TensorToolbox
function TCU(X,d,k)
    N = ndims(X)
    @assert d < N
    if d <= k
        a = k-d+1
    else
        a = k-d+1+N
    end
    tenmat(permutedims(X,circshift(1:N,-a+1)),row=1:d)
end
for positive integers d<N and k≦N, where N is the number of dimensions (order) of the input tensor X. The function tenmat comes from TensorToolbox.jl and performs the matricization of a tensor; see the README of TensorToolbox.jl for details.
Here I put an example with N=4.
X = rand(1:9,3,4,2,2)
#3×4×2×2 Array{Int64, 4}:
#[:, :, 1, 1] =
# 5 7 2 6
# 4 5 6 2
# 6 8 9 1
#
#[:, :, 2, 1] =
# 4 3 7 5
# 8 3 3 1
# 8 2 4 7
#
#[:, :, 1, 2] =
# 4 3 9 6
# 7 4 9 2
# 6 7 2 4
#
#[:, :, 2, 2] =
# 9 2 1 7
# 8 2 1 3
# 6 2 4 9
M = TCU(X, 2, 3)
#8×6 Matrix{Int64}:
# 5 4 4 7 6 6
# 7 3 5 4 8 7
# 2 9 6 9 9 2
# 6 6 2 2 1 4
# 4 9 8 8 8 6
# 3 2 3 2 2 2
# 7 1 3 1 4 4
# 5 7 1 3 7 9
What I need
I would like to write the reverse operation of the above function. That is, I need the function InvTCU that satisfies
X == InvTCU( TCU(X, d, k), d, k )
If needed, InvTCU may take the original tensor size size(X) as an extra argument:
X == InvTCU( TCU(X, d, k), d, k, size(X) )
The reason why I need InvTCU
It is required in Equation (18) of the document to implement the algorithm named PTRC. In this situation, the size of the original tensor, size(X), is available.
EDIT
I added the description about tenmat.
I added the description that InvTCU can require the original tensor size.
Before giving the function, it is worth noting that the matricized tensor can be obtained with views instead of permuting the dimensions, which might be more efficient (depending on later processing). This can be done (I think) with the TransmuteDims or TensorCast packages (https://docs.juliahub.com/TransmuteDims/NIYrh/0.1.15/).
Here is an attempt at a permutedims approach:
function invTCU(M,d,k, presize)
    N = length(presize)
    a = d<=k ? k-d+1 : k-d+1+N
    X = reshape(M,Tuple(circshift(collect(presize),1-a)))
    permutedims(X,circshift(1:N,a-1))
end
with this definition:
julia> X = reshape(1:48,3,4,2,2)
3×4×2×2 reshape(::UnitRange{Int64}, 3, 4, 2, 2) with eltype Int64:
[:, :, 1, 1] =
1 4 7 10
...
julia> X == invTCU(TCU(X, 2, 3), 2, 3, size(X))
true
This seems to recover the original tensor.
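For readers without TensorToolbox installed, the round trip can be checked with Base functions only. This is a sketch under one assumption: that tenmat(Y, row=1:d) is equivalent to reshaping Y so that its first d dimensions form the rows, which is consistent with the 8×6 example in the question. The names shift, tcu_base and inv_tcu_base are made up for this illustration.

```julia
# Shift shared by both directions (same formula as in the question).
shift(d, k, N) = d <= k ? k - d + 1 : k - d + 1 + N

# Base-only stand-in for TCU.
# Assumption: tenmat(Y, row=1:d) == reshape(Y, prod(size(Y)[1:d]), :)
function tcu_base(X, d, k)
    N = ndims(X)
    a = shift(d, k, N)
    Y = permutedims(X, circshift(1:N, -a + 1))
    return reshape(Y, prod(size(Y)[1:d]), :)
end

# Inverse: undo the reshape using the circularly shifted sizes,
# then undo the permutation with the opposite circular shift.
function inv_tcu_base(M, d, k, presize)
    N = length(presize)
    a = shift(d, k, N)
    Y = reshape(M, Tuple(circshift(collect(presize), 1 - a)))
    return permutedims(Y, circshift(1:N, a - 1))
end

X = reshape(collect(1:48), 3, 4, 2, 2)
M = tcu_base(X, 2, 3)
println(size(M))                                  # (8, 6)
println(X == inv_tcu_base(M, 2, 3, size(X)))      # true
```

The inverse works because circshift(1:N, a-1) is the inverse permutation of circshift(1:N, -a+1), so the round trip does not depend on the particular values of d and k.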

How to convert an array of arrays into a matrix?

I can't find an answer to this simple question.
I have the following:
A(a,j)=[a*j*i*k for i in 1:2, k in 1:2];
B=[A(a,j) for a in 1:2, j in 1:2];
B is an array of arrays: 2×2 Array{Array{Int64,2},2}. This is useful to easily access the subarrays with indices (e.g., B[2,1]). However, I also need to convert B to a 4 by 4 matrix. I tried hcat(B...) but that yields a 2 by 8 matrix, and other options are worse (e.g., cat(Test2...;dims=(2,1))).
Is there an efficient way of writing B as a matrix while keeping the ability to easily access its subarrays, especially as B gets very large?
Do you want this:
julia> hvcat(size(B,1), B...)
4×4 Array{Int64,2}:
1 2 2 4
2 4 4 8
2 4 4 8
4 8 8 16
or without defining B:
julia> hvcat(2, (A(a,j) for a in 1:2, j in 1:2)...)
4×4 Array{Int64,2}:
1 2 2 4
2 4 4 8
2 4 4 8
4 8 8 16
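A caveat worth checking before applying this to other data: hvcat fills each block row from left to right, whereas B... splats B in column-major order. For the symmetric example above both orders give the same matrix, but in general the blocks would come out transposed. A minimal sketch, using a deliberately non-symmetric A (an assumption made only for illustration), that keeps the block layout of B:

```julia
# Non-symmetric blocks: block (a, j) is filled with the value 10a + j
A(a, j) = [10a + j for i in 1:2, k in 1:2]
B = [A(a, j) for a in 1:2, j in 1:2]

# permutedims(B) makes the column-major splat run along block rows;
# the first argument is the number of blocks per block row.
M = hvcat(size(B, 2), permutedims(B)...)

println(M[1:2, 3:4] == B[1, 2])   # true
println(M[3:4, 1:2] == B[2, 1])   # true
```

Without the permutedims call, the block in position (1, 2) of the result would be B[2,1] instead of B[1,2].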
What about
B = reduce(hcat, reduce(vcat, A(a,j) for a in 1:2) for j in 1:2)
EDIT: Actually this is very slow, I would recommend making a function, e.g.,
function buildB(A, n)
    A0 = A(1,1)
    nA = size(A0, 1)
    B = Array{eltype(A0),2}(undef, n * nA, n * nA)
    for a in 1:n, j in 1:n
        B[(a-1)*nA .+ (1:nA), (j-1)*nA .+ (1:nA)] .= A(a,j)
    end
    return B
end
or maybe consider a package like BlockArrays.jl?
EDIT 2 This is an example with BlockArrays.jl:
using BlockArrays
function blockarrays(A, n)
    A0 = A(1,1)
    nA = size(A0, 1)
    B = BlockArray{eltype(A0)}(undef_blocks, fill(nA,n), fill(nA,n))
    for a in 1:n, j in 1:n
        setblock!(B, A(a,j), a, j)
    end
    return B
end
which should do what you want:
julia> B = blockarrays(A, 2)
2×2-blocked 4×4 BlockArray{Int64,2}:
1 2 │ 2 4
2 4 │ 4 8
──────┼───────
2 4 │ 4 8
4 8 │ 8 16
julia> getblock(B, 1, 2)
2×2 Array{Int64,2}:
2 4
4 8
julia> B[4,2]
8

Converting Array of CartesianIndex to 2D-Matrix in Julia

Let's say we have an array of Cartesian indices in Julia:
julia> typeof(indx)
Array{CartesianIndex{2},1}
Now we want to plot them as a scatter plot using PyPlot, so we need to convert the indx array of Cartesian indices to a 2D matrix in order to plot it like this:
PyPlot.scatter(indx[:, 1], indx[:, 2])
How can I convert an Array of type Array{CartesianIndex{2},1} to a 2D matrix of type Array{Int,2}?
By the way, here is a code snippet that produces a dummy array of CartesianIndex:
A = rand(1:10, 5, 5)
indx = findall(a -> a .> 5, A)
typeof(indx) # this is an Array{CartesianIndex{2},1}
Thanks
An easy and generic way is
julia> as_ints(a::AbstractArray{CartesianIndex{L}}) where L = reshape(reinterpret(Int, a), (L, size(a)...))
as_ints (generic function with 1 method)
julia> as_ints(indx)
2×9 reshape(reinterpret(Int64, ::Array{CartesianIndex{2},1}), 2, 9) with eltype Int64:
1 3 4 1 2 4 1 1 4
2 2 2 3 3 3 4 5 5
This works for any dimensionality, making the first dimension the index into the CartesianIndex.
One possible way is hcat(getindex.(indx, 1), getindex.(indx,2))
julia> @btime hcat(getindex.($indx, 1), getindex.($indx,2))
167.372 ns (6 allocations: 656 bytes)
10×2 Array{Int64,2}:
4 1
3 2
4 2
1 3
4 3
5 3
2 4
5 4
1 5
4 5
However, note that you don't need to - and therefore probably shouldn't - bring your indices to 2D-Matrix form. You could simply do
PyPlot.scatter(getindex.(indx, 1), getindex.(indx, 2))
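The getindex broadcast can be checked on a small fixed matrix instead of random data. The matrix A below is made up for illustration; the predicate uses a plain > because findall applies it to one scalar at a time, so the dot in the question's a .> 5 is not needed.

```julia
A = [1 7;
     8 2;
     3 9]

# findall returns the matching positions in column-major order
indx = findall(a -> a > 5, A)

rows = getindex.(indx, 1)   # first component of every CartesianIndex
cols = getindex.(indx, 2)   # second component
M = hcat(rows, cols)        # n×2 matrix, one index pair per row

println(rows)   # [2, 1, 3]
println(cols)   # [1, 2, 2]
```

The two vectors rows and cols are already what a scatter call needs, so building M is only necessary if the matrix form is wanted elsewhere.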

Given an index of choices for each column, construct a 1D array from a 2D array

I have a 2D array such as:
julia> m = [1 2 3 4 5
6 7 8 9 10
11 12 13 14 15]
3×5 Array{Int64,2}:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
I want to pick one value from each column and construct a 1D array.
So for instance, if my choices are
julia> choices = [1, 2, 3, 2, 1]
5-element Array{Int64,1}:
1
2
3
2
1
Then the desired output is [1, 7, 13, 9, 5]. What's the best way to do that? In my particular application, I am randomly generating these values, e.g.
choices = rand(1:size(m)[1], size(m)[2])
Thank you!
This is probably the simplest approach:
[m[c, i] for (i, c) in enumerate(choices)]
EDIT:
If "best" means "fastest" for you, a function like this should be approximately 2x faster than the comprehension for large m:
function selector(m, choices)
    v = similar(m, size(m, 2))
    for i in eachindex(choices)
        @inbounds v[i] = m[choices[i], i]
    end
    v
end
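Another way to express the same selection, shown here only as an alternative sketch and not benchmarked, is to build one CartesianIndex per column by broadcasting and index m with the resulting vector:

```julia
m = [ 1  2  3  4  5;
      6  7  8  9 10;
     11 12 13 14 15]
choices = [1, 2, 3, 2, 1]

# The comprehension from the answer above
by_comprehension = [m[c, i] for (i, c) in enumerate(choices)]

# Pair each chosen row with its column number, then index once
by_cartesian = m[CartesianIndex.(choices, axes(m, 2))]

println(by_comprehension)   # [1, 7, 13, 9, 5]
println(by_cartesian)       # [1, 7, 13, 9, 5]
```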

Array range complement

Is there a way to overwrite [] to have complement of range in array?
julia> a=[1:8...]
8-element Array{Int64,1}:
1
2
3
4
5
6
7
8
julia> a[-1] == a[2:8]
julia> a[-(1:3)] == a[4:8]
julia> a[-end] == a[1:7]
I haven't looked into the internals of indexing before, but at a first glance, the following might work without breaking too much:
immutable Not{T}
    idx::T
end
if :to_indices in names(Base)
    # 0.6
    import Base: to_indices, uncolon, tail, _maybetail
    @inline to_indices(A, inds, I::Tuple{Not, Vararg{Any}}) =
        (setdiff(uncolon(inds, (:, tail(I)...)), I[1].idx), to_indices(A, _maybetail(inds), tail(I))...)
else
    # 0.5
    import Base: getindex, _getindex
    not_index(a::AbstractArray, I, i::Int) = I
    not_index(a::AbstractArray, I::Not, i::Int) = setdiff(indices(a, i), I.idx)
    getindex(a::AbstractArray, I::Not) = getindex(a, setdiff(linearindices(a), I.idx))
    _getindex(::Base.LinearIndexing, a::AbstractArray, I::Vararg{Union{Real, AbstractArray, Colon, Not}}) =
        Base._getindex(Base.linearindexing(a), a, (not_index(a, idx, i) for (i,idx) in enumerate(I))...)
end
For example:
julia> a = reshape(1:9, (3, 3))
3×3 Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}}:
1 4 7
2 5 8
3 6 9
julia> a[Not(2:8)]
2-element Array{Int64,1}:
1
9
julia> a[Not(1:2), :]
1×3 Array{Int64,2}:
3 6 9
julia> a[Not(end), end]
2-element Array{Int64,1}:
7
8
I didn't care for performance and also did no extensive testing, so things can certainly be improved.
Edit:
I replaced the code for 0.6 with Matt B.'s version from his GitHub comment linked in the comments.
Thanks to his great design of the array indexing implementation for 0.6, only a single function needs to be extended to get complement indexing for getindex, setindex and view, e.g.,
julia> view(a, Not(2:8))
2-element SubArray{Int64,1,UnitRange{Int64},Tuple{Array{Int64,1}},false}:
1
9
# collect because ranges are immutable
julia> b = collect(a); b[Not(2), Not(2)] = 10; b
3×3 Array{Int64,2}:
10 4 10
2 5 8
10 6 10
Directly overwriting [] (i.e. getindex) is prone to break many indexing-related things in Base, but we can write an array wrapper to work around it. We only need to define the following three methods to make your specific test cases pass:
immutable ComplementVector{T} <: AbstractArray{T,1}
    data::Vector{T}
end
Base.size(A::ComplementVector) = size(A.data)
Base.getindex(A::ComplementVector, i::Integer) = i > 0 ? A.data[i] : A.data[setdiff(1:end, (-i))]
Base.getindex(A::ComplementVector, I::StepRange) = all(x->x>0, I) ? A.data[I] : A.data[setdiff(1:end, -I)]
julia> a = ComplementVector([1:8...])
julia> a[-1] == a[2:8]
true
julia> a[-(1:3)] == a[4:8]
true
julia> a[-end] == a[1:7]
true
If you would like to extend ComplementVector further, please read the documentation about Interfaces.
Update:
For safety's sake, we'd better not subtype AbstractArray, as @Fengyang Wang suggested in the comment below:
immutable ComplementVector{T}
    data::Vector{T}
end
Base.endof(A::ComplementVector) = length(A.data)
Base.getindex(A::ComplementVector, i::Integer) = i > 0 ? A.data[i] : A.data[setdiff(1:end, (-i))]
Base.getindex(A::ComplementVector, I::OrdinalRange) = all(x->x>0, I) ? A.data[I] : A.data[setdiff(1:end, -I)]
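The answers above target Julia 0.5/0.6; on Julia 1.x the immutable keyword has become struct and Base.endof has become Base.lastindex, so the definitions would need adapting. If no new type is wanted at all, the three cases from the question can be written with setdiff inside the brackets. This is a plain-Base sketch, not a replacement for the Not wrapper:

```julia
a = collect(1:8)

# `end` is available anywhere inside the indexing brackets,
# so the complement can be computed in place with setdiff.
drop_first       = a[setdiff(1:end, 1)]
drop_first_three = a[setdiff(1:end, 1:3)]
drop_last        = a[setdiff(1:end, end)]

println(drop_first == a[2:8])         # true
println(drop_first_three == a[4:8])   # true
println(drop_last == a[1:7])          # true
```

Each setdiff call allocates an index vector, so this form favors readability over speed.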
