Julia: Flattening array of array/tuples

In Julia, vec reshapes multidimensional arrays into one-dimensional arrays.
However, it doesn't work for arrays of arrays or arrays of tuples.
Apart from using an array comprehension, is there another way to flatten arrays of arrays/tuples? Or arrays of arrays/tuples of arrays/tuples? Or ...

Iterators.flatten(x) creates a generator that iterates over the elements of each element of x, i.e. it removes one level of nesting. It can handle some of the cases you describe, e.g.
julia> collect(Iterators.flatten([(1,2,3),[4,5],6]))
6-element Array{Any,1}:
1
2
3
4
5
6
If you have arrays of arrays of arrays and tuples, you should probably reconsider your data structure, because it doesn't sound type stable. However, you can use multiple calls to flatten, e.g.
julia> collect(Iterators.flatten([(1,2,[3,3,3,3]),[4,5],6]))
6-element Array{Any,1}:
1
2
[3, 3, 3, 3]
4
5
6
julia> collect(Iterators.flatten(Iterators.flatten([(1,2,[3,3,3,3]),[4,5],6])))
9-element Array{Any,1}:
1
2
3
3
3
3
4
5
6
Note how all of my examples return an Array{Any,1}. That is a bad sign for performance, because it means the compiler could not determine a single concrete type for the elements of the output array. I chose these examples because, the way I read your question, it sounded like you may already have type-unstable containers.
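For contrast, here is a minimal sketch of the type-stable case. The variable names are only illustrative. When every inner container has the same concrete element type, collecting the flattened iterator should give a concretely typed vector rather than Array{Any,1}:

```julia
# All inner vectors hold Int64, so the outer type is Vector{Vector{Int64}}
nested = [[1, 2, 3], [4, 5], [6, 7]]

# Iterators.flatten removes one level of nesting lazily; collect materializes it
flat = collect(Iterators.flatten(nested))

println(typeof(flat))   # element type is concrete here, not Any
println(flat)
```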

In order to flatten an array of arrays, you can simply use vcat() like this:
julia> A = [[1,2,3],[4,5], [6,7]]
Vector{Int64}[3]
Int64[3]
Int64[2]
Int64[2]
julia> flat = vcat(A...)
Int64[7]
1
2
3
4
5
6
7
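If A holds many inner arrays, splatting them all into one vcat call can get slow. A common splat-free variant, sketched here on the same data, is reduce(vcat, A):

```julia
A = [[1, 2, 3], [4, 5], [6, 7]]

# reduce(vcat, A) concatenates the inner vectors without splatting A
flat = reduce(vcat, A)

println(flat)
```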

The simplest way is to apply the ellipsis ... twice.
A = [[1,2,3],[4,5], [6,7]]
flat = [(A...)...]
println(flat)
The output would be
[1, 2, 3, 4, 5, 6, 7].
Note that this relies on the syntax and concatenation behavior of older Julia versions. On Julia 1.x, (A...) without a trailing comma is a syntax error; [A...;] is the closest equivalent and gives the same result as vcat(A...).

If you use VectorOfArray from RecursiveArrayTools.jl, it uses the indexing fallback to provide convert(Array,A) for a VectorOfArray A.
julia> using RecursiveArrayTools
julia> A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
3-element Array{Array{Int64,1},1}:
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
julia> VA = VectorOfArray(A)
3-element Array{Array{Int64,1},1}:
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
First off, it acts as a lazy wrapper for doing the indexing without conversion:
julia> VA[1,3]
7
Note that columns are the separate arrays so that way it's still "column-major" (i.e. efficient to index down columns). But then it has a straight conversion:
julia> convert(Array,VA)
3×3 Array{Int64,2}:
1 4 7
2 5 8
3 6 9
The other way to handle this conversion is to do something like hcat(A...), but that's slow if you have a lot of arrays you're splatting!
Now, you may think: what about writing a function that pre-allocates the matrix, then loops through and fills it? That's almost how convert on the VectorOfArray works, except that the fallback convert uses here relies on Tim Holy's Cartesian machinery. At one point, I wrote that function:
function vecvec_to_mat(vecvec)
    # One row per inner vector; assumes all inner vectors have the same length
    mat = Matrix{eltype(eltype(vecvec))}(undef, length(vecvec), length(vecvec[1]))
    for i in 1:length(vecvec)
        mat[i, :] .= vecvec[i]
    end
    mat
end
but I have since gotten rid of it because the fallback was much faster. So, YMMV, but those are a few ways to solve your problem.
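As a splat-free version of the hcat idea, here is a small sketch using reduce(hcat, A). It assumes all inner vectors have the same length, and it puts each inner vector in a column, matching the convert(Array, VA) output shown earlier:

```julia
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Each inner vector becomes one column of the resulting matrix
M = reduce(hcat, A)

println(size(M))
println(M[1, 3])   # first element of the third inner vector
```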

For Julia 0.7.x:
For Arrays:
flat(arr::Array) = mapreduce(x -> isa(x, Array) ? flat(x) : x,
                             append!, arr, init=[])
For Tuples:
flat(arr::Tuple) = mapreduce(x -> isa(x, Tuple) ? flat(x) : x,
                             append!, arr, init=[])
Works for arbitrary depth.
See: https://rosettacode.org/wiki/Flatten_a_list#Julia
Code for Array/Tuple:
function flatten(arr)
    rst = Any[]
    # Recursively walk nested arrays/tuples and collect the leaves into rst
    grep(v) = for x in v
        if isa(x, Tuple) || isa(x, Array)
            grep(x)
        else
            push!(rst, x)
        end
    end
    grep(arr)
    rst
end
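The recursive idea above can also be sketched with an explicit output argument. The names flatten_all! and flatten_all are hypothetical, not from any library. Because the result is an Any vector, the last line shows one way to narrow the element type afterwards, assuming all leaves share a type:

```julia
# Hypothetical helper: push every leaf of arbitrarily nested arrays/tuples into out
function flatten_all!(out, x)
    if x isa Union{AbstractArray, Tuple}
        for y in x
            flatten_all!(out, y)
        end
    else
        push!(out, x)
    end
    return out
end

flatten_all(x) = flatten_all!(Any[], x)

nested = [(1, 2, [3, 3]), [4, (5, 6)], 7]
flat = flatten_all(nested)     # Vector{Any} holding the leaves in order
typed = identity.(flat)        # broadcasting narrows the element type

println(flat)
println(typeof(typed))
```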

Related

View on Julia array using sliding window

What is the most efficient way to create a view on an array using a sliding window, for example with window=2?
Let's say we have:
x = collect(1:1:6)
# 1 2 3 4 5 6
And I want to create a view like this:
# 1 2
# 2 3
# 3 4
# 4 5
# 5 6
So far I found only this option, but not sure if it's an optimal one:
y = Array{Float32, 2}(undef, nslides, window)
@inbounds for i in 1:window
    y[:, i] = @view x[i:end-(window-i)]
end

One solution with a package (well, with my package) is this:
julia> using Tullio
julia> x = 1:6; window = 2;
julia> @tullio y[r,c] := x[r+c-1] (c in 1:window)
5×2 Matrix{Int64}:
1 2
2 3
3 4
4 5
5 6

The one liner is:
view.(Ref(x), (:).(1:length(x)-1,2:length(x)))
Testing:
julia> x=collect(1:6);
julia> view.(Ref(x), (:).(1:length(x)-1,2:length(x)))
5-element Array{SubArray{Int64,1,Array{Int64,1},Tuple{UnitRange{Int64}},true},1}:
[1, 2]
[2, 3]
[3, 4]
[4, 5]
[5, 6]
Explanation:
creation of views is vectorized by the dot operator .
we do not want to vectorize on elements of x so use Ref(x) instead
(:) is just a shorter form for UnitRange and again we use the dot operator . to vectorize
I used 2 as the Window size but of course you can write view.(Ref(x), (:).(1:length(x)-(window-1),window:length(x)))
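The same result can be written as a comprehension, which some readers may find easier to follow than the broadcast form. This is only a sketch and the function name sliding_views is made up for illustration. The mutation at the end shows that the windows are views into x, not copies:

```julia
# Hypothetical helper: one view per window position
sliding_views(x, window) = [view(x, i:i+window-1) for i in 1:length(x)-window+1]

x = collect(1:6)
w = sliding_views(x, 2)

x[2] = 20          # changing x is visible through the views
println(w[1])
println(w[2])
```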
EDIT:
If you want rather a library function this would work for you:
julia> using ImageFiltering
julia> mapwindow(collect, x, 0:1,border=Inner())
5-element OffsetArray(::Array{Array{Int64,1},1}, 1:5) with eltype Array{Int64,1} with indices 1:5:
[1, 2]
[2, 3]
[3, 4]
[4, 5]
[5, 6]
Of course, you can pass whatever function you want to run on each sliding window instead of just collect.

How to convert a matrix to an array of arrays?

In How to convert an array of array into a matrix? we learned how to convert an array of arrays to a matrix. But what about the other way around? How do we go from input to output, as shown below?
input = [1 2 3; 4 5 6; 7 8 9]
output = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

If you want to make a copy of the data then:
[input[i, :] for i in 1:size(input, 1)]
If you do not want to make a copy of the data you can use views:
[view(input, i, :) for i in 1:size(input, 1)]
After some thought those are alternatives using broadcasting:
getindex.([input], 1:size(input, 1), :)
view.([input], 1:size(input, 1), :)
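On Julia 1.1 and later there is also eachrow, which iterates over the rows as views. A short sketch of both the no-copy and the copying variant (variable names are illustrative):

```julia
input = [1 2 3; 4 5 6; 7 8 9]

# Rows as views into input (no copy of the data)
row_views = collect(eachrow(input))

# Rows as independent vectors (copies)
row_copies = [collect(r) for r in eachrow(input)]

println(row_copies)
```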

I'll add one more alternative:
mapslices(x->[x], input,2)
Edit:
Warning! I now see that mapslices returns a 3x1 matrix here. (You can work around that with mapslices(x->[x], input,2)[:,1]. On Julia 1.x the dimension is passed as a keyword, dims=2.)
I am not satisfied with any of the solutions we have found so far. They are too complicated (think, for example, of how you would explain them to children!).
Functions like mapslices are also hard to find in the docs. BTW, there is a non-exported Base.vect function which could be used instead of the anonymous x->[x].
I was thinking that sometimes it is smarter to use a bigger hammer, so I tried to find something with DataFrames:
julia> using DataFrames
julia> DataFrame(transpose(input)).columns
3-element Array{Any,1}:
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
Unfortunately:
there is no DataFrame.rows
the result's type is Array{Any,1}
I don't think it will be very fast
I hope Julia will get us a better solution! :)

Julia's most efficient way to choose longest array in array of arrays?

I have an array of arrays A which is an N-element Array{Array{Int64,1},1} of integers. I am trying to find the largest array in A using Julia.
For example:
A = [[1, 2], [3, 4], [5, 6, 7], [1, 2, 5, 8]]
In Python I would simply do: max(A, key=len) but in Julia I don't know how to do it.
What I did is this:
L = []
for a in A
push!(L, length(a))
end
A[findmax(L)[2]]
Thanks!

@Colin has provided a compact, convenient answer. However, if speed matters (the OP asked for the most efficient way), this should be close to optimal:
function findlongest(A)
    idx = 0
    len = 0
    @inbounds for i in 1:length(A)
        l = length(A[i])
        l > len && (idx = i; len = l)
    end
    return A[idx]
end
Note that this implementation would (presumably) be a really bad idea in Python :)
Quick benchmark:
julia> using BenchmarkTools
julia> A = [[1,2], [1,2,3,4,5,6], [1,2,3]]
3-element Array{Array{Int64,1},1}:
[1, 2]
[1, 2, 3, 4, 5, 6]
[1, 2, 3]
julia> @btime findlongest(A);
26.880 ns (0 allocations: 0 bytes)
julia> @btime A[indmax(length.(A))];
9.813 μs (25 allocations: 1.14 KiB)
That's a ~365 times speedup for this example.
EDIT: Better benchmark (suggested in the comments)
julia> @btime findlongest($A);
9.813 ns (0 allocations: 0 bytes)
julia> @btime $A[indmax(length.($A))];
41.813 ns (1 allocation: 112 bytes)
The $ signs avoid setup allocations and times. Speedup ~4.
Quick explanation
for loops are fast in Julia, so why not use them
avoid allocation (length.(A) allocates a new array of integers)
a && b is shortcut for "if a then b"
@inbounds avoids bounds checks for A[i]

UPDATE: For v1+ you'll need to replace indmax in this answer with argmax.
EDIT: Note, it is also worth checking out the other answer by @crstnbr
Consider the following example code:
julia> A = [[1,2], [1,2,3,4,5,6], [1,2,3]]
3-element Array{Array{Int64,1},1}:
[1, 2]
[1, 2, 3, 4, 5, 6]
[1, 2, 3]
julia> length(A)
3
julia> length.(A)
3-element Array{Int64,1}:
2
6
3
julia> indmax(length.(A))
2
julia> A[indmax(length.(A))]
6-element Array{Int64,1}:
1
2
3
4
5
6
The first call to length gets the length of the outer vector in A, which is not what we want. In the second call, I use the broadcasting operator . so that I instead get the length of each of the inner vectors. In the indmax line, I'm finding the index of largest value in length.(A), ie the index of the longest inner vector. If you instead want to return the longest inner vector, you can just index into A using the result of the indmax line.

indmax is no longer defined in Julia (as of at least 1.3).
Use argmax instead.
julia> A = [[1,2], [1,2,3]]
2-element Array{Array{Int64,1},1}:
[1, 2]
[1, 2, 3]
julia> length.(A)
2-element Array{Int64,1}:
2
3
julia> argmax(length.(A))
2
julia> A[argmax(length.(A))]
3-element Array{Int64,1}:
1
2
3
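For a closer match to Python's max(A, key=len), Julia 1.7 and later accept a function as the first argument of argmax and findmax. A brief sketch on the array from the question:

```julia
A = [[1, 2], [3, 4], [5, 6, 7], [1, 2, 5, 8]]

# argmax(f, itr) returns the element of itr that maximizes f (Julia 1.7+)
longest = argmax(length, A)

# findmax(f, itr) returns the maximal value of f and the index where it occurs
len, idx = findmax(length, A)

println(longest)
println((len, idx))
```

Neither call materializes length.(A), so this avoids the intermediate array allocation while staying a one-liner.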

How to index a Julia array

I am having trouble understanding what seems like an inconsistent behavior in Julia.
X = reshape(1:100, 10, 10)
b = [1 5 9]
X[2, :][b] # returns the correct array
X[2, :][1 5 9] # throws an error
Can someone explain why using the variable b works to index an array but not when I write the index myself?

Since x = X[2,:] is just a vector, we can simplify the example to indexing behavior on vectors.
x[v], where v is an array of integers, returns the corresponding subset of x. So x[[1,5,9]] uses the getindex(x::Vector, i::AbstractArray) dispatch.
Note that x[[1 5 9]] works because [1 5 9] is a row vector (a 1×3 matrix), which is also an array of integers. That's valid syntax, but x[1 5 9] just isn't valid indexing syntax in Julia. Space-separated values in brackets after a name mean something else, a typed array literal:
v = Float64[1 5 9]
returns a row vector with element type Float64.
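A small sketch of the two valid forms, using the X from the question. It shows that the shape of the result follows the shape of the index array:

```julia
X = reshape(1:100, 10, 10)
x = X[2, :]            # second row as a plain vector

a = x[[1, 5, 9]]       # index with a vector: result is a vector
b = x[[1 5 9]]         # index with a 1×3 matrix: result is a 1×3 matrix

println(a)
println(size(b))
```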

I have figured out a solution.
Rather than write X[2, :][1 5 9], I should have written X[2, :][[1 5 9]].
I believe this makes sense when we imagine indexing on two dimensions the second time. This makes it possible to write more complicated indices, like X[2:4, :][[1 3],[1 3]]

How to append function to an array?

For example, I can create an array that contains a function.
julia> a(x) = x + 1
>> a (generic function with 1 method)
julia> [a]
>> 1-element Array{#a,1}:
a
But I can't seem to add the function to an empty array:
julia> append!([],a)
>> ERROR: MethodError: no method matching length(::#a)
Closest candidates are:
length(::SimpleVector) at essentials.jl:168
length(::Base.MethodList) at reflection.jl:256
length(::MethodTable) at reflection.jl:322
...
in _append!(::Array{Any,1}, ::Base.HasLength, ::Function) at .\collections.jl:25
in append!(::Array{Any,1}, ::Function) at .\collections.jl:21
What I ultimately want to do is store the pre-defined functions so that I can map them over a value, e.g.:
x = 0.0
for fn in vec
    x = x + fn(x)
end

append! is for appending one collection on to another.
You are looking for push!, to add an element to a collection.
Your code should be push!([], a)
See the docs:
julia>?append!
search: append!
append!(collection, collection2) -> collection.
Add the elements of collection2 to the end of collection.
julia> append!([1],[2,3])
3-element Array{Int64,1}:
1
2
3
julia> append!([1, 2, 3], [4, 5, 6])
6-element Array{Int64,1}:
1
2
3
4
5
6
Use push! to add individual items to collection which are not already themselves in another collection. The result of the preceding example is equivalent to push!([1, 2, 3], 4, 5, 6).
vs:
julia>?push!
search: push! pushdisplay
push!(collection, items...) -> collection
Insert one or more items at the end of collection.
julia> push!([1, 2, 3], 4, 5, 6)
6-element Array{Int64,1}:
1
2
3
4
5
6
Use append! to add all the elements of another collection to collection. The result of the preceding example is equivalent to append!([1, 2, 3], [4, 5, 6]).
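Putting this together for the original use case, here is a sketch that stores functions in a vector and applies them in turn. The names b, fns and apply_all are made up for illustration:

```julia
a(x) = x + 1
b(x) = 2x

fns = Function[]        # empty vector that can hold any function
push!(fns, a)           # push! adds a single item
append!(fns, [b])       # append! adds every element of another collection

# Hypothetical helper: the loop from the question, wrapped in a function
function apply_all(fns, x0)
    x = x0
    for fn in fns
        x = x + fn(x)
    end
    return x
end

println(apply_all(fns, 0.0))
```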
