Passing two-dimensional array to a function in julia - arrays

I have a three dimensional array defined as:
x=zeros(Float64,2,2,2)
I want to assign ones to x by passing x to a function, one layer at a time.
The function is:
function init(p,y)
y=ones(p,p)
end
and I will pass x as follows:
for k=1:2
init(2,x[2,2,k])
end
but when I do that, x is zeros, not ones. Why?
julia> x
2x2x2 Array{Float64,3}:
[:, :, 1] =
0.0 0.0
0.0 0.0
[:, :, 2] =
0.0 0.0
0.0 0.0
Any idea how to get Julia to assign ones to x?

One possible solution is to use slice, which makes a SubArray:
x = zeros(2, 2, 2) # Float64 by default
function init!(y)
y[:] = ones(y) # change contents not binding
end
for k in 1:2
init!(slice(x, :, :, k)) # use slice to get SubArray
end
Note that you can use ones(y) to get an array of ones of the same size as y (in Julia 1.0 and later, write ones(size(y)) instead).
A SubArray gives a view of an array, instead of a copy. In future versions of Julia, indexing an array may give this by default, but currently you must do it explicitly.
For a discussion about values vs. bindings, see
http://www.johnmyleswhite.com/notebook/2014/09/06/values-vs-bindings-the-map-is-not-the-territory/
EDIT: I hadn't seen @tholy's answer, which contains the same idea.
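The difference between rebinding a name and changing the contents of an array can be seen in a small sketch. The function names rebind and fill_ones! are made up for this illustration and are not part of the original answers; the syntax assumes Julia 1.x.

```julia
# Rebinding: inside the function, `y` is pointed at a brand-new array.
# The caller's array is never touched.
function rebind(y)
    y = ones(size(y))
    return y
end

# Mutation: the contents of the caller's array are overwritten in place.
function fill_ones!(y)
    y .= 1
    return y
end

x = zeros(2, 2)
rebind(x)
println(x == zeros(2, 2))   # true: x is unchanged
fill_ones!(x)
println(x == ones(2, 2))    # true: x was modified in place
```

This is why the original init(p, y) has no visible effect: the assignment y = ones(p, p) only changes what the local name y refers to.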

I'm also not sure I understand the question, but slice(x, :, :, k) will take a 2d slice of x.
If you're initializing x as an array of Float64 and then hoping to assign a matrix to each element (which is what it appears you're trying to do), that won't work---the type of x won't allow it. You could make x an array of Any, and then that would be permitted.

I'm not certain I understand, but if you're trying to modify x in place, you'll want to do things a little differently.
The code below should do what you need.
x = zeros(Float64, 2, 2, 2)
function init!(p, y, k)
y[:, :, k] = ones(Float64, p, p)
end
for k = 1:2
init!(2, x, k)
end
And you might also want to keep in mind that the standard convention in Julia is to include an exclamation mark in the name of a function that modifies its arguments. And if I've understood your question, then you want your init!() function to do exactly that.
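As a variation on the same idea, the layer can also be filled with a broadcast assignment, which avoids allocating a temporary matrix of ones. This is only a sketch assuming Julia 1.x; the name init_layer! is hypothetical.

```julia
# Fill layer k of a 3-d array with ones, in place.
# On the left-hand side of `.=`, the indexed expression refers to x itself,
# so no copy of the layer is made.
function init_layer!(x, k)
    x[:, :, k] .= 1
    return x
end

x = zeros(Float64, 2, 2, 2)
for k in axes(x, 3)
    init_layer!(x, k)
end
println(all(x .== 1))   # true
```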

A lot has changed in Julia, and I thought I would update this answer to reflect Julia 1.5 (most of the changes probably arrived with 1.0). You might expect the modern x[:, :, k] to work, since the result of slicing was once referred to as a SubArray, but slicing now makes a copy when it appears in an expression. Instead you must use view():
x= zeros(2, 2, 2)
function init!(y)
y[:]= ones(size(y))
end
init!(view(x, :, :, 1)) # get reference to original items
This gives you the desired result:
julia> x
2×2×2 Array{Float64,3}:
[:, :, 1] =
1.0 1.0
1.0 1.0
[:, :, 2] =
0.0 0.0
0.0 0.0
There are also helper macros for writing it in a more palatable form,
init!(@view x[:,:,1])
but you run the danger of greedy macro parsing if you have other arguments, such that
otherfunc!(@view x[:,:,1], 10)
would give you the error "Invalid use of @view macro: argument must be a reference expression". To get around this, there is the kludge @views, which turns every slicing operation in the expression it is applied to into a view, or you can wrap the argument in parentheses.
@views otherfunc!(x[:,:,1], 10)
otherfunc!(@view(x[:,:,1]), 10)
You can find more information on the manipulation of Arrays and Matrices in this presentation:
(Youtube) Arrays: slices and views
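The copy-versus-view behaviour described in this answer can be checked with a short sketch, assuming Julia 1.x. The helper scale! is a hypothetical two-argument function, introduced only to show both macro forms with an extra argument.

```julia
x = zeros(2, 2, 2)

c = x[:, :, 1]          # slicing on the right-hand side makes a copy
c .= 1                  # only the copy changes
println(sum(x))         # 0.0

v = view(x, :, :, 1)    # a SubArray that shares memory with x
v .= 1
println(sum(x))         # 4.0

# Hypothetical helper that mutates its first argument.
scale!(y, s) = (y .*= s; y)

scale!(@view(x[:, :, 1]), 10)    # parentheses delimit the macro argument
@views scale!(x[:, :, 2], 10)    # @views applied to the whole call
println(x[1, 1, 1])     # 10.0
```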

Related

Why does a subtype of AbstractArray result in imprecise matrix operations in Julia?

I'm currently working on creating a subtype of AbstractArray in Julia, which allows you to store a vector in addition to an Array itself. You can think of it as the column "names", with element types as a subtype of AbstractFloat. Hence, it has some similarities to the NamedArray.jl package, but restricts to only assigning the columns with Floats (in case of matrices).
The struct that I've created so far (following the guide to create a subtype of AbstractArray) is defined as follows:
struct FooArray{T, N, AT, VT} <: AbstractArray{T, N}
data::AT
vec::VT
function FooArray(data::AbstractArray{T1, N}, vec::AbstractVector{T2}) where {T1 <: AbstractFloat, T2 <: AbstractFloat, N}
length(vec) == size(data, 2) || error("Inconsistent dimensions")
new{T1, N, typeof(data), typeof(vec)}(data, vec)
end
end
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, i::Int) = getindex(fooarr.data, i)
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, I::Vararg{Int, 2}) = getindex(fooarr.data, I...)
@inline Base.@propagate_inbounds Base.size(fooarr::FooArray) = size(fooarr.data)
Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()
This already seems to be enough to create objects of type FooArray and do some simple math with them. However, I've observed that some essential operations such as matrix-vector multiplication seem to be imprecise. For example, the following should consistently return a vector of 0.0, but:
R = rand(100, 3)
S = FooArray(R, collect(1.0:3.0))
y = rand(100)
S'y - R'y
3-element Vector{Float64}:
-7.105427357601002e-15
0.0
3.552713678800501e-15
While the differences are very small, they can quickly add up over many different calculations, leading to significant errors.
Where do these differences come from?
A look at the calculations via the macro @code_llvm reveals that apparently different matmul functions from LinearAlgebra are used (with other minor differences):
@code_llvm S'y
...
# C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:111 within `*'
...
@code_llvm R'y
...
# C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:106 within `*'
...
Redefining the adjoint and * functions on our FooArray object provides the expected, correct result:
import Base: *, adjoint, /
Base.adjoint(a::FooArray) = FooArray(a.data', zeros(size(a.data, 1)))
*(a::FooArray{T, 2, AT, VT} where {AT, VT}, b::AbstractVector{S}) where {T, S} = a.data * b
S'y - R'y
3-element Vector{Float64}:
0.0
0.0
0.0
However, this solution (which is also done in NamedArrays here) would require defining and maintaining all sorts of functions, not just the standard functions in base, adding more and more dependencies just because of this small error margin.
Is there any simpler way to get rid of this issue without redefining every operation and possibly many other functions from other packages?
I'm using Julia version 1.6.1 on Windows 64-bit system.
Yes, the implementation of matrix multiplication will vary depending upon your array type. The builtin Array will use BLAS, whereas your custom FooArray will use a generic implementation. Because floating point arithmetic is not associative, these different approaches will indeed yield different values. Note that both may differ from the ground truth, even for the builtin Arrays!
julia> using Random; Random.seed!(0); R = rand(100, 3); y = rand(100);
julia> R'y - Float64.(big.(R)'big.(y))
3-element Vector{Float64}:
-3.552713678800501e-15
0.0
0.0
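The non-associativity mentioned above does not need matrices to show up. A minimal sketch with three Float64 values (the variable names are only illustrative):

```julia
a, b, c = 0.1, 0.2, 0.3

left  = (a + b) + c     # sum grouped from the left
right = a + (b + c)     # same numbers, grouped from the right

println(left == right)          # false
println(abs(left - right))      # a difference on the order of 1e-16
```

A BLAS kernel and a generic loop accumulate the same products in different orders, so differences of this size in the last bits are expected.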
You may be able to implement your custom array as a DenseArray, which will ensure that it uses the same (BLAS-enabled) codepath. You just need to implement a few more methods, most importantly strides and unsafe_convert:
julia> struct FooArray{T, N} <: DenseArray{T, N}
data::Array{T, N}
end
Base.getindex(fooarr::FooArray, i::Int) = fooarr.data[i]
Base.size(fooarr::FooArray) = size(fooarr.data)
Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()
Base.strides(fooarr::FooArray) = strides(fooarr.data)
Base.unsafe_convert(P::Type{Ptr{T}}, fooarr::FooArray{T}) where {T} = Base.unsafe_convert(P, fooarr.data)
julia> R = rand(100, 3); S = FooArray(R); y = rand(100)
R'y - S'y
3-element Vector{Float64}:
0.0
0.0
0.0
julia> R = rand(100, 1000); S = FooArray(R); y = rand(100)
R'y == S'y
true

Two different types of arrays of arrays

I'm confused about the different types of arrays of arrays. Consider these two examples
a = Array{Float64}[]
push!(a,[1, 2])
push!(a,[3, 4])
push!(a,[1 2; 3 4])
b = Array[[1.0, 2.0], [3.0,4.0], [1.0 2.0; 3.0 4.0]]
I'm not sure how a and b differ. Suppose I intend to run a for loop over each of the elements in a or b, and multiply each element by 2. That is,
for i in 1:3 a[i] = a[i]*2 end
for i in 1:3 b[i] = b[i]*2 end
I timed both loops, and they are equally fast. Are a and b the same? If so, why does a even exist? It looks fairly complicated, because typeof(a) yields "Array{Array{Float64,N} where N,1}". What does where do here?
Both a and b are vectors, but they allow different types of elements. You can check this by writing:
julia> typeof(a)
Array{Array{Float64,N} where N,1}
julia> typeof(b)
Array{Array,1}
Now Array is a parametric type taking two parameters. The first parameter is the type of elements that it allows. The second parameter is the number of dimensions. You can see that in both cases the second parameter is 1, which means both a and b are vectors. You can also check this using the ndims function:
julia> ndims(a)
1
julia> ndims(b)
1
The first parameter is the allowed element type. In the case of a it is Array{Float64,N} where N, while in the case of b just Array is printed. Before I explain how to read them, note that the first parameter can be extracted using the eltype function:
julia> eltype(a)
Array{Float64,N} where N
julia> eltype(b)
Array
You can see that both a and b allow an Array to be stored. First let me explain how to read Array{Float64, N} where N. It means that a allows storing arrays of Float64 of any dimension. Actually, you could have written it in the shorter form Array{Float64}, as you can check:
julia> (Array{Float64,N} where N) === Array{Float64}
true
The reason is that if you do not put a restriction on a tail parameter it can be dropped in syntax. The where N part is a restriction on the parameter. In this case there is no restriction on the second parameter.
Now we can turn to b. You see its eltype is just Array, so both parameters are dropped, thus there are no restrictions on them as explained above. So Array is the same as Array{T, N} where {T,N} as you can see here:
julia> (Array{T, N} where {T,N}) === Array
true
So the difference is that a can store arrays of any dimension but they must have Float64 element type, while b can store arrays of any dimension and any element type. The distinction, in this case, has no performance impact, as you have noted, but it does affect what can be stored in a and b. Here are some examples.
In this case they behave the same: you try to store an Int in them, but both allow only arrays:
julia> a[1] = 1
ERROR: MethodError: Cannot `convert` an object of type
julia> b[1] = 1
ERROR: MethodError: Cannot `convert` an object of type
but here they differ:
julia> a[1] = ["a"]
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64
julia> b[1] = ["a"]
1-element Array{String,1}:
"a"
julia> a
3-element Array{Array{Float64,N} where N,1}:
[1.0, 2.0]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
julia> b
3-element Array{Array,1}:
["a"]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
As you can see you can store an array of String in b, but this is not allowed to be stored in a.
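The relationship between the two element types can also be checked directly with the subtype operator. This is a small sketch assuming Julia 1.x; it only uses the types discussed above.

```julia
# Array{Float64} is shorthand for a type with a free dimension parameter.
println(Array{Float64} === (Array{Float64,N} where N))   # true

# Any Float64 array, of any dimension, fits the element type of `a`.
println(Vector{Float64} <: Array{Float64})   # true
println(Matrix{Float64} <: Array{Float64})   # true

# A String array does not fit Array{Float64}, but it does fit plain Array.
println(Vector{String} <: Array{Float64})    # false
println(Vector{String} <: Array)             # true

a = Array{Float64}[]
push!(a, [1, 2])          # the Int vector is converted to Vector{Float64}
println(typeof(a[1]))     # Vector{Float64}

b = Array[]
push!(b, ["a"])           # stored as is, no conversion needed
println(eltype(b[1]))     # String
```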
Two additional comments (both topics are a bit complex, so I leave out the details and just give you hints about what is going on):
You do not have to specify the element type of an array when defining it. In this case Julia picks the element type automatically:
julia> [[1.0, 2.0], [3.0,4.0], [1.0 2.0; 3.0 4.0]]
3-element Array{Array{Float64,N} where N,1}:
[1.0, 2.0]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
The performance impact of the choice of element type of an array depends on:
whether the type is abstract (which can be checked with isabstracttype; this impacts type inference by the compiler),
whether it is a bits type (which can be checked with isbitstype; this impacts the storage layout),
whether the element type is a Union (small unions are handled more efficiently).
In your case both element types are abstract and non-bits so the performance will be the same.
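These properties can be queried directly. The sketch below assumes Julia 1.x; isconcretetype is not mentioned in the answer above and is added here as an assumption about what is most useful to check, since a type with a free parameter such as Array{Float64} is not concrete.

```julia
# Abstract types cannot be instantiated.
println(isabstracttype(AbstractArray))     # true

# A type with an unspecified dimension parameter is not concrete,
# so the compiler cannot know the exact type of each stored element.
println(isconcretetype(Array{Float64}))    # false
println(isconcretetype(Vector{Float64}))   # true

# Bits types are immutable and contain no references; they can be stored inline.
println(isbitstype(Float64))               # true
println(isbitstype(Vector{Float64}))       # false
```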

Julia: rationale behind array size and index for "extra" dimensions?

I am using Julia from time to time, however I am surprised by the following behavior:
Let's define a 3x4 array
julia> m=rand(3,4)
3×4 Array{Float64,2}:
0.889018 0.500847 0.539856 0.828231
0.492425 0.582958 0.521406 0.754102
0.28227 0.834333 0.669967 0.0939701
Now I check that
julia> size(m,1), size(m,2)
(3, 4)
as expected.
However, I am surprised by this:
julia> size(m,3), size(m,2018)
(1, 1)
-> I would have expected (0,0) or an error message
Looking at the Julia source code confirms this behavior:
size(t::AbstractArray{T,N}, d) where {T,N} = d <= N ? size(t)[d] : 1
Moreover:
julia> m[2,1,1,1,1]
0.4924252391289974
-> I would have expected an out of bounds error
So my question is: "what is the rationale?"
(I do not think it is a bug. I use Julia version 0.6.2.)
I believe it's for broadcasting.
julia> m=rand(3,4)
3×4 Array{Float64,2}:
0.139323 0.663912 0.994985 0.517332
0.423913 0.121753 0.0327054 0.0754665
0.392672 0.47006 0.351121 0.787318
julia> size(m)
(3, 4)
julia> n = rand(3)
3-element Array{Float64,1}:
0.716752
0.98755
0.661226
julia> m .* n
3×4 Array{Float64,2}:
0.09986 0.475861 0.713157 0.370799
0.418636 0.120237 0.0322983 0.074527
0.259645 0.310816 0.23217 0.520595
Notice that n has one dimension fewer, so its size is 1 in the 2nd dimension, and it is therefore applied to every column. Scalars in broadcast are treated differently: they are generally inlined into the fused broadcasting function, which you cannot do with a mutable type. So the rule that a dimension of size 1 expands along higher dimensions is a nice way to implement this for broadcast.
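The trailing-dimension behaviour and its connection to broadcasting can be verified with a short sketch, assuming Julia 1.x (the question used 0.6.2, but the behaviour shown here is the same). The values are chosen only for illustration.

```julia
m = reshape(collect(1.0:12.0), 3, 4)   # 3×4 matrix, filled column by column
n = [10.0, 20.0, 30.0]                 # 3-element vector

# Every dimension beyond ndims(m) has size 1 ...
println(size(m, 3))                    # 1
# ... so trailing indices equal to 1 are accepted.
println(m[2, 1, 1, 1] == m[2, 1])      # true

# Broadcasting treats n as if it were a 3×1 array and expands it along columns.
println(m .* n == m .* reshape(n, 3, 1))   # true
println(size(m .* n))                      # (3, 4)

# A trailing index other than 1 is still out of bounds.
threw = try
    m[2, 1, 2]
    false
catch err
    err isa BoundsError
end
println(threw)                         # true
```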

Julia Approach to python equivalent list of lists

I just started tinkering with Julia and I'm really getting to like it. However, I am running into a road block. For example, in Python (although not very efficient or pythonic), I would create an empty list and append a list of a known size and type, and then convert to a NumPy array:
Python Snippet
a = []
for ....
a.append([1.,2.,3.,4.])
b = numpy.array(a)
I want to be able to do something similar in Julia, but I can't seem to figure it out. This is what I have so far:
Julia snippet
a = Array{Float64}[]
for .....
push!(a,[1.,2.,3.,4.])
end
The result is an n-element Array{Array{Float64,N},1} of size (n,), but I would like it to be an nx4 Array{Float64,2}.
Any suggestions or better way of doing this?
The literal translation of your code would be
# Building up as rows
a = [1. 2. 3. 4.]
for i in 1:3
a = vcat(a, [1. 2. 3. 4.])
end
# Building up as columns
b = [1.,2.,3.,4.]
for i in 1:3
b = hcat(b, [1.,2.,3.,4.])
end
But this isn't a natural pattern in Julia. You'd do something like
A = zeros(4,4)
for i in 1:4, j in 1:4
A[i,j] = j
end
or even
A = Float64[j for i in 1:4, j in 1:4]
Basically allocating all the memory at once.
Does this do what you want?
julia> a = Array{Float64}[]
0-element Array{Array{Float64,N},1}
julia> for i=1:3
push!(a,[1.,2.,3.,4.])
end
julia> a
3-element Array{Array{Float64,N},1}:
[1.0,2.0,3.0,4.0]
[1.0,2.0,3.0,4.0]
[1.0,2.0,3.0,4.0]
julia> b = hcat(a...)'
3x4 Array{Float64,2}:
1.0 2.0 3.0 4.0
1.0 2.0 3.0 4.0
1.0 2.0 3.0 4.0
It seems to match the python output:
In [9]: a = []
In [10]: for i in range(3):
a.append([1, 2, 3, 4])
....:
In [11]: b = numpy.array(a); b
Out[11]:
array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
I should add that this is probably not what you actually want to be doing as the hcat(a...)' can be expensive if a has many elements. Is there a reason not to use a 2d array from the beginning? Perhaps more context to the question (i.e. the code you are actually trying to write) would help.
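If the rows really do have to be collected one at a time, a possible alternative to splatting is reduce(hcat, ...), which avoids passing every element as a separate function argument. This is a sketch assuming Julia 1.x; the variable names and the row contents are made up.

```julia
rows = Vector{Float64}[]               # a vector whose elements are Float64 vectors
for i in 1:3
    push!(rows, [1.0, 2.0, 3.0, 4.0] .* i)
end

cols = reduce(hcat, rows)              # 4×3 matrix, one pushed vector per column
b = permutedims(cols)                  # 3×4 matrix, one pushed vector per row

println(size(b))                       # (3, 4)
println(b[2, :])                       # [2.0, 4.0, 6.0, 8.0]
```

permutedims returns a plain Matrix{Float64}, whereas the postfix ' creates a lazy adjoint wrapper in current Julia versions.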
The other answers don't work if the number of loop iterations isn't known in advance, or assume that the underlying arrays being merged are one-dimensional. It seems Julia lacks a built-in function for "take this list of N-D arrays and return me a new (N+1)-D array".
Julia requires a different concatenation solution depending on the dimension of the underlying data. So, for example, if the underlying elements of a are vectors, one can use hcat(a...) or cat(a..., dims=2). But if the elements of a are, e.g., 2D arrays, one must use cat(a..., dims=3), etc. The dims argument to cat is not optional, and there is no default value to indicate "the last dimension".
Here is a helper function that mimics the np.array functionality for this use case. (I called it collapse instead of array, because it doesn't behave quite the same way as np.array)
function collapse(x)
return cat(x...,dims=length(size(x[1]))+1)
end
One would use this as
a = []
for ...
... compute new_a...
push!(a,new_a)
end
a = collapse(a)
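A concrete run of the same helper, written with ndims instead of length(size(...)), might look as follows. This is a sketch assuming Julia 1.x; the input data is made up for the example.

```julia
# Stack a list of N-d arrays along a new (N+1)-th dimension.
collapse(x) = cat(x...; dims = ndims(first(x)) + 1)

# A list of four 2×3 matrices becomes one 2×3×4 array.
mats = [fill(Float64(k), 2, 3) for k in 1:4]
A = collapse(mats)
println(size(A))                # (2, 3, 4)

# A list of vectors becomes a matrix with one vector per column.
vecs = [[1, 2], [3, 4]]
B = collapse(vecs)
println(B == [1 3; 2 4])        # true
```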

Fortran-like arrays such as FArray(Float64, -1:1,-7:7,-128:512) in Julia

Generally, having 1-based arrays in Julia is a good decision, but sometimes it is desirable to have a Fortran-like array whose indices span some subranges of ℤ:
julia> x = FArray(Float64, -1:1,-7:7,-128:512)
where it would be useful:
In the codes accompanying the book Numerical Solution of Hyperbolic Partial Differential Equations by Prof. John A. Trangenstein, these negative indices are used intensively for ghost cells for boundary conditions.
The same is true for Clawpack (which stands for "Conservation Laws Package") by Prof. Randall J. LeVeque http://depts.washington.edu/clawpack/ and there are many other codes where such indices would be natural.
So such an auxiliary type would be useful for speedy translation of such codes.
I just started to implement such an auxiliary type but as I'm quite new to Julia your help would be greatly appreciated.
I started with:
type FArray
ranges
array::Array
function FArray(T, r::Range1{Int}...)
dims = map((x) -> length(x), r)
array = Array(T, dims)
new(r, array)
end
end
Output:
julia> include ("FortranArray.jl")
julia> x = FArray(Float64, -1:1,-7:7,-128:512)
FArray((-1:1,-7:7,-128:512),3x15x641 Array{Float64,3}:
[:, :, 1] =
6.90321e-310 2.6821e-316 1.96042e-316 0.0 0.0 0.0 9.84474e-317 … 1.83233e-316 2.63285e-316 0.0 9.61618e-317 0.0
6.90321e-310 6.32404e-322 2.63285e-316 0.0 0.0 0.0 2.63292e-316 2.67975e-316
...
[:, :, 2] =
...
As I'm completely new to Julia, any recommendations, especially ones that lead to more efficient code, would be greatly appreciated.
[Edit]
The topic has been discussed here:
https://groups.google.com/forum/#!topic/julia-dev/NOF6MA6tb9Y
During the discussion two ways to have Julia arrays with arbitrary base were elaborated:
SubArray-based, sample usage is with a helper function:
function farray(T, r::Range1{Int64}...)
dims = map((x) -> length(x), r)
array = Array(T, dims)
sub_indices = map((x) -> -minimum(x) + 2 : maximum(x), r)
sub(array, sub_indices)
end
julia> y = farray(Float64, -1:1,-7:7,-128:512);
julia> y[-1,-7,-128] = 777
777
julia> y[-1,-7,-128] + 33
810.0
julia> y[-2,-7,-128]
ERROR: BoundsError()
in getindex at subarray.jl:200
julia> y[2,-7,-128]
2.3977385e-316
Please note that bounds are not fully checked. More details are here:
https://github.com/JuliaLang/julia/issues/4044
At the moment SubArray has performance issues, but its performance might eventually be improved. See also:
https://github.com/JuliaLang/julia/issues/5117
https://github.com/JuliaLang/julia/issues/3496
Another approach, which performs better at the moment and also checks both bounds:
type FArray{T<:Number, N, A<:AbstractArray} <: AbstractArray
ranges
offsets::NTuple{N,Int}
array::A
function FArray(r::Range1{Int}...)
dims = map((x) -> length(x), r)
array = Array(T, dims)
offs = map((x) -> 1 - minimum(x), r)
new(r, offs, array)
end
end
FArray(T, r::Range1{Int}...) = FArray{T, length(r,), Array{T, length(r,)}}(r...)
getindex{T<:Number}(FA::FArray{T}, i1::Int) = FA.array[i1+FA.offsets[1]]
getindex{T<:Number}(FA::FArray{T}, i1::Int, i2::Int) = FA.array[i1+FA.offsets[1], i2+FA.offsets[2]]
getindex{T<:Number}(FA::FArray{T}, i1::Int, i2::Int, i3::Int) = FA.array[i1+FA.offsets[1], i2+FA.offsets[2], i3+FA.offsets[3]]
setindex!{T}(FA::FArray{T}, x, i1::Int) = arrayset(FA.array, convert(T,x), i1+FA.offsets[1])
setindex!{T}(FA::FArray{T}, x, i1::Int, i2::Int) = arrayset(FA.array, convert(T,x), i1+FA.offsets[1], i2+FA.offsets[2])
setindex!{T}(FA::FArray{T}, x, i1::Int, i2::Int, i3::Int) = arrayset(FA.array, convert(T,x), i1+FA.offsets[1], i2+FA.offsets[2], i3+FA.offsets[3])
getindex and setindex! methods for FArray were inspired by base/array.jl code.
Use cases:
julia> y = FArray(Float64, -1:1,-7:7,-128:512);
julia> y[-1,-7,-128] = 777
777
julia> y[-1,-7,-128] + 33
810.0
julia> y[-1,2,3]
0.0
julia> y[-2,-7,-128]
ERROR: BoundsError()
in getindex at FortranArray.jl:27
julia> y[2,-7,-128]
ERROR: BoundsError()
in getindex at FortranArray.jl:27
There are now two packages that provide this kind of functionality. For arrays with arbitrary start indices, see https://github.com/alsam/OffsetArrays.jl. For even more flexibility see https://github.com/mbauman/AxisArrays.jl, where indices do not have to be integers.
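The code in the question targets a very old Julia version (type, Range1 and Array(T, dims) no longer exist). For readers on Julia 1.x, the offset idea can be sketched roughly as below. This is only an illustration under stated assumptions: the struct layout, the helper name to_parent and the choice to zero-initialize the storage are not from the original post, and the type deliberately does not subtype AbstractArray to keep the example short. For real work the packages mentioned above are the better choice.

```julia
# Minimal offset-indexed wrapper around a regular Array (hypothetical sketch).
struct FArray{T,N}
    ranges::NTuple{N,UnitRange{Int}}   # allowed index range per dimension
    data::Array{T,N}                   # ordinary 1-based storage
end

# Outer constructor mirroring the call syntax used in the question.
# Assumption: storage is zero-initialized instead of left uninitialized.
function FArray(::Type{T}, ranges::UnitRange{Int}...) where {T}
    return FArray(ranges, zeros(T, map(length, ranges)))
end

# Check both bounds, then translate offset indices to 1-based parent indices.
function to_parent(A::FArray{T,N}, I::Vararg{Int,N}) where {T,N}
    all(map(in, I, A.ranges)) || throw(BoundsError(A, I))
    return map((i, r) -> i - first(r) + 1, I, A.ranges)
end

Base.getindex(A::FArray{T,N}, I::Vararg{Int,N}) where {T,N} =
    A.data[to_parent(A, I...)...]

Base.setindex!(A::FArray{T,N}, v, I::Vararg{Int,N}) where {T,N} =
    (A.data[to_parent(A, I...)...] = v)

y = FArray(Float64, -1:1, -7:7, -128:512)
y[-1, -7, -128] = 777
println(y[-1, -7, -128] + 33)   # 810.0
println(y[-1, 2, 3])            # 0.0
```

Indexing outside either end of a range, such as y[-2, -7, -128] or y[2, -7, -128], throws a BoundsError in this sketch.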
