Initialize array of arrays for construction on the fly

How do you declare an array that contains arrays in Julia?
I have a=Int32[] which creates an empty array of Int32 (of course), but I would like later to construct on the fly something like
if ...
push!(a, [r,s]) # (*)
...
where r and s are integers. I tried a=Int32[Int32[]] but it does not work when doing (*). I don't have the specific shape of a, so I need to declare it without this restriction.

Int32[] creates a Vector{Int32} which is a Vector with element type Int32. You want a Vector with element type Vector{Int32}, so you can use Vector{Vector{Int32}}() or Vector{Int32}[]. Note that Vector{T} is an alias for Array{T,1}, aka an Array with element type T and 1 dimension, so when Julia prints out the type, it won't use the word Vector.
julia> v=Vector{Vector{Int32}}()
0-element Array{Array{Int32,1},1}
julia> push!(v,[1,2,3])
1-element Array{Array{Int32,1},1}:
Int32[1, 2, 3]
or
julia> x=Vector{Int32}[]
0-element Array{Array{Int32,1},1}
julia> push!(x,[4,5,6])
1-element Array{Array{Int32,1},1}:
Int32[4, 5, 6]
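A minimal sketch of the "construct on the fly" pattern from the question. The input pairs and the condition are made up for illustration and are not from the original post:

```julia
# Outer vector whose elements are themselves Vector{Int32}
a = Vector{Int32}[]

# Hypothetical input pairs and condition, for illustration only
pairs_rs = [(1, 2), (3, 4), (5, 6)]
for (r, s) in pairs_rs
    if r < 5
        # [r, s] is a Vector{Int64}; push! converts it to Vector{Int32}
        push!(a, [r, s])
    end
end
```

The inner vectors may have different lengths, so no shape has to be fixed in advance.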

Related

Two different types of arrays of arrays

I'm confused about the different types of arrays of arrays. Consider these two examples
a = Array{Float64}[]
push!(a,[1, 2])
push!(a,[3, 4])
push!(a,[1 2; 3 4])
b = Array[[1.0, 2.0], [3.0,4.0], [1.0 2.0; 3.0 4.0]]
I'm not sure how a and b differ. Suppose I intend to run a for loop over each of the elements in a or b, and multiply each element by 2. That is,
for i in 1:3 a[i] = a[i]*2 end
for i in 1:3 b[i] = b[i]*2 end
I timed both loops, and they are equally fast. Are a and b the same? If so, why does a even exist? It looks fairly complicated, because typeof(a) yields "Array{Array{Float64,N} where N,1}". What does where do here?
Both a and b are vectors, but they allow different types of elements. You can check this by writing:
julia> typeof(a)
Array{Array{Float64,N} where N,1}
julia> typeof(b)
Array{Array,1}
Now Array is a parametric type taking two parameters. The first parameter is the type of elements that it allows. The second parameter is the number of dimensions. You can see that in both cases the number of dimensions is 1, which means both a and b are vectors. You can also check it using the ndims function:
julia> ndims(a)
1
julia> ndims(b)
1
The first parameter is the allowed element type. In the case of a it is Array{Float64,N} where N, while in the case of b just Array is printed. Before I explain how to read them, note that the first parameter can be extracted using the eltype function:
julia> eltype(a)
Array{Float64,N} where N
julia> eltype(b)
Array
You can see that both a and b allow Array to be stored. First let me explain how to read Array{Float64, N} where N. It means that a allows storing arrays of Float64 of any dimension. You could have written it in a shorter way as Array{Float64}, as you can check:
julia> (Array{Float64,N} where N) === Array{Float64}
true
The reason is that if you do not put a restriction on a tail parameter it can be dropped in syntax. The where N part is a restriction on the parameter. In this case there is no restriction on the second parameter.
Now we can turn to b. You see its eltype is just Array, so both parameters are dropped, thus there are no restrictions on them as explained above. So Array is the same as Array{T, N} where {T,N} as you can see here:
julia> (Array{T, N} where {T,N}) === Array
true
So the difference is that a can store arrays of any dimension but they must have Float64 element type, while b can store arrays of any dimension and any element type. The distinction, in this case, has no performance impact as you have noted, but will have an impact on what can be stored in a and b. Here are some examples.
In this case they work the same, as you try to store an Int in them, but they allow only arrays:
julia> a[1] = 1
ERROR: MethodError: Cannot `convert` an object of type
julia> b[1] = 1
ERROR: MethodError: Cannot `convert` an object of type
but here they differ:
julia> a[1] = ["a"]
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64
julia> b[1] = ["a"]
1-element Array{String,1}:
"a"
julia> a
3-element Array{Array{Float64,N} where N,1}:
[1.0, 2.0]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
julia> b
3-element Array{Array,1}:
["a"]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
As you can see you can store an array of String in b, but this is not allowed to be stored in a.
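The difference can also be checked in a script rather than at the REPL. This is a small sketch under the same definitions of a and b as in the question; the variable a_rejected is a name introduced here for illustration:

```julia
a = Array{Float64}[]
push!(a, [1, 2])          # converted to a Vector{Float64}
push!(a, [1 2; 3 4])      # converted to a Matrix{Float64}

b = Array[[1.0, 2.0], [1.0 2.0; 3.0 4.0]]

# b accepts an array of any element type
b[1] = ["a"]

# a rejects it, because a String cannot be converted to Float64
a_rejected = try
    a[1] = ["a"]
    false
catch err
    err isa MethodError
end
```

After running this, a_rejected should be true and a[1] should still hold the original Float64 vector.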
Two additional comments (both topics are a bit complex so I leave out the details, but just give you hints what is going on):
You are allowed not to specify element type of an array when defining it. In this case Julia will automatically pick its element type:
julia> [[1.0, 2.0], [3.0,4.0], [1.0 2.0; 3.0 4.0]]
3-element Array{Array{Float64,N} where N,1}:
[1.0, 2.0]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
The performance impact of the choice of element type of an array depends on:
whether the type is abstract or otherwise non-concrete (this can be checked with isabstracttype and isconcretetype; it impacts type inference made by the compiler),
whether it is a bits type (this can be checked with isbitstype; it impacts the storage layout),
whether the element type is a Union (small unions are handled more efficiently).
In your case both element types are non-concrete and non-bits, so the performance will be the same.
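The checks mentioned above can be tried directly. A small sketch: the list of types is chosen here for illustration. Note that isabstracttype only reports types declared with the abstract type syntax, so for Array{Float64} and Array the informative check is isconcretetype:

```julia
# Print the three predicates for a few candidate element types
for T in (Float64, Vector{Float64}, Array{Float64}, Array, Any)
    println(rpad(string(T), 20),
            " concrete=", isconcretetype(T),
            " bits=", isbitstype(T),
            " abstract=", isabstracttype(T))
end
```

Float64 is concrete and a bits type, Vector{Float64} is concrete but not a bits type, and Array{Float64} and Array are neither concrete nor bits types.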

Julia: Initializing numeric arrays of different types

I am trying to build a two element array in Julia, where each sub-array has a different type (one is a vector of Int64s, the other is an array of Float32s).
The code below automatically converts the element that I want to be an Int64 into a Float32, which is what I don't want:
my_multitype_array = [ collect(1:5), rand(Float32,3) ]
The resulting array automatically converts the Int64s in the first array (defined via collect(1:5)) into a Float32, and the resulting my_multitype_array has type 2-element Array{Array{Float32,1}}. How do I force it to make the first sub-array remain Int64s? Do I need to perhaps pre-define my_multitype_array to be an empty array with two elements of the desired types, before filling it out with values?
And finally, once I do have the desired array with different types, how would I refer to it, when pre-stating its type in a function? See below for what I mean:
function foo_function(first_scalar_arg::Float32, multiple_array_arg::Array{Array{Float32,1}})
# do stuff
return
end
Instead of ::Array{Array{Float32,1}}, would I write ::Array{Array{Any,1}} or something?
I think that the following code matches better what was asked in the question:
julia> a = Union{Array{Int},Array{Float64}}[[1,2,3],rand(2,2)]
2-element Array{Union{Array{Float64,N} where N, Array{Int64,N} where N},1}:
[1, 2, 3]
[0.834902264215698 0.42258382777543124; 0.5856562680004389 0.6654033155981287]
This creates an actual data structure which knows that it contains either Float64 or Int arrays.
Some usage
julia> a[1]
3-element Array{Int64,1}:
1
2
3
julia> a[2]
2×2 Array{Float64,2}:
0.834902 0.422584
0.585656 0.665403
And manipulating the structure:
julia> push!(a, [1, 1]); #works
julia> push!(a, [true, false]);
ERROR: MethodError: no method matching Union{Array{Float64,N} where N, Array{Int64,N} where N}(::Array{Bool,1})
How to instantiate a vector of different types:
If you type the vector in a terminal, it will be promoted to the largest common type:
julia> [[1], [1.0]]
2-element Array{Array{Float64,1},1}:
[1.0]
[1.0]
The reason for that is that you don't specify the type of the outer vector, so Julia will try to infer the type based on the contents. More specific types are always more efficient, so if the vector types can be converted to a single type that can represent all the inner vectors, this will be done (through the promote mechanism). To avoid it, you need to manually specify the outer vector type e.g.:
julia> Any[[1], [1.0]]
2-element Array{Any,1}:
[1]
[1.0]
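The same comparison as a short script. The names promoted and kept are introduced here for illustration:

```julia
# Without a prefix, the inner vectors are promoted to a common type
promoted = [[1], [1.0]]

# With an explicit element type, each inner vector keeps its own type
kept = Any[[1], [1.0]]

println(typeof(promoted))
println(typeof(kept[1]), " and ", typeof(kept[2]))
```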
How to refer to vectors of differently-typed vectors
When you think about it, "vectors of differently-typed vectors" is not a single type, but an infinite set of types. These kinds of types are called "unionall types" in Julia, and are represented by the where keyword. In this case, you want Vector{T} where T <: Vector.
But wait! Then how come:
julia> Any[[1], [1.0]] isa Vector{T} where T <: Vector
false
Well, a vector that can contain any element is not really a vector of vectors. So here you have two options:
Either relax your function signature by just removing the type annotations or relaxing them significantly (this is preferred because the value you pass in may actually be a vector of vectors even if its type is e.g. Vector{Any}):
function foo_function(first_scalar_arg, multiple_array_arg::AbstractArray)
# do stuff
return
end
Or else be vigilant that you make sure to construct a "vector of vectors" initially:
julia> Vector[[1], [1.0]]
2-element Array{Array{T,1} where T,1}:
[1]
[1.0]
julia> Vector[[1], [1.0]] isa Vector{T} where T <: Vector
true
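To see how this affects dispatch, here is a sketch with a hypothetical helper count_inner (not part of the question). Vector{<:Vector} is shorthand for Vector{T} where T <: Vector:

```julia
# Hypothetical helper that only accepts vectors whose element type is a Vector
count_inner(v::Vector{<:Vector}) = length(v)

vv = Vector[[1], [1.0]]     # element type Vector, so it matches
anyv = Any[[1], [1.0]]      # element type Any, so it does not match

println(count_inner(vv))
println(applicable(count_inner, anyv))
```

Calling count_inner(anyv) would raise a MethodError, which is why the relaxed signature is usually the more forgiving choice.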
To expand a little on @Przemyslaw Szufel's answer...
Creating vectors with elements of mixed types is tricky, as you've seen, since the literal array constructor attempts to promote the elements to a common type. There is a special syntax to get around that, which is described in the manual here.
In your case, you can construct your vector of vectors as follows:
julia> Union{Vector{Int64}, Vector{Float32}}[[1, 2], [1.0f0, 2.0f0]]
2-element Array{Union{Array{Float32,1}, Array{Int64,1}},1}:
[1, 2]
Float32[1.0, 2.0]
The prefix to the literal array constructor specifies the element type of the array. So in this case, the element type of the vector is constrained to be
Union{Vector{Int64}, Vector{Float32}}
In other words, the elements of the outer vector must be either vectors of Int64 or vectors of Float32.
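Applied to the question's own example, this could look as follows. This is only a sketch: the alias IntOrFloat32Vec and the function body are made up for illustration, and the type annotation is one possible way to write the signature, not the only one:

```julia
# Hypothetical alias for the allowed inner vector types
const IntOrFloat32Vec = Union{Vector{Int64}, Vector{Float32}}

# The prefix keeps collect(1:5) as a Vector{Int64}
my_multitype_array = IntOrFloat32Vec[collect(1:5), rand(Float32, 3)]

function foo_function(first_scalar_arg::Float32,
                      multiple_array_arg::Vector{IntOrFloat32Vec})
    # Placeholder body: scale the total number of stored values
    return first_scalar_arg * sum(length, multiple_array_arg)
end

result = foo_function(2.0f0, my_multitype_array)
```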

List of different sized arrays in Julia

I am trying to return an array of different sized arrays in a Julia function.
In the function, the arrays will be initialized and, in a loop, they will have elements, that are other arrays, pushed to end of the array at each iteration. But I am getting the following error:
MethodError: no method matching push!(::Type{Array{Array{Float64,1},1}}, ::Array{Float64,1})
I am initializing an array of arrays:
x = Array{Array{Float64,1},1}
But when I push! another array, I get the error:
push!(x, y)
In python I would just append the new arrays to a list and return the list, how can I accomplish it in Julia?
Your statement:
julia> x = Array{Array{Float64,1},1}
Array{Array{Float64,1},1}
assigns the type itself to x, not an instance of it.
In order to create an instance of this type add () after it:
julia> x = Array{Array{Float64,1},1}()
0-element Array{Array{Float64,1},1}
and now you can push! to it:
julia> push!(x, [2.5, 3.5])
1-element Array{Array{Float64,1},1}:
[2.5, 3.5]
Note that you could have initialized x with an empty vector accepting vectors of Float64 in the following way:
julia> x = Vector{Float64}[]
0-element Array{Array{Float64,1},1}
We use two features here:
Vector{Float64} is a shorthand for Array{Float64, 1}.
If you create an empty vector using [] syntax you can prepend a type of its elements in front of it just like I did in the example.
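Put together inside a function, the pattern from the question might look like this. The function name collect_blocks and the use of rand(i) as the pushed data are assumptions for illustration only:

```julia
# Hypothetical function returning a vector of differently sized vectors
function collect_blocks(n)
    x = Vector{Float64}[]          # empty vector of Vector{Float64}
    for i in 1:n
        push!(x, rand(i))          # the i-th inner vector has length i
    end
    return x
end

blocks = collect_blocks(3)
println(length.(blocks))
```

This plays the same role as appending to a Python list and returning it.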

Are Julia arrays homogeneous or not?

On the one hand, if I have x=[1,2,3], then I cannot add "foo" to x, but if I start with x=[1,2,3,"foo"], then the union type is Any, and I can add whatever I want to my array. Is it correct to say that Julia arrays are homogeneous? Because I can just create array of Any.
Julia will by default restrict a given array to as specific an element type (eltype) as possible. However, Julia has a special syntax to make an array with whatever eltype you like. So, to create an array with eltype T, you just write T[x, y, z] instead of [x, y, z]. For your example, this would be
julia> v = Any[1,2,3]
3-element Array{Any,1}:
1
2
3
julia> push!(v, "foo")
4-element Array{Any,1}:
1
2
3
"foo"
The reason for this behaviour is that if we can restrict an array to be of a specific, concrete type, there are some major performance optimizations that can be made. If you have an array with eltype Any, then the contents cannot be densely packed in memory.
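A short sketch of the two situations described in the question. The names push_failed and y are introduced here for illustration:

```julia
x = [1, 2, 3]                 # eltype is Int

# Pushing a String fails because it cannot be converted to Int
push_failed = try
    push!(x, "foo")
    false
catch err
    err isa MethodError
end

y = [1, 2, 3, "foo"]          # no common concrete type, so eltype is Any
push!(y, 4.5)                 # anything can be added

println(eltype(x), " ", eltype(y))
```

In that sense every array has a single eltype, but that eltype may be as wide as Any.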

Convert array of colors to array of floats

I have an array of colors, which I want to convert to a matrix of numbers:
using Colors
cols = [RGB{Float64}(rand(), rand(), rand()) for i in 1:6]
6-element Array{ColorTypes.RGB{Float64},1}:
RGB{Float64}(0.836012,0.505908,0.249548)
RGB{Float64}(0.383172,0.105153,0.361422)
RGB{Float64}(0.680616,0.974232,0.942787)
RGB{Float64}(0.804829,0.825503,0.990222)
RGB{Float64}(0.0404051,0.569093,0.772053)
RGB{Float64}(0.872298,0.704112,0.473588)
converted to:
6×3 Array{Float64,2}:
0.836012 0.505908 0.249548
0.383172 0.105153 0.361422
0.680616 0.974232 0.942787
0.804829 0.825503 0.990222
0.0404051 0.569093 0.772053
0.872298 0.704112 0.473588
How would I do that?
Use reinterpret. It "constructs an array with the same binary data as the given array, but with the specified element type." That means that it reads the data in the same order — and remember that Julia is column major. It also doesn't know what shape the returned array should be, so by default it's just a vector:
julia> reinterpret(Float64, cols)
18-element Array{Float64,1}:
0.836012
0.505908
0.249548
0.383172
0.105153
⋮
You can see that it has pulled out the floating point values and placed them all in a flat vector: [c₁, c₂] becomes [r₁, g₁, b₁, r₂, g₂, b₂]. So you want to first get a 3x6 array that respects this structure:
julia> fs = reinterpret(Float64, cols, (3, length(cols)))
3x6 Array{Float64,2}:
0.836012 0.383172 0.680616 0.804829 0.0404051 0.872298
0.505908 0.105153 0.974232 0.825503 0.569093 0.704112
0.249548 0.361422 0.942787 0.990222 0.772053 0.473588
Now you can get to the shape you want by taking the transpose if you need it:
julia> fs'
6x3 Array{Float64,2}:
0.836012 0.505908 0.249548
0.383172 0.105153 0.361422
0.680616 0.974232 0.942787
0.804829 0.825503 0.990222
0.0404051 0.569093 0.772053
0.872298 0.704112 0.473588
One way is:
[j(cols[i]) for i=1:6,j in [red,green,blue]]
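As far as I know, the three-argument form reinterpret(T, a, dims) is only accepted by older Julia versions; in Julia 1.x the same result can be obtained by combining reinterpret with reshape. The sketch below avoids the Colors dependency by using a hypothetical stand-in struct Rgb with three Float64 fields, which is an assumption and not the real RGB{Float64} type:

```julia
# Hypothetical stand-in for RGB{Float64}: three Float64 fields, no padding
struct Rgb
    r::Float64
    g::Float64
    b::Float64
end

cols = [Rgb(0.1i, 0.2i, 0.3i) for i in 1:6]

flat = reinterpret(Float64, cols)   # 18 values in r, g, b order per color
fs = reshape(flat, 3, :)            # 3×6: one column per color
m = permutedims(fs)                 # 6×3 Matrix{Float64}: one row per color

# The comprehension approach, using field access instead of red/green/blue
byfield = [getfield(c, f) for c in cols, f in (:r, :g, :b)]
```

Both m and byfield hold one row per color; permutedims is used here instead of ' so that the result is a plain Matrix.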
