On the one hand, if I have x=[1,2,3], then I cannot add "foo" to x, but if I start with x=[1,2,3,"foo"], then the element type is Any, and I can add whatever I want to my array. Is it correct to say that Julia arrays are homogeneous, given that I can just create an array of Any?
By default, Julia restricts a given array to as specific an element type (eltype) as possible. However, Julia has a special syntax to make an array with whatever eltype you like. To create an array with eltype T, you write T[x, y, z] instead of [x, y, z]. For your example, this would be
julia> v = Any[1,2,3]
3-element Array{Any,1}:
1
2
3
julia> push!(v, "foo")
4-element Array{Any,1}:
1
2
3
"foo"
The reason for this behaviour is that if we can restrict an array to be of a specific, concrete type, there are some major performance optimizations that can be made. If you have an array with eltype Any, then the contents cannot be densely packed in memory.
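As a rough illustration of the point above, the following sketch compares the element types Julia picks for the two literals from the question. The variable names are only for illustration, and the printed Int64 assumes a 64-bit machine.

```julia
# Element types Julia infers for the two array literals from the question
ints  = [1, 2, 3]           # Vector{Int}
mixed = [1, 2, 3, "foo"]    # no common concrete type, so the eltype is Any

println(eltype(ints))       # Int64 on a 64-bit machine
println(eltype(mixed))      # Any

# Int values are plain bits and can be stored inline in the array;
# Any elements have to be stored as references to separately allocated values.
println(isbitstype(eltype(ints)))     # true
println(isbitstype(eltype(mixed)))    # false

# Pushing a String into a Vector{Int} fails because no conversion exists.
failed = try
    push!(ints, "foo")
    false
catch err
    err isa MethodError
end
println(failed)             # true
```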
In Julia, if I make x = rand(10,2), then
>> typeof(x)
Matrix{Float64} (alias for Array{Float64, 2})
How do I access the type parameters, i.e., how do I obtain that the array x is an array of Float64 and 2?
(BTW: You are not looking for a 'subtype', as your title says, but for 'type parameters'.)
The element type is easy to get with eltype:
jl> eltype(x)
Float64
The dimensionality can be retrieved from the size of the array:
jl> length(size(x))
2
Edit: Better to use ndims:
jl> ndims(x)
2
If you don't have access to x itself, but only its type, eltype still works:
jl> T = typeof(x);
jl> eltype(T)
Float64
The dimensionality is a bit more difficult. You can inspect the properties of the type variable (but I don't recommend that, since this is an internal implementation detail of the type, and may not be stable):
jl> T.parameters
svec(Float64, 2)
jl> T.parameters[2]
2
A better way is probably to make a function to get this for you:
jl> dims(::Type{<:AbstractArray{T, N}}) where {T, N} = N
dims (generic function with 3 methods)
jl> dims(T)
2
Edit: You can use ndims with type variables too:
jl> ndims(T)
2
So, actually, the answer is: eltype for the element type, and ndims for the dimensionality, both when you have an array and when you have the type of an array.
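If you need both parameters inside a function, another option is to let dispatch capture them. This is only a sketch; describe is a hypothetical helper name, not something from Base.

```julia
# Hypothetical helper: capture the type parameters through dispatch.
describe(::AbstractArray{T,N}) where {T,N} = (T, N)

x = rand(10, 2)

println(describe(x))                               # (Float64, 2)
println((eltype(x), ndims(x)))                     # (Float64, 2)
println((eltype(typeof(x)), ndims(typeof(x))))     # (Float64, 2)
```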
I'm confused about the different types of arrays of arrays. Consider these two examples
a = Array{Float64}[]
push!(a,[1, 2])
push!(a,[3, 4])
push!(a,[1 2; 3 4])
b = Array[[1.0, 2.0], [3.0,4.0], [1.0 2.0; 3.0 4.0]]
I'm not sure how a and b differ. Suppose I intend to run a for loop over each of the elements in a or b, and multiply each element by 2. That is,
for i in 1:3 a[i] = a[i]*2 end
for i in 1:3 b[i] = b[i]*2 end
I time the run time of both lines respectively, but they are equally fast. Are a and b the same? If so, why does a even exist? It looks fairly complicated, because typeof(a) yields "Array{Array{Float64,N} where N,1}". What does where do here?
Both a and b are vectors, but they allow different types of elements. You can check this by writing:
julia> typeof(a)
Array{Array{Float64,N} where N,1}
julia> typeof(b)
Array{Array,1}
Now Array is a parametric type taking two parameters. The first parameter is the type of elements that it allows. The second parameter is the dimension. You can see that in both cases the dimension is 1, which means both a and b are vectors. You can also check it using the ndims function:
julia> ndims(a)
1
julia> ndims(b)
1
The first parameter is the allowed element type. In the case of a it is Array{Float64,N} where N, while in the case of b just Array is printed. Before I explain how to read them, note that the first parameter can be extracted using the eltype function:
julia> eltype(a)
Array{Float64,N} where N
julia> eltype(b)
Array
You can see that both a and b allow an Array to be stored. First let me explain how to read Array{Float64, N} where N. It means that a allows storing arrays of Float64 of any dimension. You could also have written it in the shorter form Array{Float64}, as you can check:
julia> (Array{Float64,N} where N) === Array{Float64}
true
The reason is that if you do not put a restriction on a tail parameter it can be dropped in syntax. The where N part is a restriction on the parameter. In this case there is no restriction on the second parameter.
Now we can turn to b. You see its eltype is just Array, so both parameters are dropped, thus there are no restrictions on them as explained above. So Array is the same as Array{T, N} where {T,N} as you can see here:
julia> (Array{T, N} where {T,N}) === Array
true
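To make the effect of the dropped parameters concrete, here is a small sketch using subtype checks. It only uses the <: operator on types already mentioned above.

```julia
# Array{Float64} leaves the dimension free, so it covers every dimensionality.
println(Vector{Float64} <: Array{Float64})    # true
println(Matrix{Float64} <: Array{Float64})    # true

# The element type is still fixed to Float64.
println(Vector{Int} <: Array{Float64})        # false

# Plain Array leaves both parameters free.
println(Vector{Int} <: Array)                 # true
println(Matrix{String} <: Array)              # true
```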
So the difference is that a can store arrays of any dimension but they must have Float64 element type, while b can store arrays of any dimension and any element type. The distinction, in this case, has no performance impact as you have noted, but will have an impact on what can be stored in a and b. Here are some examples.
Here they behave the same: you try to store an Int in them, but both allow only arrays:
julia> a[1] = 1
ERROR: MethodError: Cannot `convert` an object of type
julia> b[1] = 1
ERROR: MethodError: Cannot `convert` an object of type
but here they differ:
julia> a[1] = ["a"]
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64
julia> b[1] = ["a"]
1-element Array{String,1}:
"a"
julia> a
3-element Array{Array{Float64,N} where N,1}:
[1.0, 2.0]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
julia> b
3-element Array{Array,1}:
["a"]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
As you can see you can store an array of String in b, but this is not allowed to be stored in a.
Two additional comments (both topics are a bit complex so I leave out the details, but just give you hints what is going on):
You do not have to specify the element type of an array when defining it. In this case Julia will automatically pick its element type:
julia> [[1.0, 2.0], [3.0,4.0], [1.0 2.0; 3.0 4.0]]
3-element Array{Array{Float64,N} where N,1}:
[1.0, 2.0]
[3.0, 4.0]
[1.0 2.0; 3.0 4.0]
The performance impact of the choice of element type of an array depends on:
if the type is abstract (that can be checked by isabstracttype; this impacts type inference made by the compiler),
if it is a bits type (that can be checked by isbitstype; this impacts the storage layout),
if the element type is a Union (small unions are handled more efficiently).
In your case both element types are abstract and non-bits so the performance will be the same.
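A quick way to inspect these properties for the two element types is sketched below. Note that it uses isconcretetype, which is my own choice and not one of the functions named above; it reports whether a type can have direct instances, which is the property that matters for Array{Float64} and Array here.

```julia
elt_a = Array{Float64}    # element type of a
elt_b = Array             # element type of b

# Neither element type is concrete, and neither is a bits type.
println(isconcretetype(elt_a))    # false
println(isconcretetype(elt_b))    # false
println(isbitstype(elt_a))        # false
println(isbitstype(elt_b))        # false

# For comparison: a fully specified array type is concrete,
# and Float64 itself is a bits type.
println(isconcretetype(Vector{Float64}))    # true
println(isbitstype(Float64))                # true
```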
I am trying to build a two element array in Julia, where each sub-array has a different type (one is a vector of Int64s, the other is an array of Float32s).
The code below automatically converts the element that I want to be an Int64 into a Float32, which is not what I want:
my_multitype_array = [ collect(1:5), rand(Float32,3) ]
The resulting array automatically converts the Int64s in the first array (defined via collect(1:5)) into a Float32, and the resulting my_multitype_array has type 2-element Array{Array{Float32,1}}. How do I force it to make the first sub-array remain Int64s? Do I need to perhaps pre-define my_multitype_array to be an empty array with two elements of the desired types, before filling it out with values?
And finally, once I do have the desired array with different types, how would I refer to it, when pre-stating its type in a function? See below for what I mean:
function foo_function(first_scalar_arg::Float32, multiple_array_arg::Array{Array{Float32,1}})
# do stuff
return
end
Instead of ::Array{Array{Float32,1}}, would I write ::Array{Array{Any,1}} or something?
I think that the following code matches better what was asked in the question:
julia> a = Union{Array{Int},Array{Float64}}[[1,2,3],rand(2,2)]
2-element Array{Union{Array{Float64,N} where N, Array{Int64,N} where N},1}:
[1, 2, 3]
[0.834902264215698 0.42258382777543124; 0.5856562680004389 0.6654033155981287]
This creates an actual data structure which knows that it contains either Float64 or Int arrays.
Some usage
julia> a[1]
3-element Array{Int64,1}:
1
2
3
julia> a[2]
2×2 Array{Float64,2}:
0.834902 0.422584
0.585656 0.665403
And manipulating the structure:
julia> push!(a, [1, 1]); #works
julia> push!(a, [true, false]);
ERROR: MethodError: no method matching Union{Array{Float64,N} where N, Array{Int64,N} where N}(::Array{Bool,1})
How to instantiate a vector of different types:
If you type the vector in a terminal, it will be promoted to the largest common type:
julia> [[1], [1.0]]
2-element Array{Array{Float64,1},1}:
[1.0]
[1.0]
The reason for that is that you don't specify the type of the outer vector, so Julia will try to infer the type based on the contents. More specific types are always more efficient, so if the vector types can be converted to a single type that can represent all the inner vectors, this will be done (through the promote mechanism). To avoid it, you need to manually specify the outer vector type e.g.:
julia> Any[[1], [1.0]]
2-element Array{Any,1}:
[1]
[1.0]
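The difference between the two literals can also be checked programmatically. This is a minimal sketch; the variable names promoted and kept are only for illustration.

```julia
promoted = [[1], [1.0]]        # inner vectors are promoted to Vector{Float64}
kept     = Any[[1], [1.0]]     # inner vectors keep their own element types

println(eltype(promoted))      # Vector{Float64}
println(typeof(promoted[1]))   # Vector{Float64}: the Int vector was converted

println(eltype(kept))          # Any
println(typeof(kept[1]))       # Vector{Int64} on a 64-bit machine
println(typeof(kept[2]))       # Vector{Float64}
```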
How to refer to vectors of differently-typed vectors
When you think about it, "vectors of differently-typed vectors" is not a single type, but an infinite set of types. These kinds of types are called "unionall types" in Julia, and are represented by the where keyword. In this case, you want Vector{T} where T <: Vector.
But wait! Then how come:
julia> Any[[1], [1.0]] isa Vector{T} where T <: Vector
false
Well, a vector that can contain any element is not really a vector of vectors. So here you have two options:
Either relax your function signature by just removing the type annotations or relaxing them significantly (this is preferred because the value you pass in may actually be a vector of vectors even if its type is e.g. Vector{Any}):
function foo_function(first_scalar_arg, multiple_array_arg::AbstractArray)
# do stuff
return
end
Or else be vigilant that you make sure to construct a "vector of vectors" initially:
julia> Vector[[1], [1.0]]
2-element Array{Array{T,1} where T,1}:
[1]
[1.0]
julia> Vector[[1], [1.0]] isa Vector{T} where T <: Vector
true
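For the second option, a function signature could look like the sketch below. Here count_inner is a hypothetical function, and Vector{<:Vector} is shorthand for Vector{T} where T <: Vector.

```julia
# Hypothetical function that accepts any vector whose element type is a vector type.
count_inner(vs::Vector{<:Vector}) = sum(length, vs)

println(count_inner(Vector[[1], [1.0, 2.0]]))    # 3
println(count_inner([[1, 2], [3]]))              # 3 (a Vector{Vector{Int}} also matches)

# A Vector{Any} does not match the signature, even if it happens to hold vectors.
println(applicable(count_inner, Any[[1], [1.0]]))    # false
```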
To expand a little on @Przemyslaw Szufel's answer...
Creating vectors with elements of mixed types is tricky, as you've seen, since the literal array constructor attempts to promote the elements to a common type. There is a special syntax to get around that, which is described in the manual here.
In your case, you can construct your vector of vectors as follows:
julia> Union{Vector{Int64}, Vector{Float32}}[[1, 2], [1.0f0, 2.0f0]]
2-element Array{Union{Array{Float32,1}, Array{Int64,1}},1}:
[1, 2]
Float32[1.0, 2.0]
The prefix to the literal array constructor specifies the element type of the array. So in this case, the element type of the vector is constrained to be
Union{Vector{Int64}, Vector{Float32}}
In other words, the elements of the outer vector must be either vectors of Int64 or vectors of Float32.
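As for the function signature from the question, one possibility is to reuse the same Union as the element type in the annotation. This is a sketch under that assumption; the alias IntOrF32Vec and the function body are made up for illustration.

```julia
# Hypothetical alias for the element type of the outer vector.
const IntOrF32Vec = Union{Vector{Int64}, Vector{Float32}}

function foo_function(first_scalar_arg::Float32,
                      multiple_array_arg::Vector{IntOrF32Vec})
    # Placeholder body: report the element type of each sub-array.
    return map(eltype, multiple_array_arg)
end

# The typed prefix keeps the first sub-array as Int64 instead of promoting it.
my_multitype_array = IntOrF32Vec[collect(Int64, 1:5), rand(Float32, 3)]

println(foo_function(1.0f0, my_multitype_array))
```

Because the outer element type is spelled out, the call dispatches on exactly this container type; an array built without the prefix would be promoted to Float32 vectors and would not match.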
I have an array
array1 = Array{Int,2}(undef, 2, 3)
Is there a way to quickly make a new array that's the same size as the first one? E.g. something like
array2 = Array{Int,2}(undef, size(array1))
Currently I have to do this, which is pretty cumbersome, and even worse for higher-dimensional arrays:
array2 = Array{Int,2}(undef, size(array1)[1], size(array1)[2])
What you're looking for is similar(array1).
You can even change up the array type by passing in a type, e.g.
similar(array1, Float64)
similar(array1, Int64)
Using similar is a great solution. But the reason your original attempt doesn't work is the number 2 in the type parameter signature: Array{Int, 2}. The number 2 specifies that the array must have 2 dimensions. If you remove it you can have exactly as many dimensions as you like:
julia> a = rand(2,4,3,2);
julia> b = Array{Int}(undef, size(a));
julia> size(b)
(2, 4, 3, 2)
This works for other array constructors too:
zeros(size(a))
ones(size(a))
fill(5, size(a))
# etc.
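Putting the options side by side, a minimal sketch (the names array2 to array4 are only for illustration). Keep in mind that similar, like undef, leaves the contents uninitialized.

```julia
array1 = Array{Int,2}(undef, 2, 3)

array2 = similar(array1)              # same element type and size, contents uninitialized
array3 = similar(array1, Float64)     # same size, different element type
array4 = zeros(Int, size(array1))     # same size, filled with zeros

println(size(array2) == size(array1))    # true
println(eltype(array2))                  # Int64 on a 64-bit machine
println(eltype(array3))                  # Float64
println(all(iszero, array4))             # true
```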
How do you declare an array that contain arrays in julia?
I have a=Int32[] which creates an empty array of Int32 (of course), but I would like later to construct on the fly something like
if ...
push!(a, [r,s]) # (*)
...
where r and s are integers. I tried a=Int32[Int32[]] but it does not work when doing (*). I don't have the specific shape of a, so I need to declare it without this restriction.
Int32[] creates a Vector{Int32} which is a Vector with element type Int32. You want a Vector with element type Vector{Int32}, so you can use Vector{Vector{Int32}}() or Vector{Int32}[]. Note that Vector{T} is an alias for Array{T,1}, aka an Array with element type T and 1 dimension, so when Julia prints out the type, it won't use the word Vector.
julia> v=Vector{Vector{Int32}}()
0-element Array{Array{Int32,1},1}
julia> push!(v,[1,2,3])
1-element Array{Array{Int32,1},1}:
Int32[1, 2, 3]
or
julia> x=Vector{Int32}[]
0-element Array{Array{Int32,1},1}
julia> push!(x,[4,5,6])
1-element Array{Array{Int32,1},1}:
Int32[4, 5, 6]
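Applied to the loop from the question, this could look like the following sketch, where r and s are placeholder values. The pushed vectors are converted to Vector{Int32} automatically, and they do not need to have the same length.

```julia
a = Vector{Int32}[]        # empty vector whose elements are Vector{Int32}

r, s = 1, 2                # placeholder integers
push!(a, [r, s])           # [r, s] is converted to Vector{Int32}
push!(a, [3, 4, 5])        # inner vectors may differ in length

println(length(a))                  # 2
println(a[1] isa Vector{Int32})     # true
println(a[2] == [3, 4, 5])          # true
```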