I have an array of multi-dim arrays Array{Array{Float64,3},1} and what I want is a single 4 dimensional array Array{Float64,4}.
I have gone through the other responses
concatenate array in julia
Concatenating arrays in Julia
Multidimensional Array Comprehension in Julia
But no combination of cat and reshape seems to do the trick.
There must be a good idiomatic way... what is it?
Your answer is correct and generic. Note, however, that assuming your inner arrays have the same size (not just same dimensionality), there is also the following faster way:
julia> matrix = [rand(1,2,3) for _ in 1:4]; # some test data
julia> @btime a = cat($matrix..., dims=4); # your solution
11.519 μs (80 allocations: 3.83 KiB)
julia> @btime b = reshape(collect(Iterators.flatten($matrix)), (1,2,3,4)); # much faster solution
611.960 ns (55 allocations: 2.27 KiB)
julia> a == b
true
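For reference, here is a minimal sketch that wraps the flatten-and-reshape idea in a function so the target shape does not have to be typed by hand. The helper name stack_last and the test sizes are made up for illustration, and it assumes every inner array has exactly the same size.

```julia
# Hypothetical helper (not from the original posts): combine equally sized
# N-dimensional arrays into one (N+1)-dimensional array.
function stack_last(arrays::AbstractVector{<:AbstractArray})
    inner = size(first(arrays))
    # Assumption: all inner arrays share the same size; fail loudly otherwise.
    all(a -> size(a) == inner, arrays) ||
        throw(DimensionMismatch("inner arrays differ in size"))
    # Flatten in column-major order, then reshape with one trailing dimension.
    return reshape(collect(Iterators.flatten(arrays)), (inner..., length(arrays)))
end

arrays = [rand(1, 2, 3) for _ in 1:4]   # illustrative test data
a = cat(arrays..., dims=4)              # generic approach
b = stack_last(arrays)                  # flatten-and-reshape approach
```

Both results should compare equal, because cat along the new last dimension and a column-major reshape lay the inner arrays out in the same order.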
Sorry to bother you, I figured it out soon after posting
julia> typeof(matrix)
Array{Array{Float64,3},1}
julia> typeof(matrix[1])
Array{Float64,3}
julia> typeof(cat(matrix...,dims=4))
Array{Float64,4}
How to convert from string to array?
This is a follow-up question. How would I make a list of all the digits in this number (currently as a string)?
"123" -> [1,2,3]
There are no delimiters here so how should I go about doing this?
Note that I am using the latest version of Julia, v1.8.3, and parse doesn't seem to work the way the other question's answers use it. This is the error I get when I use parse():
ERROR: LoadError: MethodError: no method matching parse(::SubString{String})
Closest candidates are:
parse(::Type{T}, ::AbstractString) where T<:Complex at parse.jl:381
parse(::Type{Sockets.IPAddr}, ::AbstractString) at ~/usr/share/julia/stdlib/v1.8/Sockets/src/IPAddr.jl:246
parse(::Type{T}, ::AbstractChar; base) where T<:Integer at parse.jl:40
...
Stacktrace:
[1] iterate
@ ./generator.jl:47 [inlined]
[2] _collect
@ ./array.jl:807 [inlined]
[3] collect_similar
@ ./array.jl:716 [inlined]
[4] map
@ ./abstractarray.jl:2933 [inlined]
[5] top-level scope
@ ~/proc/self/fd/0:1
in expression starting at /proc/self/fd/0:1
exit status 1
Easy peasy like this:
function str2vec(s::String)
return map(x->parse(Int,x), split(s,""))
end
julia> str2vec("124")
3-element Vector{Int64}:
1
2
4
Or by broadcasting:
julia> parse.(Int, split("124",""))
3-element Vector{Int64}:
1
2
4
By piping functions:
julia> "124" |> x->split(x, "") |> x->parse.(Int, x)
3-element Vector{Int64}:
1
2
4
Utilizing the eachsplit function, which is a lazy function and returns a generator object (introduced in Julia 1.8):
julia> eachsplit("124", "") |> x->parse.(Int, x)
3-element Vector{Int64}:
1
2
4
Following Dan's advice, you can also try other ways:
Using the Int8 on the collected chars:
julia> Int8.(collect("124")).-48
3-element Vector{Int64}:
1
2
4
Using the Iterators.map:
julia> collect(Iterators.map(x->Int8(x)-48,"124"))
3-element Vector{Int64}:
1
2
4
Also, one can consider the DNF's proposal:
julia> [Int(x)-48 for x in "124"]
3-element Vector{Int64}:
1
2
4
Benchmarking
julia> using BenchmarkTools
julia> @btime str2vec("124");
@btime parse.(Int, split("124",""));
@btime "124" |> x->split(x, "") |> x->parse.(Int, x);
@btime eachsplit("124", "") |> x->parse.(Int, x);
@btime Int8.(collect("124")).-48;
@btime collect(Iterators.map(x->Int8(x)-48,"123"));
@btime [Int(x)-48 for x in "123"]
681.250 ns (11 allocations: 864 bytes)
675.460 ns (11 allocations: 864 bytes)
679.747 ns (11 allocations: 864 bytes)
1.280 μs (14 allocations: 816 bytes)
92.412 ns (2 allocations: 160 bytes)
61.711 ns (1 allocation: 80 bytes)
45.152 ns (1 allocation: 80 bytes)
You can also use the inbuilt digits function.
By default, it returns the digits last-to-first:
julia> digits(parse(Int, "1234"))
4-element Vector{Int64}:
4
3
2
1
You can reverse! the result if you want them in the same order as in the string:
julia> digits(parse(Int, "1234")) |> reverse!
4-element Vector{Int64}:
1
2
3
4
This runs much faster than parsing each digit individually. The Int8(...) .- 48 method is still faster, but it fails silently if the input string happens to be invalid, which could be dangerous further down the line. Since we're using parse here, this method reports the error correctly in such cases.
julia> Int8.(collect("invalid")).-48
7-element Vector{Int64}:
57
62
70
49
60
57
52
julia> digits(parse(Int, "invalid")) |> reverse!
ERROR: ArgumentError: invalid base 10 digit 'i' in "invalid"
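One caveat worth checking before relying on the digits approach: it goes through an integer, so leading zeros in the string are not preserved, and a very long digit string may not fit in an Int at all. A small sketch comparing it with per-character parsing (the function names via_digits and per_char are made up for illustration):

```julia
# Round-trip through an integer: compact, but an integer has no leading zeros.
via_digits(s) = reverse!(digits(parse(Int, s)))

# Parse each character on its own: keeps every digit of the string.
per_char(s) = [parse(Int, c) for c in s]

r1 = via_digits("1234")   # [1, 2, 3, 4]
r2 = via_digits("0123")   # [1, 2, 3], the leading zero is lost
r3 = per_char("0123")     # [0, 1, 2, 3]
```

If the string is an identifier or a code rather than a number, the per-character version is probably the safer choice.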
Both other answers are very good, but they have forgotten about comprehensions. Using a comprehension gives both the fastest safe solution, and the absolute fastest solution, tied with the Iterators.map.
Fastest unsafe (based on the answer by @Shayan with input from @DanGetz):
julia> @btime [Int(c)-48 for c in "123"]
34.372 ns (1 allocation: 80 bytes)
3-element Vector{Int64}:
1
2
3
The above will silently return the wrong answer for invalid inputs, as noted by @SundarR.
Here's an even nicer and more intuitive version of the above, which is the same under the hood:
[c - '0' for c in "123"]
It works because Int('0') equals 48, and subtraction of Chars yields an Int.
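If you like the c - '0' form but want it to reject bad input, one possible sketch is to guard it with isdigit. The helper name digit_vector is hypothetical, and this has not been benchmarked against the versions above.

```julia
# Hypothetical helper: convert a string of ASCII digits to a vector of Ints,
# throwing on any character that is not '0' through '9'.
function digit_vector(s::AbstractString)
    return [isdigit(c) ? c - '0' : throw(ArgumentError("not a digit: $(repr(c))"))
            for c in s]
end

digit_vector("123")        # [1, 2, 3]
# digit_vector("invalid")  # would throw an ArgumentError
```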
Fastest safe solution (based on @SundarR's answer):
julia> @btime [parse(Int, c) for c in "123"]
47.822 ns (1 allocation: 80 bytes)
3-element Vector{Int64}:
1
2
3
julia> [parse(Int, c) for c in "invalid"]
ERROR: ArgumentError: invalid base 10 digit 'i'
I would probably recommend the latter in most cases.
One more thing you may or may not be aware of: You can create a generator instead of a vector, in case you don't actually need the vector itself, but want to iterate over the converted numbers for some other purpose. The syntax is almost identical to an array comprehension, just use () instead:
g = (parse(Int, c) for c in "123")
for val in g
println(val, " squared equals ", val^2)
end
1 squared equals 1
2 squared equals 4
3 squared equals 9
This will not allocate an intermediate temporary vector, and creating the generator is essentially free:
julia> @btime (parse(Int, c) for c in "123")
1.900 ns (0 allocations: 0 bytes)
The computational cost is paid during iteration instead. This is similar to using Iterators.map without collect, but arguably has nicer syntax.
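As a small illustration of using the generator directly, here it is fed straight into a reduction, so no vector of digits is ever built. The name digit_sum and the inputs are just examples.

```julia
# Sum the digits without materializing a vector; the generator is consumed lazily.
digit_sum(s) = sum(parse(Int, c) for c in s)

total = digit_sum("123")   # 6

# The same kind of generator can feed any reduction, e.g. looking for an even digit.
has_even = any(iseven, (parse(Int, c) for c in "135792"))   # true, because of the 2
```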
If I have a vector of sets, say,
vec_of_sets = [Set(vec1), Set(vec2), ..., Set(vecp)]
how do I obtain a set equal to the union of sets in the vector? That is, how can I write the following efficiently?
S1 = Set(vec1);
union!(S1, Set(vec2))
union!(S1, Set(vec3))
...
union!(S1, Set(vecp))
I don't really know where to start!
Thanks in advance.
Edit: I have tried a solution using a generator, but it doesn't work:
union(j for j in vec_of_sets)
The best and fastest approach is:
Set(Iterators.flatten(vec_of_sets))
It is around twice as fast as the other approaches proposed in the other answer and makes fewer than half as many memory allocations.
Here are some benchmarks:
julia> v = [Set(1:3), Set(2:6), Set(4:8)];
julia> @btime Set(Iterators.flatten($v));
270.492 ns (4 allocations: 400 bytes)
julia> @btime reduce(union, $v);
550.000 ns (11 allocations: 1.25 KiB)
julia> @btime union($v...);
506.250 ns (11 allocations: 944 bytes)
julia> @btime union((j for j in $v)...);
699.286 ns (15 allocations: 1.03 KiB)
I guess you should use reduce:
reduce(union, vec_of_sets)
but you could also use splatting (with ...):
union(vec_of_sets...)
FWIW, you could have used splatting with your attempt, too:
union((j for j in vec_of_sets)...)
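A quick sketch to convince yourself that these variants give the same set. It also includes an in-place fold that mirrors the union! loop from the question; that last variant is my own addition, not something proposed in the posts above.

```julia
v = [Set(1:3), Set(2:6), Set(4:8)]   # small illustrative input

s1 = Set(Iterators.flatten(v))        # flatten, then build one set
s2 = reduce(union, v)                 # pairwise union
s3 = union(v...)                      # splatting
# Assumption-labelled extra: accumulate into a fresh set, leaving v untouched.
s4 = foldl(union!, v; init=Set{Int}())
```

All four should be equal to Set(1:8) here. Splatting a very long vector can be slow to compile, so the flatten or fold versions are probably the ones to reach for when the vector is large.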
Allocating an array of Union{T, Missing} is very expensive in Julia. Is there any workaround?
julia> @time Vector{Union{Missing, Int}}(undef, 10^7);
0.031052 seconds (2 allocations: 85.831 MiB)
julia> @time Vector{Union{Int}}(undef, 10^7);
0.000027 seconds (3 allocations: 76.294 MiB)
This is because, if you make a Union of Missing with a bits type like Int, Julia sets a flag for each entry, so that such a vector initially stores missing in each of its entries:
julia> Vector{Union{Missing, Int}}(undef, 10^7)
10000000-element Vector{Union{Missing, Int64}}:
missing
missing
⋮
missing
missing
If you use a non-bits type, such a flag does not have to be set for each entry, as you can see here:
julia> Vector{Union{Missing, String}}(undef, 10^7)
10000000-element Vector{Union{Missing, String}}:
#undef
#undef
⋮
#undef
#undef
and in consequence the performance is the same:
julia> @btime Vector{Union{String}}(undef, 10^7);
11.672 ms (3 allocations: 76.29 MiB)
julia> @btime Vector{Union{Missing, String}}(undef, 10^7);
11.480 ms (2 allocations: 76.29 MiB)
The difference is that union arrays get zero-initialized. You can see the code that decides this here:
https://github.com/JuliaLang/julia/blob/3f024fd0ab9e68b37d29fee6f2a9ab19819102c5/src/array.c#L191
This ends up as a call to memset:
https://github.com/JuliaLang/julia/blob/3f024fd0ab9e68b37d29fee6f2a9ab19819102c5/src/array.c#L144-L145
So as a check, we can compare zeros vs allocating the union array:
julia> @time Vector{Union{Missing, Int}}(undef, 10^7);
0.020609 seconds (2 allocations: 85.831 MiB)
julia> @time zeros(Int, 10^7);
0.018375 seconds (2 allocations: 76.294 MiB)
Quite comparable timings.
However, I don't think this performance difference should end up mattering in your application unless you have structured it in a quite strange way. You only need to do very little work with that array before the allocation time becomes insignificant. For example, just setting the values of the uninitialized array makes the timing quite similar to that of the union array:
julia> function f()
a = Vector{Int}(undef, 10^7)
for i in eachindex(a)
a[i] = 1
end
a
end;
julia> function f_union()
a = Vector{Union{Missing, Int}}(undef, 10^7)
for i in eachindex(a)
a[i] = 1
end
a
end;
julia> @time f();
0.015566 seconds (2 allocations: 76.294 MiB)
julia> @time f_union();
0.026414 seconds (2 allocations: 85.831 MiB)
We had the same problem and as a workaround we used
x = Vector{Union{T,Missing}}(undef,1)
resize!(x, newlen)
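In case it helps, here is the same workaround wrapped in a function. The name union_vector is hypothetical, and whether this is actually faster than allocating directly likely depends on the Julia version, so it is worth timing on your own setup.

```julia
# Hypothetical wrapper around the workaround above: allocate a single element,
# then grow the vector to the requested length.
function union_vector(::Type{T}, n::Integer) where {T}
    x = Vector{Union{T,Missing}}(undef, 1)
    resize!(x, n)
    return x
end

x = union_vector(Int, 1_000)
# Assumption: treat the contents as uninitialized and overwrite before reading.
fill!(x, missing)
```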
I would like to use a function (I'm sure there is one) in Julia which takes an Array (or similar type) and a type (e.g. nothing) as input, checks each element in the array to see whether the element is of that type and returns the indices of the elements in the Array which are of that type. For example :
typeToFind = nothing
A = [1,2,3,nothing,5]
idx = find(x->x == typeToFind,A)
Similar to MATLAB, basically. I found some suggestions to use find, but it seems it's deprecated: Julia complains when I try to use it. I presume there must be a function of this kind in Julia, though I could of course write some pretty quick code to do the above.
find was replaced by findall, so you should try:
julia> findall(x->typeof(x)==Nothing, A)
## which returns:
1-element Array{Int64,1}:
4
julia> findall(x->typeof(x)!=Nothing, A)
## which returns:
4-element Array{Int64,1}:
1
2
3
5
Using findall(x->typeof(x)==Nothing, A) solves the problem, but it might be better to use x->isa(x, T) for some type T. The reason for this is that typeof(x) will not work for abstract types, since typeof(x) always returns a concrete type.
Here's a usecase:
A = Any[1,UInt8(2),3.1,nothing,Int32(5)]
findall(x->isa(x, Int), A)
1-element Array{Int64,1}:
1
findall(x->isa(x, UInt8), A)
1-element Array{Int64,1}:
2
findall(x->isa(x, Integer), A) # Integer is an abstract type
3-element Array{Int64,1}:
1
2
5
findall(x->typeof(x)==Integer, A)
0-element Array{Int64,1} # <- Doesn't work!
It also appears to be faster:
julia> @btime findall(x->typeof(x)==Nothing, $A)
356.794 ns (6 allocations: 272 bytes)
1-element Array{Int64,1}:
4
julia> @btime findall(x->isa(x, Nothing), $A)
120.255 ns (6 allocations: 272 bytes)
1-element Array{Int64,1}:
4
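Putting the isa idea into a reusable form, a minimal sketch could look like this. The helper name indices_of_type is made up; findall and isa are the only real API used, plus isnothing, which needs Julia 1.1 or later.

```julia
# Hypothetical helper: indices of all elements of A that are instances of T
# (works for concrete and abstract types alike).
indices_of_type(T::Type, A) = findall(x -> x isa T, A)

A = Any[1, UInt8(2), 3.1, nothing, Int32(5)]

i_nothing = indices_of_type(Nothing, A)   # [4]
i_integer = indices_of_type(Integer, A)   # [1, 2, 5]
i_real    = indices_of_type(Real, A)      # [1, 2, 3, 5]
i_isn     = findall(isnothing, A)         # [4], shorthand for the Nothing case
```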
In Python, Numpy arrays can be reversed using the standard [::-1] i.e.
A = np.diag(np.arange(1,3))
A[::, ::-1]
A[::-1]
A[::-1, ::-1]
Julia does not support [::-1], and the reverse method only works on 1D arrays, and only on 1D columns (whereas rows are 2D by default).
Is there an alternative I'm missing?
Try the following, which is essentially the same as the numpy version:
julia> X = rand(3,3)
3x3 Array{Float64,2}:
0.782622 0.996359 0.335781
0.719058 0.188848 0.985693
0.455355 0.910717 0.870187
julia> X[end:-1:1,end:-1:1]
3x3 Array{Float64,2}:
0.870187 0.910717 0.455355
0.985693 0.188848 0.719058
0.335781 0.996359 0.782622
In Julia 1.0, to reverse a (column) vector:
julia> reverse([1, 2, 3])
3-element Array{Int64,1}:
3
2
1
For reversing rows, just state that you want to flip the second dimension:
julia> reverse([1 2 3], dims=2)
1×3 Array{Int64,2}:
3 2 1
EDIT: Alternatively, you can also index in reverse using end:-1:1, and that way also allows you to request a view instead of a copy:
julia> a = reshape(randperm(9), 3, 3)
3×3 Matrix{Int64}:
4 7 9
5 2 1
3 6 8
julia> @view a[:, end:-1:1]
3×3 view(::Matrix{Int64}, :, 3:-1:1) with eltype Int64:
9 7 4
1 2 5
8 6 3
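To make the copy-versus-view difference concrete, here is a small sketch using the same matrix values as above (hard-coded instead of randperm so the result is reproducible):

```julia
a = [4 7 9; 5 2 1; 3 6 8]

c = a[:, end:-1:1]         # plain indexing makes a copy
v = @view a[:, end:-1:1]   # the view shares memory with a

a[1, 1] = 100              # modify the parent array

from_view = v[1, 3]        # 100, the view sees the change
from_copy = c[1, 3]        # 4, the copy does not

v[2, 1] = 0                # writing through the view changes a[2, 3]
```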
Following up @IainDunning's answer, an important difference between numpy and Julia here is that X[:,end:-1:1] in Julia returns a copy and in numpy X[:,::-1] will return a view of the same data (no copy is made).
I'm just learning Julia myself, but it seems like you can accomplish something similar in Julia using sub(X, :, size(X)[2]:-1:1), which returns Julia's equivalent of a view (SubArray). Interestingly, you can't use the end keyword in this construct as far as I can see, and instead you must pass in the actual end index in the dimension.
You can use the function flipdim(mat, d).
Ref: http://docs.julialang.org/en/release-0.4/stdlib/arrays/
Try this set of functions:
function reverser(x::AbstractArray, dims::AbstractVector{<:Integer})
y = copy(x)
for d in dims
y = reverse(y, dims=d)
end
return y
end
reverser(x::AbstractArray) = reverser(x, 1:ndims(x)) # all dimensions
reverser(x::AbstractArray, d::Integer) = reverser(x, [d]) # a single dimension
Julia 1.6 supports reversing any or all of the dimensions of an arbitrary multidimensional array (implemented here). To reverse all of the dimensions, you can simply do reverse(X).
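A short sketch tying the approaches together, assuming Julia 1.6 or later for the last two lines (the matrix X is just example data):

```julia
X = collect(reshape(1:12, 3, 4))   # 3×4 example matrix

rows_flipped = reverse(X, dims=1) == X[end:-1:1, :]   # flip along rows
cols_flipped = reverse(X, dims=2) == X[:, end:-1:1]   # flip along columns

# Julia 1.6+ only: reverse every dimension at once, or a tuple of dimensions.
all_flipped  = reverse(X) == X[end:-1:1, end:-1:1]
tuple_dims   = reverse(X, dims=(1, 2)) == reverse(X)
```

Each comparison should evaluate to true, which is a handy sanity check that reverse and end:-1:1 indexing are interchangeable when a copy is fine.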