Reducer for non-parallel for loops/multiline comprehensions

Julia has a @parallel macro for for loops, which allows things like:
s = @sync @parallel vcat for i in 1:9
k = iseven(i) ? i÷2 : 3i+1
k^2
end
and since the reducer specified is vcat, we get back an array of numbers.
Is it possible to do something like this with a normal for loop (without having to explicitly initialize and push! into the array)?
Since I'm only looking to reduce using vcat, another way to ask this question is: is there a neat, readable multiline form of array comprehensions? It's possible to stretch the usual comprehension syntax like this:
s = [
(k = iseven(i) ? i÷2 : 3i+1;
k^2)
for i in 1:9
]
but that seems messy and less readable compared to the @parallel vcat for syntax. Is there a better way of doing multiline comprehensions?

Expanding on @Gnimuc's answer, I think mapreduce plus do-syntax is pretty nice:
julia> mapreduce(vcat, 1:9) do i
k = iseven(i) ? i÷2 : 3i+1
k^2
end
9-element Array{Int64,1}:
16
1
100
4
256
9
484
16
784
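If the goal is only a readable multiline comprehension, one more option (a minimal sketch, not something taken from the answers here) is to wrap the body in a begin...end block, which groups several statements into a single expression:

```julia
# A begin...end block evaluates to its last statement, so it can act as the
# body of an ordinary array comprehension.
s = [begin
         k = iseven(i) ? i ÷ 2 : 3i + 1
         k^2
     end for i in 1:9]

println(s)
```

This should produce the same nine values as the mapreduce version above, without needing a reducer.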

The short answer is to write multiline functions (or do-blocks, as @phg points out) and use them with a single-line array comprehension or map/mapreduce:
s = [
(k = iseven(i) ? i÷2 : 3i+1;
k^2)
for i in 1:9
]
This example is a pure comprehension; no reducer is involved. An array comprehension is usually written on one line, for example, s = [(iseven(i) ? i÷2 : 3i+1)^2 for i in 1:9]. As @phg suggested, multi-line functions can be written as a do-block:
julia> map(1:9) do x
k = iseven(x) ? x÷2 : 3x+1
k^2
end
No reducer such as vcat is needed in this case. If, however, the function returns a vector, as f does here:
julia> function f(x)
k = iseven(x) ? x÷2 : 3x+1
[k^2]
end
f (generic function with 1 method)
julia> s = [f(i) for i in 1:9]
9-element Array{Array{Int64,1},1}:
[16]
[1]
[100]
[4]
[256]
[9]
[484]
[16]
[784]
the array comprehension will give you an array of vectors. In that case you need to use mapreduce instead:
julia> mapreduce(f, vcat, 1:9)
9-element Array{Int64,1}:
16
1
100
4
256
9
484
16
784
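When the array of vectors already exists, it can also be flattened after the fact. A small sketch, assuming the same one-element-vector f as above (written here as a one-liner):

```julia
# f returns a one-element vector, mirroring the function in the answer above.
f(x) = [(iseven(x) ? x ÷ 2 : 3x + 1)^2]

nested = [f(i) for i in 1:9]     # a Vector of Vectors
flat = reduce(vcat, nested)      # concatenated into a single Vector{Int}

println(flat == mapreduce(f, vcat, 1:9))
```

Both routes give the same flat vector; mapreduce simply avoids materialising the intermediate array of vectors.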

Related

Is there a way to turn an array into an integer or float?

I'm trying to turn an array of Ints into a single Int in Julia 1.5.4, like this:
x = [1,2,3]
Here I would like to call some function (called example() here):
x_new = example(x)
println(x_new)
typeof(x_new)
The ideal output would be something like this:
123
Int32
I already tried to solve this with parse(), push!() and similar functions, but nothing worked well.
I couldn't find a similar question.
You can find an issue about adding this functionality to Julia here: https://github.com/JuliaLang/julia/issues/40393
Bottom line, you don't want to use strings, and you should avoid unnecessary exponentiation, both of which will be really slow.
A very brief solution is
evalpoly(10, reverse([1,2,3]))
Spelling it out a bit more, you can do this
function joindigits(xs)
val = 0
for x in xs
val = 10*val + x
end
return val
end
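The same accumulation can be written as a left fold. This is only a sketch; the name joindigits_fold is made up for illustration and assumes the elements are single decimal digits:

```julia
# Fold from the left: each step shifts the running value one decimal place
# and adds the next digit. init = 0 also covers the empty-array case.
joindigits_fold(xs) = foldl((acc, d) -> 10 * acc + d, xs; init = 0)

println(joindigits_fold([1, 2, 3]))
# digits() goes the other way, least significant digit first, so reversing
# its output recovers the original array.
println(reverse(digits(123)))
```

Note that with Int elements on a 64-bit machine the result is an Int64 rather than an Int32.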
Is this what you need?
julia> x = [1,2,3]
3-element Vector{Int64}:
1
2
3
julia> list2int(x) = sum(10 .^ (length(x)-1:-1:0) .* x)
list2int (generic function with 1 method)
julia> list2int(x)
123
You are looking for string concatenation and then parsing:
x_new = parse(Int64, string(x...))
Another interesting way to convert many small numbers to a bigger one is to combine raw bytes:
julia> reinterpret(Int16, [Int8(2),Int8(3)])
1-element reinterpret(Int16, ::Vector{Int8}):
770
Note that 770 = 256*3 + 2
Or for actual Ints:
julia> reinterpret(Int128, [10,1])
1-element reinterpret(Int128, ::Vector{Int64}):
18446744073709551626
(note that result is exactly Int128(2)^64+10)

Sorted version of in

I have an array of times event_times and I want to check if t in event_times. However, I know that event_times is sorted. Is there a way to make use of that to make the search faster?
An idiomatic Julian way would be an elaboration of:
struct SortedVector{T,V<:AbstractVector} <: AbstractVector{T}
v::V
SortedVector{T,V}(v::AbstractVector{T}) where {T, V} = new(v)
# check sorted in inner constructor??
end
SortedVector(v::AbstractVector{T}) where T = SortedVector{T,typeof(v)}(v)
@inline Base.size(sv::SortedVector) = size(sv.v)
@inline Base.getindex(sv::SortedVector,i) = sv.v[i]
@inline Base.in(e::T,sv::SortedVector{T}) where T = !isempty(searchsorted(sv.v,e))
And then:
julia> v = SortedVector(sort(rand(1:10,10)))
10-element SortedVector{Int64,Array{Int64,1}}:
1
4
5
5
6
6
6
7
7
10
julia> 3 in v
false
julia> 1 in v
true
If I recall correctly David Sanders had an implementation with this name. Perhaps looking at https://github.com/JuliaIntervals/IntervalOptimisation.jl/blob/889bf43e8a514e696869baaa6af1300ace87b90b/src/SortedVectors.jl would promote reuse.
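On the open question of checking sortedness in the inner constructor, one possible sketch follows. The type name CheckedSortedVector is hypothetical, and the O(n) issorted check on construction is an assumption about what is acceptable, not part of the answer above:

```julia
# A wrapper that refuses unsorted input, so `in` can rely on binary search.
struct CheckedSortedVector{T,V<:AbstractVector{T}} <: AbstractVector{T}
    v::V
    function CheckedSortedVector(v::AbstractVector{T}) where {T}
        issorted(v) || throw(ArgumentError("input must be sorted"))
        return new{T,typeof(v)}(v)
    end
end

Base.size(sv::CheckedSortedVector) = size(sv.v)
Base.getindex(sv::CheckedSortedVector, i::Int) = sv.v[i]
# searchsorted returns the (possibly empty) range of positions equal to x.
Base.in(x, sv::CheckedSortedVector) = !isempty(searchsorted(sv.v, x))

sv = CheckedSortedVector([1, 4, 5, 5, 6, 10])
println(4 in sv)
println(3 in sv)
```

The check costs one linear pass per construction, which is usually negligible next to the repeated lookups it protects.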
Following @ColinTBowers's hint, you can use the fact that searchsorted returns a range which is empty iff t is not in event_times. Thus !isempty(searchsorted(event_times,t)) is a fast method to get the answer.
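A short sketch of that check, using made-up values for event_times. It also shows Base.insorted, which performs the same test but requires Julia 1.6 or later:

```julia
event_times = [0.5, 1.25, 2.0, 3.75, 5.0]   # hypothetical data, already sorted
t = 2.0

found_by_range = !isempty(searchsorted(event_times, t))  # binary search
found_by_insorted = insorted(t, event_times)             # Julia >= 1.6 only
found_by_scan = t in event_times                         # plain linear scan

println((found_by_range, found_by_insorted, found_by_scan))
```

All three agree on sorted input; only the first two take advantage of the ordering.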

Killing a For loop in Julia array comprehension

I have the following line of code in Julia:
X=[(i,i^2) for i in 1:100 if i^2%5==0]
Basically, it returns a list of tuples (i,i^2) for i from 1 to 100 where the remainder of i^2 divided by 5 is zero. What I want to do is, in the array comprehension, break out of the for loop once i^2 becomes larger than 1000. However, if I implement
X=[(i,i^2) for i in 1:100 if i^2%5==0 else break end]
I get the error: syntax: expected "]".
Is there any way to easily break out of this for loop inside the array? I've tried looking online, but nothing came up.
It's a "fake" for-loop, so you can't break it. Take a look at the lowered code below:
julia> foo() = [(i,i^2) for i in 1:100 if i^2%5==0]
foo (generic function with 1 method)
julia> @code_lowered foo()
LambdaInfo template for foo() at REPL[0]:1
:(begin
nothing
#1 = $(Expr(:new, :(Main.##1#3)))
SSAValue(0) = #1
#2 = $(Expr(:new, :(Main.##2#4)))
SSAValue(1) = #2
SSAValue(2) = (Main.colon)(1,100)
SSAValue(3) = (Base.Filter)(SSAValue(1),SSAValue(2))
SSAValue(4) = (Base.Generator)(SSAValue(0),SSAValue(3))
return (Base.collect)(SSAValue(4))
end)
The output shows that array comprehension is implemented via Base.Generator which takes an iterator as input. It only supports the [if cond(x)::Bool] "guard" for now, so there is no way to use break here.
For your specific case, a workaround is to use isqrt:
julia> X=[(i,i^2) for i in 1:isqrt(1000) if i^2%5==0]
6-element Array{Tuple{Int64,Int64},1}:
(5,25)
(10,100)
(15,225)
(20,400)
(25,625)
(30,900)
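Since the comprehension is driven by an iterator, another way to get an early stop is to truncate the iterator itself. The following sketch assumes Julia 1.4 or later, where Iterators.takewhile is available; it is an alternative to the isqrt workaround, not part of the answer above:

```julia
# takewhile stops consuming 1:100 at the first n whose square exceeds 1000,
# which plays the role of `break`; the `if` guard then filters as before.
X = [(i, i^2) for i in Iterators.takewhile(n -> n^2 <= 1000, 1:100) if i^2 % 5 == 0]

println(X)
```

Unlike isqrt, this form works for stopping conditions that cannot easily be inverted into an upper bound.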
I don't think so. You could always just fold the upper bound into the filter:
tmp(i) = (j = i^2; j > 1000 ? false : j%5==0)
X=[(i,i^2) for i in 1:100 if tmp(i)]
Using a for loop is considered idiomatic in Julia and could be more readable in this instance. Also, it could be faster.
Specifically:
julia> using BenchmarkTools
julia> tmp(i) = (j = i^2; j > 1000 ? false : j%5==0)
julia> X1 = [(i,i^2) for i in 1:100 if tmp(i)];
julia> @btime [(i,i^2) for i in 1:100 if tmp(i)];
471.883 ns (7 allocations: 528 bytes)
julia> X2 = [(i,i^2) for i in 1:isqrt(1000) if i^2%5==0];
julia> @btime [(i,i^2) for i in 1:isqrt(1000) if i^2%5==0];
281.435 ns (7 allocations: 528 bytes)
julia> function goodsquares()
res = Vector{Tuple{Int,Int}}()
for i=1:100
if i^2%5==0 && i^2<=1000
push!(res,(i,i^2))
elseif i^2>1000
break
end
end
return res
end
julia> X3 = goodsquares();
julia> @btime goodsquares();
129.123 ns (3 allocations: 304 bytes)
So, another 2x improvement is nothing to disregard and the long function gives plenty of room for illuminating comments.

How can the speed of nested arrays in Julia be improved?

The following function nested_arrays generates (surprisingly) a nested array of "depth" n. However, when running with even small values of n (2, 3, etc.), it takes a reasonably long time to run and display the output.
julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)
julia> nested_arrays(1)
1-element Array{Int64,1}:
1
julia> nested_arrays(2)
1-element Array{Array{Int64,1},1}:
[1]
julia> nested_arrays(3)
1-element Array{Array{Array{Int64,1},1},1}:
Array{Int64,1}[[1]]
julia> nested_arrays(10)
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}:
Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]
Interestingly, timing with the @time macro, or suppressing the output with a ; at the end of the line, shows that computing the result takes relatively little time. Instead, actually displaying the result in the REPL takes the majority of the time.
This strange behavior is not shown in, for example, Python.
In [1]: def nested_lists(n):
...: if n == 1:
...: return [1]
...: return [nested_lists(n - 1)]
...:
In [2]: nested_lists(10)
Out[2]: [[[[[[[[[[1]]]]]]]]]]
In [3]: %time nested_lists(100)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 37.7 µs
Out[3]: [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
Why is this function so slow in Julia? Is Julia recompiling the display function for different types T in Array{T, 1}? If so, why is this?
Can the speed of this code be improved, or should this just not be done in Julia? My main concern for this in a practical sense would be, for example, loading a complex, nested JSON file, where simply using an n-dimensional array would not be possible.
Yes, this is entirely due to compilation time. You can see this by @time-ing the display. The second time you display it is fast:
julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)
julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
11.682721 seconds (8.83 M allocations: 371.698 MB, 1.82% gc time)
julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
0.001688 seconds (2.38 k allocations: 102.766 KB)
So why is this so slow? The display here recursively walks through all the arrays and prints them nested inside each other. This is recursively calling show with 14 different types — one with 14 nested arrays, then its element with 13 nested arrays, then its element with 12… and so on! Each of those show methods gets independently compiled. Compiling specialized methods for specific element types is a key part of how Julia is able to produce very efficient code. It means that it's able to specialize every single operation done on each element without any runtime type checking or dispatch. Unfortunately in this case, it gets in the way.
You can work around this with an Any[] array; in the context of a JSON file this makes quite a lot of sense since you don't know if it'll contain strings or arrays or numbers, etc. This is much faster since it only needs to compile the show method for an Any[] array once, and then it recursively uses it.
# new session
julia> nested_arrays(n) = n == 1 ? Any[1] : Any[nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)
julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
1.571632 seconds (767.12 k allocations: 32.472 MB, 1.04% gc time)
julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
0.000606 seconds (839 allocations: 30.859 KB)
julia> @time display(nested_arrays(100));
1-element Array{Any,1}:
Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
0.002523 seconds (17.76 k allocations: 579.297 KB)
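The difference between the two versions can be seen without timing anything, by counting the distinct concrete types each one produces. A small sketch; the names typed_nest and untyped_nest are introduced here only to keep both variants side by side:

```julia
typed_nest(n) = n == 1 ? [1] : [typed_nest(n - 1)]
untyped_nest(n) = n == 1 ? Any[1] : Any[untyped_nest(n - 1)]

# Every extra level of typed nesting is a new array type, and each new type
# needs its own specialised show method to be compiled.
typed_types = unique(typeof(typed_nest(n)) for n in 1:5)
# With Any[] every level has the same type, so one compiled method is reused.
untyped_types = unique(typeof(untyped_nest(n)) for n in 1:5)

println(length(typed_types))
println(length(untyped_types))
```

Five depths give five distinct types in the typed version and a single type in the Any version, which matches the compilation cost seen in the timings.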

Creating an array with "hidden" extra indices

type ExtendedJumpArray{T,T2} <: AbstractArray{Float64,1}
u::T
jump_u::T2
end
Base.length(A::ExtendedJumpArray) = length(A.u)
Base.size(A::ExtendedJumpArray) = (length(A),)
function Base.getindex(A::ExtendedJumpArray,i::Int)
i <= length(A.u) ? A.u[i] : A.jump_u[i-length(A.u)]
end
function Base.setindex!(A::ExtendedJumpArray,v,i::Int)
i <= length(A.u) ? (A.u[i] = v) : (A.jump_u[i-length(A.u)] = v)
end
similar(A::ExtendedJumpArray) = deepcopy(A)
indices(A::ExtendedJumpArray) = Base.OneTo(length(A.u) + length(A.jump_u))
I thought I was the cool kid on the block, creating an array which could index past its length (I am doing it for a specific reason). But Julia apparently doesn't like this:
julia> ExtendedJumpArray([0.2],[-2.0])
Error showing value of type ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}:
ERROR: MethodError: no method matching inds2string(::Int64)
Closest candidates are:
inds2string(::Tuple{Vararg{AbstractUnitRange,N}}) at show.jl:1485
in _summary(::ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}, ::Int64) at .\show.jl:1490
in #showarray#330(::Bool, ::Function, ::IOContext{Base.Terminals.TTYTerminal}, ::ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}, ::Bool) at .\show.jl:1599
in display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::MIME{Symbol("text/plain")}, ::ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}) at .\REPL.jl:132
in display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}) at .\REPL.jl:135
in display(::ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}) at .\multimedia.jl:143
in print_response(::Base.Terminals.TTYTerminal, ::Any, ::Void, ::Bool, ::Bool, ::Void) at .\REPL.jl:154
in print_response(::Base.REPL.LineEditREPL, ::Any, ::Void, ::Bool, ::Bool) at .\REPL.jl:139
in (::Base.REPL.##22#23{Bool,Base.REPL.##33#42{Base.REPL.LineEditREPL,Base.REPL.REPLHistoryProvider},Base.REPL.LineEditREPL,Base.LineEdit.Prompt})(::Base.LineEdit.MIState, ::Base.AbstractIOBuffer{Array{UInt8,1}}, ::Bool) at .\REPL.jl:652
in run_interface(::Base.Terminals.TTYTerminal, ::Base.LineEdit.ModalInterface) at .\LineEdit.jl:1579
in run_frontend(::Base.REPL.LineEditREPL, ::Base.REPL.REPLBackendRef) at .\REPL.jl:903
in run_repl(::Base.REPL.LineEditREPL, ::Base.##932#933) at .\REPL.jl:188
in _start() at .\client.jl:360
Is there an easy way to do this without breaking the show methods, and whatever else may be broken? Or is there a better way to do this in general?
indices needs to return a tuple, just like size does.
julia> Base.similar(A::ExtendedJumpArray) = deepcopy(A)
julia> Base.indices(A::ExtendedJumpArray) = (Base.OneTo(length(A.u) + length(A.jump_u)),)
julia> ExtendedJumpArray([0.2],[-2.0])
2-element ExtendedJumpArray{Array{Float64,1},Array{Float64,1}}:
0.2
-2.0
julia> length(ans)
1
Having indices and size disagree in the dimensionality of an array, though, is likely to end with confusion and strife. Some functions use size, whereas others use indices. See display vs. length above.
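For readers on Julia 1.x, where type became struct and indices became axes, one way to avoid the disagreement is to leave axes at its default (derived from size) and only let getindex and setindex! reach past the end. This is a rough sketch under those assumptions; the name HiddenTailVector is hypothetical, and code that bounds-checks against axes will still only see the visible part:

```julia
# Visible length comes from `u` alone; `jump_u` is reachable only by
# indexing past the end.
struct HiddenTailVector{T,U<:AbstractVector{T},J<:AbstractVector{T}} <: AbstractVector{T}
    u::U
    jump_u::J
end

Base.size(A::HiddenTailVector) = size(A.u)   # default axes follow from size
Base.IndexStyle(::Type{<:HiddenTailVector}) = IndexLinear()

function Base.getindex(A::HiddenTailVector, i::Int)
    n = length(A.u)
    return i <= n ? A.u[i] : A.jump_u[i - n]
end

function Base.setindex!(A::HiddenTailVector, v, i::Int)
    n = length(A.u)
    i <= n ? (A.u[i] = v) : (A.jump_u[i - n] = v)
    return A
end

A = HiddenTailVector([0.2], [-2.0])
println(length(A))    # only the visible part is counted
println(A[2])         # the hidden element is still reachable
println(collect(A))   # iteration and display see the visible part only
```

Because size and axes agree here, display and length report the same one-element array, while A[2] still reads the hidden entry.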
