Iteration index in a Julia for loop

I have to code a very simple for loop in Julia, which I reproduce below:
result=fill([],6,1)
E=rand(5,5)
D=3.27
k=2
for s in [0.5,0.75,1,1.25,1.5,2]
result[??]=exp.(-(E.^2/D)/(2*s*k))
end
At each iteration, I want the i-th element of result to be filled with the value of the function evaluated at the i-th element of the array [0.5,0.75,1,1.25,1.5,2]. I don't know what to put inside the brackets [??].
So far, I tried
for (index, value) in enumerate([0.5,0.75,1,1.25,1.5,2])
result["$index"]=exp.(-(E.^2/D)/(2* "$value" *k))
end
but it doesn't work. Any hint?

You're currently initialising each element of result as a one-dimensional array ([] is a Vector{Any}), but the values you assign are two-dimensional matrices. So you need to initialise result with matrix elements, as follows:
result = fill(Array{Float64}(undef,0,0),6,1)
You shouldn't need to do any conversion of the types and the following will just work.
for (index, value) in enumerate([0.5,0.75,1,1.25,1.5,2])
result[index]=exp.(-(E.^2/D)/(2*value*k))
end
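To see why the original fill([],6,1) could not hold the results, here is a small sketch. The names shared, bad and assignment_failed are illustrative and not part of the question. It also shows that fill stores the same object in every slot, which is harmless in the loop above only because each slot is replaced by assignment rather than mutated.

```julia
# `fill` stores the same object in every slot; it does not copy it.
shared = fill(Float64[], 3)
push!(shared[1], 1.0)
println(shared)            # every slot shows the pushed value

# The question's container has 1-D elements (Vector{Any}) ...
bad = fill([], 6, 1)
println(eltype(bad))

# ... so storing a Matrix{Float64} in it fails at the conversion step.
assignment_failed = try
    bad[1] = rand(2, 2)
    false
catch
    true
end
println(assignment_failed)
```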
Rather than initialising result up front, you can also map over the values, which is a bit easier to read.
result = map(x -> exp.(-(E.^2/D)/(2*x*k)), [0.5, 0.75, 1, 1.25, 1.5, 2])
Some comments on performance
using BenchmarkTools
function t1()
result=fill(Array{Float64}(undef,0,0),6,1)
E=rand(5,5)
D=3.27
k=2
for (index, value) in enumerate([0.5,0.75,1,1.25,1.5,2])
result[index]=exp.(-(E.^2/D)/(2*value*k))
end
end
function t2()
E=rand(5,5)
D=3.27
k=2
result = map(x -> exp.(-(E.^2/D)/(2*x*k)), [0.5, 0.75, 1, 1.25, 1.5, 2])
end
@btime t1() # 4.904 μs (49 allocations: 9.66 KiB)
@btime t2() # 4.812 μs (50 allocations: 9.64 KiB)
As you can see, there is no real difference in performance. If you want to improve performance, the easiest step is to pull the part of the expression that does not depend on the loop variable out of the inner loop.
function t3()
E=rand(5,5)
D=3.27
k=2
f = -(E.^2/D)/(2*k)
result = map(x -> exp.(f/x), [0.5, 0.75, 1, 1.25, 1.5, 2])
end
@btime t3() # 3.168 μs (31 allocations: 5.53 KiB)
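The same hoisting idea can be written as an array comprehension that takes its inputs as arguments instead of creating them inside the function. This is only a sketch: the function name scaled_kernels and its argument names are hypothetical, and no timings are claimed for it.

```julia
# Hypothetical helper: returns one matrix per scale value in `svals`.
function scaled_kernels(E, D, k, svals)
    f = -(E .^ 2 ./ D) ./ (2 * k)      # the part that does not depend on s
    return [exp.(f ./ s) for s in svals]
end

E = rand(5, 5)
result = scaled_kernels(E, 3.27, 2, [0.5, 0.75, 1, 1.25, 1.5, 2])
println(length(result), " matrices of size ", size(result[1]))
```

Passing E, D and k as arguments also keeps random-number generation out of whatever is being benchmarked.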

Assuming result should be a vector of matrices:
els = [0.5,0.75,1,1.25,1.5,2]
result=Vector{Matrix{Float64}}(undef, length(els))
E=rand(5,5)
D=3.27
k=2
for i in eachindex(els)
result[i]=exp.(-(E.^2/D)/(2*els[i]*k))
end

Related

Efficient method for checking monotonicity in array Julia?

I'm trying to come up with a fast way to make sure a column of an array is monotonic in Julia.
The slow (and obvious) way to do it that I have been using is something like this:
function check_monotonicity(
timeseries::Array,
column::Int
)
current = timeseries[1,column]
for row in 1:size(timeseries, 1)
if timeseries[row,column] > current
return false
end
current = copy(timeseries[row,column])
end
return true
end
So that it works something like this:
julia> using Distributions
julia> mono_matrix = hcat(collect(0:0.1:10), rand(Uniform(0.4,0.6),101),reverse(collect(0.0:0.1:10.0)))
101×3 Matrix{Float64}:
0.0 0.574138 10.0
0.1 0.490671 9.9
0.2 0.457519 9.8
0.3 0.567687 9.7
⋮
9.8 0.513691 0.2
9.9 0.589585 0.1
10.0 0.405018 0.0
julia> check_monotonicity(mono_matrix, 2)
false
And then for the opposite example:
julia> check_monotonicity(mono_matrix, 3)
true
Does anyone know a more efficient way to do this for long time series?
Your implementation is certainly not slow! It is very nearly optimally fast. You should definitely get rid of the copy. Though it doesn't hurt when the matrix elements are just plain data, it can be bad in other cases, perhaps for BigInt for example.
This is close to your original effort, but also more robust with respect to indexing and array types:
function ismonotonic(A::AbstractMatrix, column::Int, cmp = <)
current = A[begin, column] # begin instead of 1
for i in axes(A, 1)[2:end] # skip the first element
newval = A[i, column] # don't use copy here
cmp(newval, current) && return false
current = newval
end
return true
end
Another tip: You don't need to use collect. In fact, you should almost never use collect. Do this instead (I removed Uniform since I don't have Distributions.jl):
mono_matrix = hcat(0:0.1:10, rand(101), reverse(0:0.1:10)) # or 10:-0.1:0
Or perhaps this is better, since you have more control over the number of elements in the range:
mono_matrix = hcat(range(0, 10, 101), rand(101), range(10, 0, 101))
Then you can use it like this:
1.7.2> ismonotonic(mono_matrix, 3)
false
1.7.2> ismonotonic(mono_matrix, 3, >=)
true
1.7.2> ismonotonic(mono_matrix, 1)
true
In mathematics typically we define a series to be monotonic if it is non-decreasing or non-increasing. If this is what you want then do:
issorted(view(mono_matrix, :, 2), rev=true)
to check if it is non-increasing, and:
issorted(view(mono_matrix, :, 2))
to check if it is non-decreasing.
If you want a strict check, do:
issorted(view(mono_matrix, :, 3), rev=true, lt = <=)
for strictly decreasing, but treating 0.0 and -0.0 as equal, or
issorted(view(mono_matrix, :, 3), lt = <=)
for strictly increasing, but treating 0.0 and -0.0 as equal.
If you want to distinguish 0.0 and -0.0 then do respectively:
issorted(view(mono_matrix, :, 3), rev=true, lt = (x, y) -> isequal(x, y) || isless(x, y))
issorted(view(mono_matrix, :, 3), lt = (x, y) -> isequal(x, y) || isless(x, y))
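The difference between these variants is easiest to see on two tiny vectors. The following is a minimal sketch. The wrapper names (nonincreasing, strictly_decreasing, strictly_decreasing_signed) and the test vectors are made up for illustration, and only the issorted calls come from the answer above.

```julia
# Thin wrappers around the `issorted` calls shown above.
nonincreasing(v) = issorted(v, rev=true)
strictly_decreasing(v) = issorted(v, rev=true, lt = <=)
strictly_decreasing_signed(v) =
    issorted(v, rev=true, lt = (x, y) -> isequal(x, y) || isless(x, y))

plateau = [3.0, 2.0, 2.0, 1.0]   # contains a repeated value
zeros_pair = [0.0, -0.0]         # equal under <=, distinct under isless

for v in (plateau, zeros_pair)
    println(v, " => ", (nonincreasing(v), strictly_decreasing(v),
                        strictly_decreasing_signed(v)))
end
```

The repeated 2.0 passes the non-strict check but fails both strict ones, while the pair 0.0, -0.0 counts as strictly decreasing only when the two zeros are distinguished.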

Julia: rationale behind array size and index for "extra" dimensions?

I am using Julia from time to time, however I am surprised by the following behavior:
Let's define a 3x4 array
julia> m=rand(3,4)
3×4 Array{Float64,2}:
0.889018 0.500847 0.539856 0.828231
0.492425 0.582958 0.521406 0.754102
0.28227 0.834333 0.669967 0.0939701
Now I check that
julia> size(m,1), size(m,2)
(3, 4)
as expected.
However, I am surprised by this:
julia> size(m,3), size(m,2018)
(1, 1)
-> I would have expected (0,0) or an error message
Looking at the Julia code confirms this behavior:
size(t::AbstractArray{T,N}, d) where {T,N} = d <= N ? size(t)[d] : 1
Moreover:
julia> m[2,1,1,1,1]
0.4924252391289974
-> I would have expected an out of bounds error
So my question is: "what is the rationale?"
(I do not think it is a bug; I use Julia version 0.6.2.)
I believe it's for broadcasting.
julia> m=rand(3,4)
3×4 Array{Float64,2}:
0.139323 0.663912 0.994985 0.517332
0.423913 0.121753 0.0327054 0.0754665
0.392672 0.47006 0.351121 0.787318
julia> size(m)
(3, 4)
julia> n = rand(3)
3-element Array{Float64,1}:
0.716752
0.98755
0.661226
julia> m .* n
3×4 Array{Float64,2}:
0.09986 0.475861 0.713157 0.370799
0.418636 0.120237 0.0322983 0.074527
0.259645 0.310816 0.23217 0.520595
Notice that n has one dimension fewer than m, so its size is 1 in the 2nd dimension and it is therefore applied to every column. Scalars in broadcast are treated differently: they are generally inlined into the fused broadcasting function, which cannot be done with a mutable type. The rule that a size of 1 expands in higher dimensions is therefore a nice way to implement broadcasting.
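The trailing-dimension rule can be checked directly. This sketch uses a small deterministic matrix instead of rand so the comparisons are exact. The variable names are illustrative.

```julia
m = reshape(collect(1.0:12.0), 3, 4)   # 3×4 matrix with known entries
n = [10.0, 20.0, 30.0]                 # 3-element vector

println(size(m, 3))                          # size along a trailing dimension
println(m[2, 1, 1, 1] == m[2, 1])            # extra indices equal to 1 are accepted
println(m .* n == m .* reshape(n, 3, 1))     # n behaves like a 3×1 array

# A trailing index other than 1 is still out of bounds.
out_of_bounds = try
    m[2, 1, 2]
    false
catch err
    err isa BoundsError
end
println(out_of_bounds)
```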

Killing a For loop in Julia array comprehension

I have the following line of code in Julia:
X=[(i,i^2) for i in 1:100 if i^2%5==0]
Basically, it returns a list of tuples (i,i^2) from i=1 to 100 if the remainder of i^2 and 5 is zero. What I want to do is, in the array comprehension, break out of the for loop if i^2 becomes larger than 1000. However, if I implement
X=[(i,i^2) for i in 1:100 if i^2%5==0 else break end]
I get the error: syntax: expected "]".
Is there any way to easily break out of this for loop inside the array? I've tried looking online, but nothing came up.
It's a "fake" for-loop, so you can't break it. Take a look at the lowered code below:
julia> foo() = [(i,i^2) for i in 1:100 if i^2%5==0]
foo (generic function with 1 method)
julia> @code_lowered foo()
LambdaInfo template for foo() at REPL[0]:1
:(begin
nothing
#1 = $(Expr(:new, :(Main.##1#3)))
SSAValue(0) = #1
#2 = $(Expr(:new, :(Main.##2#4)))
SSAValue(1) = #2
SSAValue(2) = (Main.colon)(1,100)
SSAValue(3) = (Base.Filter)(SSAValue(1),SSAValue(2))
SSAValue(4) = (Base.Generator)(SSAValue(0),SSAValue(3))
return (Base.collect)(SSAValue(4))
end)
The output shows that array comprehension is implemented via Base.Generator which takes an iterator as input. It only supports the [if cond(x)::Bool] "guard" for now, so there is no way to use break here.
For your specific case, a workaround is to use isqrt:
julia> X=[(i,i^2) for i in 1:isqrt(1000) if i^2%5==0]
6-element Array{Tuple{Int64,Int64},1}:
(5,25)
(10,100)
(15,225)
(20,400)
(25,625)
(30,900)
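When the cutoff cannot be turned into a new upper bound as easily as with isqrt, another option, not mentioned in the answer above, is to truncate the iterator itself. The sketch below assumes Julia 1.4 or later, where Iterators.takewhile is available. It stops consuming 1:100 at the first i whose square exceeds 1000, which is the effect a break would have had.

```julia
# Stop iterating at the first i with i^2 > 1000, then apply the usual guard.
X = [(i, i^2) for i in Iterators.takewhile(i -> i^2 <= 1000, 1:100) if i^2 % 5 == 0]
println(X)
```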
I don't think so. You could always move the cutoff into a helper predicate:
tmp(i) = (j = i^2; j > 1000 ? false : j%5==0)
X=[(i,i^2) for i in 1:100 if tmp(i)]
Using a for loop is considered idiomatic in Julia and could be more readable in this instance. Also, it could be faster.
Specifically:
julia> using BenchmarkTools
julia> tmp(i) = (j = i^2; j > 1000 ? false : j%5==0)
julia> X1 = [(i,i^2) for i in 1:100 if tmp(i)];
julia> @btime [(i,i^2) for i in 1:100 if tmp(i)];
471.883 ns (7 allocations: 528 bytes)
julia> X2 = [(i,i^2) for i in 1:isqrt(1000) if i^2%5==0];
julia> @btime [(i,i^2) for i in 1:isqrt(1000) if i^2%5==0];
281.435 ns (7 allocations: 528 bytes)
julia> function goodsquares()
res = Vector{Tuple{Int,Int}}()
for i=1:100
if i^2%5==0 && i^2<=1000
push!(res,(i,i^2))
elseif i^2>1000
break
end
end
return res
end
julia> X3 = goodsquares();
julia> @btime goodsquares();
129.123 ns (3 allocations: 304 bytes)
So, another 2x improvement is nothing to disregard and the long function gives plenty of room for illuminating comments.

How can the speed of nested arrays in Julia be improved?

The following function nested_arrays generates (surprisingly) a nested array of "depth" n. However, when running with even small values of n (2, 3, etc.), it takes a reasonably long time to run and display the output.
julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)
julia> nested_arrays(1)
1-element Array{Int64,1}:
1
julia> nested_arrays(2)
1-element Array{Array{Int64,1},1}:
[1]
julia> nested_arrays(3)
1-element Array{Array{Array{Int64,1},1},1}:
Array{Int64,1}[[1]]
julia> nested_arrays(10)
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}:
Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]
Interestingly, when using the @time macro or a ; at the end of the line, computing the result takes relatively little time. Instead, displaying the result in the REPL takes the majority of the time.
This strange behavior is not shown in, for example, Python.
In [1]: def nested_lists(n):
...: if n == 1:
...: return [1]
...: return [nested_lists(n - 1)]
...:
In [2]: nested_lists(10)
Out[2]: [[[[[[[[[[1]]]]]]]]]]
In [3]: %time nested_lists(100)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 37.7 µs
Out[3]: [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
Why is this function so slow in Julia? Is Julia recompiling the display function for different types T in Array{T, 1}? If so, why is this?
Can the speed of this code be improved, or should this just not be done in Julia? My main concern for this in a practical sense would be, for example, loading a complex, nested JSON file, where simply using an n-dimensional array would not be possible.
Yes, this is entirely due to compilation time. You can see this by timing the display with @time. The second time you display it, it is fast:
julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)
julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
11.682721 seconds (8.83 M allocations: 371.698 MB, 1.82% gc time)
julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
0.001688 seconds (2.38 k allocations: 102.766 KB)
So why is this so slow? The display here recursively walks through all the arrays and prints them nested inside each other. This is recursively calling show with 14 different types — one with 14 nested arrays, then its element with 13 nested arrays, then its element with 12… and so on! Each of those show methods gets independently compiled. Compiling specialized methods for specific element types is a key part of how Julia is able to produce very efficient code. It means that it's able to specialize every single operation done on each element without any runtime type checking or dispatch. Unfortunately in this case, it gets in the way.
You can work around this with an Any[] array; in the context of a JSON file this makes quite a lot of sense since you don't know if it'll contain strings or arrays or numbers, etc. This is much faster since it only needs to compile the show method for an Any[] array once, and then it recursively uses it.
# new session
julia> nested_arrays(n) = n == 1 ? Any[1] : Any[nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)
julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
1.571632 seconds (767.12 k allocations: 32.472 MB, 1.04% gc time)
julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
0.000606 seconds (839 allocations: 30.859 KB)
julia> @time display(nested_arrays(100));
1-element Array{Any,1}:
Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
0.002523 seconds (17.76 k allocations: 579.297 KB)
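The reason the Any[] version needs only one show method is visible in the types themselves. The sketch below uses two hypothetical function names, nested_typed and nested_any, to compare them. It prints types only and makes no timing claims.

```julia
# Concretely typed nesting: every depth produces a new array type.
nested_typed(n) = n == 1 ? [1] : [nested_typed(n - 1)]

# Any-typed nesting: every depth has the same array type.
nested_any(n) = n == 1 ? Any[1] : Any[nested_any(n - 1)]

for depth in 1:3
    println(depth, ": ", typeof(nested_typed(depth)), "  vs  ", typeof(nested_any(depth)))
end
```

Each distinct type in the left column is a separate specialization for the compiler, whereas the right column never changes.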

Fortran-like arrays such as FArray(Float64, -1:1,-7:7,-128:512) in Julia

Generally, having 1-based arrays in Julia is a good decision, but sometimes it is desirable to have a Fortran-like array with indices that span some subranges of ℤ:
julia> x = FArray(Float64, -1:1,-7:7,-128:512)
where it would be useful:
in the codes accompanying the book Numerical Solution of Hyperbolic Partial Differential Equations by prof. John A. Trangenstein these negative indices are used intensively for ghost cells for boundary conditions.
The same is true for Clawpack (stands for “Conservation Laws Package”) by prof. Randall J. LeVeque http://depts.washington.edu/clawpack/ and there are many other codes where such indices would be natural.
So such an auxiliary type would be useful for speedy translation of such codes.
I just started to implement such an auxiliary type but as I'm quite new to Julia your help would be greatly appreciated.
I started with:
type FArray
ranges
array::Array
function FArray(T, r::Range1{Int}...)
dims = map((x) -> length(x), r)
array = Array(T, dims)
new(r, array)
end
end
Output:
julia> include("FortranArray.jl")
julia> x = FArray(Float64, -1:1,-7:7,-128:512)
FArray((-1:1,-7:7,-128:512),3x15x641 Array{Float64,3}:
[:, :, 1] =
6.90321e-310 2.6821e-316 1.96042e-316 0.0 0.0 0.0 9.84474e-317 … 1.83233e-316 2.63285e-316 0.0 9.61618e-317 0.0
6.90321e-310 6.32404e-322 2.63285e-316 0.0 0.0 0.0 2.63292e-316 2.67975e-316
...
[:, :, 2] =
...
As I'm completely new to Julia, any recommendations, especially ones that lead to more efficient code, would be greatly appreciated.
[Edit]
The topic has been discussed here:
https://groups.google.com/forum/#!topic/julia-dev/NOF6MA6tb9Y
During the discussion two ways to have Julia arrays with arbitrary base were elaborated:
SubArray-based, sample usage is with a helper function:
function farray(T, r::Range1{Int64}...)
dims = map((x) -> length(x), r)
array = Array(T, dims)
sub_indices = map((x) -> -minimum(x) + 2 : maximum(x), r)
sub(array, sub_indices)
end
julia> y = farray(Float64, -1:1,-7:7,-128:512);
julia> y[-1,-7,-128] = 777
777
julia> y[-1,-7,-128] + 33
810.0
julia> y[-2,-7,-128]
ERROR: BoundsError()
in getindex at subarray.jl:200
julia> y[2,-7,-128]
2.3977385e-316
Please note that bounds are not fully checked; more details are here:
https://github.com/JuliaLang/julia/issues/4044
At the moment SubArray has performance issues but eventually its performance might be improved, see also:
https://github.com/JuliaLang/julia/issues/5117
https://github.com/JuliaLang/julia/issues/3496
Another approach, which has better performance at the moment and also checks both bounds:
type FArray{T<:Number, N, A<:AbstractArray} <: AbstractArray
ranges
offsets::NTuple{N,Int}
array::A
function FArray(r::Range1{Int}...)
dims = map((x) -> length(x), r)
array = Array(T, dims)
offs = map((x) -> 1 - minimum(x), r)
new(r, offs, array)
end
end
FArray(T, r::Range1{Int}...) = FArray{T, length(r,), Array{T, length(r,)}}(r...)
getindex{T<:Number}(FA::FArray{T}, i1::Int) = FA.array[i1+FA.offsets[1]]
getindex{T<:Number}(FA::FArray{T}, i1::Int, i2::Int) = FA.array[i1+FA.offsets[1], i2+FA.offsets[2]]
getindex{T<:Number}(FA::FArray{T}, i1::Int, i2::Int, i3::Int) = FA.array[i1+FA.offsets[1], i2+FA.offsets[2], i3+FA.offsets[3]]
setindex!{T}(FA::FArray{T}, x, i1::Int) = arrayset(FA.array, convert(T,x), i1+FA.offsets[1])
setindex!{T}(FA::FArray{T}, x, i1::Int, i2::Int) = arrayset(FA.array, convert(T,x), i1+FA.offsets[1], i2+FA.offsets[2])
setindex!{T}(FA::FArray{T}, x, i1::Int, i2::Int, i3::Int) = arrayset(FA.array, convert(T,x), i1+FA.offsets[1], i2+FA.offsets[2], i3+FA.offsets[3])
getindex and setindex! methods for FArray were inspired by base/array.jl code.
Use cases:
julia> y = FArray(Float64, -1:1,-7:7,-128:512);
julia> y[-1,-7,-128] = 777
777
julia> y[-1,-7,-128] + 33
810.0
julia> y[-1,2,3]
0.0
julia> y[-2,-7,-128]
ERROR: BoundsError()
in getindex at FortranArray.jl:27
julia> y[2,-7,-128]
ERROR: BoundsError()
in getindex at FortranArray.jl:27
There are now two packages that provide this kind of functionality. For arrays with arbitrary start indices, see https://github.com/alsam/OffsetArrays.jl. For even more flexibility see https://github.com/mbauman/AxisArrays.jl, where indices do not have to be integers.
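The code in the question uses syntax from an early Julia release (type, Range1, Array(T, dims)) that current versions no longer accept. As a rough illustration of the same offset-translation idea in current syntax, here is a minimal sketch. The name FArraySketch is hypothetical. Unlike the original it does not subtype AbstractArray, it zero-initialises its storage, and it is not a replacement for the packages mentioned above.

```julia
# Hypothetical minimal wrapper: stores a plain Array plus one offset per dimension.
struct FArraySketch{T,N}
    data::Array{T,N}
    offsets::NTuple{N,Int}
end

# Build from an element type and one index range per dimension.
function FArraySketch(::Type{T}, ranges::Vararg{UnitRange{Int},N}) where {T,N}
    data = zeros(T, map(length, ranges))        # zero-filled storage
    offsets = map(r -> 1 - first(r), ranges)    # maps first(r) to index 1
    return FArraySketch{T,N}(data, offsets)
end

# Shift each index by its offset; the wrapped Array does the bounds checking.
Base.getindex(A::FArraySketch{T,N}, I::Vararg{Int,N}) where {T,N} =
    A.data[(I .+ A.offsets)...]

Base.setindex!(A::FArraySketch{T,N}, x, I::Vararg{Int,N}) where {T,N} =
    (A.data[(I .+ A.offsets)...] = x)

y = FArraySketch(Float64, -1:1, -7:7, -128:512)
y[-1, -7, -128] = 777
println(y[-1, -7, -128] + 33)
```

Because the shifted indices are passed to an ordinary Array, indices below or above the declared ranges raise a BoundsError in both directions.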
