Custom searchsortedfirst method - arrays

I'm kinda new to Julia, so I'm still struggling with reading the Julia documentation. Here is a piece of it, and I am looking for an explanation, specifically of the bolded part.
Base.Sort.searchsortedfirst — Function.
searchsortedfirst(a, x, [by=<transform>,] [lt=<comparison>,] [rev=false])
Returns the index of the first value in a greater than or equal to x,
according to the specified order. Returns length(a)+1 if x is greater
than all values in a. a is assumed to be sorted.
My array looks like this:
A = Vector{Record}()
where
type Record
y::Int64
value::Float64
end
Now here is my problem. I would like to call the above-mentioned method on my array and obtain the Record whose y equals the given x (Record.y == x). I guess I have to write a 'by' transform or an 'lt' comparator? Or both?
Any help would be appreciated :)

@crstnbr has provided a perfectly good answer for the case of one-off uses of searchsortedfirst. I thought it worth adding that there is also a more permanent solution. If your type Record exhibits a natural ordering, then just extend Base.isless and Base.isequal to your new type. The following example code shows how this works for some new type you might define:
struct MyType ; x::Float64 ; end #Define some type of my own
yvec = MyType.(sort!(randn(10))) #Build a random vector of my type
yval = MyType(0.0) #Build a value of my type
searchsortedfirst(yvec, yval) #ERROR: this use of searchsortedfirst will throw a MethodError since julia doesn't know how to order MyType
Base.isless(y1::MyType, y2::MyType)::Bool = y1.x < y2.x #Extend (aka overload) isless so it is defined for the new type
Base.isequal(y1::MyType, y2::MyType)::Bool = y1.x == y2.x #Ditto for isequal
searchsortedfirst(yvec, yval) #Now this line works
Some points worth noting:
1) In the step where I overload isless and isequal, I preface the method definition with Base.. This is because the isless and isequal functions are originally defined in Base, where Base refers to the core julia package that is automatically loaded every time you start julia. By prefacing with Base., I ensure that my new methods are added to the current set of methods for these two functions, rather than replacing them. Note, I could also achieve this by omitting Base. but including a line beforehand of import Base: isless, isequal. Personally, I prefer the way I've done it above (for the overly pedantic, you can also do both).
2) I can define isless and isequal however I want. It is my type and my method extensions. So you can choose whatever you think is the natural ordering for your new type.
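3) The operators <, <=, >= and > fall back on isless (together with ==) for types that don't define them directly, so all of these operators will now work with your new type, e.g. MyType(1.0) > MyType(2.0) returns false. Note that == itself falls back to === rather than to isequal, so extend Base.:(==) as well if you need a custom notion of equality.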
4) Any julia function that uses the comparative operators above will now work with your new type, as long as the function is defined parametrically (which almost everything in Base is).
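As a minimal sketch of points 3) and 4), assuming a hypothetical type Reading (not from the question) that is ordered by its single field: defining isless alone should be enough for sort, searchsortedfirst and the generic comparison operators to work.

```julia
# Hypothetical type used only for illustration; ordered by its single field.
struct Reading
    x::Float64
end

# Extend Base.isless so that Reading has a total order.
Base.isless(a::Reading, b::Reading) = isless(a.x, b.x)

readings = sort([Reading(2.5), Reading(-1.0), Reading(0.7)])  # sort uses isless
idx = searchsortedfirst(readings, Reading(0.0))               # first element >= Reading(0.0)

println(idx)                           # 2
println(Reading(1.0) < Reading(2.0))   # true: the generic < falls back to isless
println(Reading(1.0) > Reading(2.0))   # false
```

Delegating to isless(a.x, b.x) rather than a.x < b.x is a design choice here: it keeps the order total even if a field holds NaN.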

You can just define a custom less-than operation and give it to searchsortedfirst via lt keyword argument:
julia> type Record
y::Int64
value::Float64
end
julia> A = Vector{Record}()
0-element Array{Record,1}
julia> push!(A, Record(3,3.0))
1-element Array{Record,1}:
Record(3, 3.0)
julia> push!(A, Record(4,3.0))
2-element Array{Record,1}:
Record(3, 3.0)
Record(4, 3.0)
julia> push!(A, Record(5,3.0))
3-element Array{Record,1}:
Record(3, 3.0)
Record(4, 3.0)
Record(5, 3.0)
julia> searchsortedfirst(A, 4, lt=(r,x)->r.y<x)
2
Here, (r,x)->r.y<x is an anonymous function defining your custom less-than. It takes two arguments (the elements to be compared). The first will be the elements from A, the second is the fixed element to compare to.
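Since searchsortedfirst only returns the index of the first element that is not less than x, getting the matching Record still needs a bounds check and an equality check. A minimal sketch, assuming A is sorted by y; the helper name find_record is hypothetical, and Record is written with the struct keyword so it runs on Julia 1.x:

```julia
# Same fields as the Record in the question, written with the `struct` keyword.
struct Record
    y::Int64
    value::Float64
end

# Hypothetical helper: return the Record whose y equals x, or nothing if absent.
# Assumes A is sorted by the y field.
function find_record(A::Vector{Record}, x::Integer)
    i = searchsortedfirst(A, x, lt = (r, v) -> r.y < v)
    # i == length(A) + 1 means every y in A is smaller than x
    return (i <= length(A) && A[i].y == x) ? A[i] : nothing
end

A = [Record(3, 3.0), Record(4, 1.5), Record(5, 2.0)]
println(find_record(A, 4).value)   # 1.5
println(find_record(A, 6))         # nothing
```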

Related

Optimizing custom fill of a 2d array in Julia

I'm a little new to Julia and am trying to use the fill! method to improve code performance on Julia. Currently, I read a 2d array from a file say read_array and perform row-operations on it to get a processed_array as follows:
function preprocess(matrix)
# Initialise
processed_array= Array{Float64,2}(undef, size(matrix));
#first row of processed_array is the difference of first two row of matrix
processed_array[1,:] = (matrix[2,:] .- matrix[1,:]) ;
#last row of processed_array is difference of last two rows of matrix
processed_array[end,:] = (matrix[end,:] .- matrix[end-1,:]);
#all other rows of processed_array is the mean-difference of other two rows
processed_array[2:end-1,:] = (matrix[3:end,:] .- matrix[1:end-2,:]) .*0.5 ;
return processed_array
end
However, when I try using the fill! method I get a MethodError.
processed_array = copy(matrix)
fill!(processed_array [1,:],d[2,:]-d[1,:])
MethodError: Cannot convert an object of type Matrix{Float64} to an object of type Float64
I'll be glad if someone can tell me what I'm missing and also suggest a method to optimize the code. Thanks in advance!
fill!(A, x) is used to fill the array A with a single value x, so it's not what you want anyway.
What you could do for a little performance gain is to broadcast the assignments. That is, use .= instead of =. If you want, you can also use the @. macro to automatically add dots everywhere for you (for maybe cleaner/easier-to-read code):
function preprocess(matrix)
out = Array{Float64,2}(undef, size(matrix))
@views @. out[1,:] = matrix[2,:] - matrix[1,:]
@views @. out[end,:] = matrix[end,:] - matrix[end-1,:]
@views @. out[2:end-1,:] = 0.5 * (matrix[3:end,:] - matrix[1:end-2,:])
return out
end
For optimal performance, I think you probably want to write the loops explicitly and use multithreading with a package like LoopVectorization.jl for example.
PS: Note that in your code comments you wrote "cols" instead of "rows", and you wrote "mean" but take a difference. (Not sure it was intentional.)
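For reference, the explicit-loop variant mentioned above could look roughly like this. It is only a sketch: the name preprocess_loops is made up, it assumes ordinary 1-based arrays with at least two rows, and it leaves out multithreading and LoopVectorization.jl entirely.

```julia
# Explicit-loop version of preprocess: same result, no temporary arrays.
# Assumes standard 1-based indexing.
function preprocess_loops(matrix::AbstractMatrix{<:Real})
    nrows, ncols = size(matrix)
    nrows >= 2 || throw(ArgumentError("need at least two rows"))
    out = Matrix{Float64}(undef, nrows, ncols)
    # Julia arrays are column-major, so the inner loop runs over rows.
    for j in 1:ncols
        out[1, j] = matrix[2, j] - matrix[1, j]                  # forward difference
        for i in 2:nrows-1
            out[i, j] = 0.5 * (matrix[i+1, j] - matrix[i-1, j])  # central difference
        end
        out[nrows, j] = matrix[nrows, j] - matrix[nrows-1, j]    # backward difference
    end
    return out
end

m = [1.0 10.0; 4.0 20.0; 9.0 40.0]
println(preprocess_loops(m))   # [3.0 10.0; 4.0 15.0; 5.0 20.0]
```

Whether this beats the broadcast version in practice is worth measuring on your own data before adopting it.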

Defining function for any array of integers

I want to define a function that takes as an input any array of dimension 2 that has integers (and only integers) as its elements. Although I know I don't have to specify the type of the arguments of a function in Julia, I would like to do it in order to speed it up.
With the type hierarchy, I can do this for a function that takes integers as input with the following code:
julia> function sum_two(x::Integer)
return x+2
end
sum_two (generic function with 1 method)
julia> sum_two(Int8(4))
6
julia> sum_two(Int16(4))
6
However, when I try to do this for the type Array{Integer,2} I get the following error:
julia> function sum_array(x::Array{Integer,2})
return sum(x)
end
sum_array (generic function with 1 method)
julia> sum_array(ones(Int8,10,10))
ERROR: MethodError: no method matching sum_array(::Array{Int8,2})
Closest candidates are:
sum_array(::Array{Integer,2}) at REPL[4]:2
Stacktrace:
[1] top-level scope at none:0
I don't know what I could do to solve this. One option would be to define the method for every lowest-level subtype of Integer in the following way:
function sum_array(x::Array{Int8,2})
return sum(x)
end
function sum_array(x::Array{UInt8,2})
return sum(x)
end
.
.
.
But it doesn't look very practical.
First of all: specifying the types of input arguments to a function does not speed up the code. This is a misunderstanding. You should specify concrete field types when you define structs, but for function signatures it makes no difference whatsoever to performance. You use it to control dispatch.
Now, to your question: Julia's type parameters are invariant, meaning that even if S<:T is true, A{S}<:A{T} is not true. You can read more about that here: https://docs.julialang.org/en/v1/manual/types/index.html#Parametric-Composite-Types-1
Therefore, ones(Int8,10,10), which is a Matrix{Int8} is not a subtype of Matrix{Integer}.
To get your code to work, you can do this:
function sum_array(x::Array{T, 2}) where {T<:Integer}
return sum(x)
end
or use this nice shortcut
function sum_array(x::Array{<:Integer, 2})
return sum(x)
end
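The invariance rule can be checked directly with the subtype operator. A small sketch; the function name sum_matrix and the use of AbstractMatrix (so that views are accepted too) are my own choices, not part of the question:

```julia
println(Int8 <: Integer)                     # true
println(Matrix{Int8} <: Matrix{Integer})     # false: type parameters are invariant
println(Matrix{Int8} <: Matrix{<:Integer})   # true

# Hypothetical variant that accepts any 2-d array with integer elements.
sum_matrix(x::AbstractMatrix{<:Integer}) = sum(x)

println(sum_matrix(ones(Int8, 10, 10)))                   # 100
println(sum_matrix(view(ones(Int, 4, 4), 1:2, 1:2)))      # 4
println(hasmethod(sum_matrix, Tuple{Matrix{Float64}}))    # false
```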

Updating a static array, in a nested function without making a temporary array? <Julia>

I have been banging my head against a wall trying to use static arrays in julia.
https://github.com/JuliaArrays/StaticArrays.jl
They are fast but updating them is a pain. This is no surprise, they are meant to be immutable!
But it is continually recommended to me that I use static arrays even though I have to update them. In my case, the static arrays are small, just length 3, and I have a vector of them, but I only update one length-3 SVector at a time.
Option 1
There is a really neat package called Setfield that allows you to do inplace updates of SVectors in Julia.
https://github.com/jw3126/Setfield.jl
The catch... it updates the local copy. So if you are in a nested function, it updates the local copy. So it comes with some bookkeeping, since you have to update the local copy in place, then return that copy and update the actual array of interest. You can't pass in your desired array and update it in place, at least, not that I can figure out! Now, I do not mind bookkeeping, but I feel like updating a local copy, then returning the value, updating another local copy, and then returning the values and finally updating the actual array must come with a speed penalty. I could be wrong.
Option 2
It bugs me that in order to update a static array I must
exampleSVector::SVector{3,Float64} <-- just to make clear its type and size
exampleSVector = [value1, value2, value3]
This will update the desired array even if it is inside a function, which is nice and the goal, but if you do this inside a function it creates a temporary array. And this kills me because my function is in a loop that gets called 4+ million times, so this creates a ton of allocations and slows things down.
How do I update an SVector for the Option 2 scenario without creating a temporary array?
For the Option 1 scenario, can I update the actual array of interest rather than the local copy?
If this requires a simple example code, please say so in the comments, and I will make one. My thinking is that it is answerable without one, but I will make one if it is needed.
EDIT:
MCVE code - Option 1 works, option 2 does not.
using Setfield
using StaticArrays
struct Keep
dreaming::Vector{SVector{3,Float64}}
end
function INNER!(vec::SVector{3,Float64},pre::SVector{3,Float64})
# pretend series of calculations
for i = 1:3 # illustrate use of Setfield (used in real code for this)
pre = @set pre[i] = rand() * i * 1000
end
# more pretend calculations
x = 25.0 # assume more calculations equals x
################## OPTION 1 ########################
vec = @set vec = x * [ pre[1], pre[2], pre[3] ] # UNCOMMENT FOR FOR OPTION 1
return vec # UNCOMMENT FOR FOR OPTION 1
################## OPTION 2 ########################
#vec = x * [ pre[1], pre[2], pre[3] ] # UNCOMMENT FOR FOR OPTION 2
#nothing # UNCOMMENT FOR FOR OPTION 2
end
function OUTER!(always::Keep)
preAllocate = SVector{3}(0.0,0.0,0.0)
for i=1:length(always.dreaming)
always.dreaming[i] = INNER!(always.dreaming[i], preAllocate) # UNCOMMENT FOR FOR OPTION 1
#INNER!(always.dreaming[i], preAllocate) # UNCOMMENT FOR FOR OPTION 2
end
end
code = Keep([zero(SVector{3}) for i=1:5])
OUTER!(code)
println(code.dreaming)
I hope that I have understood your question correctly. It's a bit hard with a MWE like this, that does a lot of things that are mostly redundant and a bit confusing.
There seem to be two alternative interpretations here: Either you really need to update ('mutate') an SVector, but your MWE fails to demonstrate why. Or, you have convinced yourself that you need to mutate, but you actually don't.
I have decided to focus on alternative 2: You don't really need to 'mutate'. Rewriting your code from that point of view simplifies it greatly.
I couldn't find any reason for you to mutate any static vectors here, so I just removed that. The behaviour of the INNER! function with the inputs was very confusing. You provide two inputs but don't use either of them, so I removed those inputs.
function inner()
pre = @SVector [rand() * 1000i for i in 1:3]
x = 25
return pre .* x
end
function outer!(always::Keep)
always.dreaming .= inner.() # notice the dot in inner.()
end
code = Keep([zero(SVector{3}) for i in 1:5])
outer!(code)
display(code.dreaming)
This runs fast and with zero allocations. In general with StaticArrays, don't try to mutate things, just create new instances.
Even though it's not clear from your MWE, there may be some legitimate reason why you may want to 'mutate' an SVector. In that case you can use the setindex method of StaticArrays, you don't need Setfield.jl:
julia> v = rand(SVector{3})
3-element SArray{Tuple{3},Float64,1,3}:
0.4730258499237898
0.23658547518737905
0.9140206579322541
julia> v = setindex(v, -3.1, 2)
3-element SArray{Tuple{3},Float64,1,3}:
0.4730258499237898
-3.1
0.9140206579322541
To clarify: setindex (without a !) does not mutate its input, but creates a new instance with one index value changed.
If you really do need to 'mutate', perhaps you can make a new MWE that shows this. I would recommend that you try to simplify it a bit, because it is quite confusing now. For example, the inclusion of the type Keep seems entirely unnecessary and distracting. Just make a Vector of SVectors and show what you want to do with that.
Edit: Here's an attempt based on the comments below. As far as I understand it now, the question is about modifying a vector of SVectors. You cannot really mutate the SVectors, but you can replace them using a convenient syntax, setindex, where you can keep some of the elements and change some of the others:
oldvec = [zero(SVector{3}) for _ in 1:5]
replacevec = [rand(SVector{3}) for _ in 1:5]
Now we replace the second element of each element of oldvec with the corresponding one in replacevec. First a one-liner:
oldvec .= setindex.(oldvec, getindex.(replacevec, 2), 2)
Then an even faster one with a loop:
for i in eachindex(oldvec, replacevec)
@inbounds oldvec[i] = setindex(oldvec[i], replacevec[i][2], 2)
end
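The same replace-don't-mutate pattern can be tried without installing anything, because Base has an unexported Base.setindex for plain tuples. In the sketch below the tuples are only a stand-in for SVector{3,Float64} (that substitution is my assumption, made to keep the example dependency-free), and the values in replacevec are made up.

```julia
# Plain tuples stand in for SVector{3,Float64} here; both are immutable.
oldvec     = [(0.0, 0.0, 0.0) for _ in 1:5]
replacevec = [(Float64(i), 10.0 * i, 100.0 * i) for i in 1:5]

for i in eachindex(oldvec, replacevec)
    # Base.setindex (no !) returns a new tuple with position 2 replaced;
    # the old tuple is left untouched and the vector slot is overwritten.
    oldvec[i] = Base.setindex(oldvec[i], replacevec[i][2], 2)
end

println(oldvec[3])   # (0.0, 30.0, 0.0)
```

Only the outer Vector is mutated; each element is swapped for a fresh immutable value, which is the same thing the SVector loop does.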
There are two types of static arrays: mutable ones (starting with M in the type name) and immutable ones (starting with S). Just use the mutable ones! Have a look at the example below:
julia> mut = MVector{3,Int64}(1:3);
julia> mut[1]=55
55
julia> mut
3-element MArray{Tuple{3},Int64,1,3}:
55
2
3
julia> immut = SVector{3,Int64}(1:3);
julia> immut[1]=55
ERROR: setindex!(::SArray{Tuple{3},Int64,1,3}, value, ::Int) is not defined.
Let us see some simple benchmark (ordinary array, vs mutable static vs immutable static):
using BenchmarkTools
julia> ord = [1,2,3];
julia> @btime $ord.*$ord;
39.680 ns (1 allocation: 112 bytes)
3-element Array{Int64,1}:
1
4
9
julia> @btime $mut.*$mut
8.533 ns (1 allocation: 32 bytes)
3-element MArray{Tuple{3},Int64,1,3}:
3025
4
9
julia> @btime $immut.*$immut
2.133 ns (0 allocations: 0 bytes)
3-element SArray{Tuple{3},Int64,1,3}:
1
4
9

Creating nested arrays in swift

I'm trying to create nested arrays in Swift of data that's input in the following format:
(+ 5 (- 1 (+ 34 1)))
So the nested array would look like:
[+, 5, [-, 1, [+, 34, 1]]]
with arr[0] always being the operation (+,-,*, or /)
I'm very new to swift and don't really know where to start, so I'd really appreciate the help.
You can put integers, strings, and even functions in the same array by using an enumeration and its associated values. The example does not lend itself to an array but simply a recursive nesting of numbers and computations.
enum Input {
case Operator((Int, Int) -> Int)
case Number(Int)
}
Edit: Per Rob's suggestion.
enum Input {
case Operator((Int, Int) -> Int)
case Number(Int)
indirect case Computation(operator: Input, left: Input, right: Input)
}
This shows how to declare a recursive enumeration. I wish there was a way to limit the operator parameter to Input.Operator and parameters left and right to Input types not .Operator.
You could use an array of type [AnyObject], as in the example below.
let arr: [AnyObject] = ["+", 5, ["-", 1, ["/", 34, 1]]]
The operator is always at index 0 of each array. For example, to get the division operator "/" in the sample above, you would look at this index:
arr[2][2][0]
As Price Ringo suggests, a recursive enum is definitely the right tool here. I would build it along these lines:
enum Operator {
case Add
case Subtract
}
enum Expression {
indirect case Operation(Operator, Expression, Expression)
case Value(Int)
}
let expression = Expression.Operation(.Add,
.Value(5),
.Operation(.Subtract,
.Value(1),
.Operation(.Add, .Value(34), .Value(1))))
If you wanted to, you could do it Price's way with functions, and that's probably pretty clever, but I suspect would actually be more of a headache to parse. But if you wanted to, it'd look like this:
enum Expression {
indirect case Operation((Int, Int) -> Int, Expression, Expression)
case Value(Int)
}
let expression = Expression.Operation(+,
.Value(5),
.Operation(-,
.Value(1),
.Operation(+, .Value(34), .Value(1))))
Alternately, you could create an Operator type rather than passing in functions. That's how I'd probably recommend going unless you really want to be able tow
I think you can't build the arrays with different types.. I mean you want to build an Array with Integers and the operators should be Strings... but you can't have both in an array.
check this link, it might be helpful http://totallyswift.com/nested-collections/
Your example looks like a classic Lisp expression. Reflecting that, have you considered the possibility that arrays are a force fit, and you might have an easier time of it with lists as the underlying representation? A list data type would be trivial to implement.

How do I change the data type of a Julia array from "Any" to "Float64"?

Is there a function in Julia that returns a copy of an array in a desired type, i.e., an equivalent of NumPy's astype function? I have an "Any" type array, and want to convert it to a Float array. I tried:
new_array = Float64(array)
but I get the following error
LoadError: MethodError: `convert` has no method matching
convert(::Type{Float64}, ::Array{Any,2})
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Closest candidates are:
call{T}(::Type{T}, ::Any)
convert(::Type{Float64}, !Matched::Int8)
convert(::Type{Float64}, !Matched::Int16)
...
while loading In[140], in expression starting on line 1
in call at essentials.jl:56
I can just write a function that goes through the array and returns a float value of each element, but I find it a little odd if there's no built-in method to do this.
Use convert. Note the syntax I used for the first array; if you know what you want before the array is created, you can declare the type in front of the square brackets. Any just as easily could've been replaced with Float64 and eliminated the need for the convert function.
julia> a = Any[1.2, 3, 7]
3-element Array{Any,1}:
1.2
3
7
julia> convert(Array{Float64,1}, a)
3-element Array{Float64,1}:
1.2
3.0
7.0
You can also use the broadcast operator .:
a = Any[1.2, 3, 7]
Float64.(a)
You can use:
new_array = Array{Float64}(array)
Daniel and Randy's answers are solid; I'll just add another way here that I like because it can make more complicated iterative cases relatively succinct. That being said, it's not as efficient as the other answers, which are more specifically related to conversion / type declaration. But since the syntax can be pretty easily extended to other use cases, it's worth adding:
a = Array{Any,1}(rand(1000))
f = [float(a[i]) for i = 1:size(a,1)]
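To compare the approaches from the answers above side by side, here is a short sketch using the same sample array; the variable names b, c and d are just for illustration. All three should give a Vector{Float64} with equal contents, provided every element can actually be converted to Float64.

```julia
a = Any[1.2, 3, 7]

b = convert(Vector{Float64}, a)   # explicit convert
c = Float64.(a)                   # broadcasting the Float64 constructor
d = Array{Float64}(a)             # array constructor

println(eltype(a))     # Any
println(eltype(b))     # Float64
println(b == c == d)   # true
println(c)             # [1.2, 3.0, 7.0]
```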

Resources