Defining a function for any array of integers

I want to define a function that takes as an input any array of dimension 2 that has integers (and only integers) as its elements. Although I know I don't have to specify the type of the arguments of a function in Julia, I would like to do it in order to speed it up.
With the type hierarchy, I can do this for a function that takes integers as input with the following code:
julia> function sum_two(x::Integer)
return x+2
end
sum_two (generic function with 1 method)
julia> sum_two(Int8(4))
6
julia> sum_two(Int16(4))
6
However, when I try to do this for the type Array{Integer,2}, I get the following error:
julia> function sum_array(x::Array{Integer,2})
return sum(x)
end
sum_array (generic function with 1 method)
julia> sum_array(ones(Int8,10,10))
ERROR: MethodError: no method matching sum_array(::Array{Int8,2})
Closest candidates are:
sum_array(::Array{Integer,2}) at REPL[4]:2
Stacktrace:
[1] top-level scope at none:0
I don't know what I could do to solve this. One option would be to define the method for every lowest-level subtype of Integer in the following way:
function sum_array(x::Array{Int8,2})
return sum(x)
end
function sum_array(x::Array{UInt8,2})
return sum(x)
end
.
.
.
But it doesn't look very practical.

First of all: specifying the types of input arguments to a function does not speed up the code. This is a misunderstanding. Julia compiles a specialized version of a function for the concrete argument types it is called with, whether or not the signature is annotated. You should specify concrete field types when you define structs, but for function signatures it makes no difference whatsoever to performance. You use it to control dispatch.
Now, to your question: Julia's type parameters are invariant, meaning that even if S<:T is true, A{S}<:A{T} is not true. You can read more about that here: https://docs.julialang.org/en/v1/manual/types/index.html#Parametric-Composite-Types-1
Therefore, ones(Int8,10,10) has type Matrix{Int8}, which is not a subtype of Matrix{Integer}.
To get your code to work, you can do this:
function sum_array(x::Array{T, 2}) where {T<:Integer}
return sum(x)
end
or use this nice shortcut:
function sum_array(x::Array{<:Integer, 2})
return sum(x)
end
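The invariance rule can be checked directly with the subtype operator. The following is a small sketch, assuming Julia 1.x (where Matrix{T} is an alias for Array{T,2}); the name sum_matrix is only illustrative and not part of the question:

```julia
# Element types are related...
println(Int8 <: Integer)                     # true
# ...but parametric types are invariant in their parameter.
println(Matrix{Int8} <: Matrix{Integer})     # false
# A `<:Integer` bound matches any integer element type.
println(Matrix{Int8} <: Matrix{<:Integer})   # true

# Hypothetical name; equivalent to the `where {T<:Integer}` form above.
sum_matrix(x::Matrix{<:Integer}) = sum(x)

println(sum_matrix(ones(Int8, 10, 10)))      # 100
```

A matrix of floats, such as ones(2, 2), does not match this signature and raises a MethodError, which is the dispatch control the annotation is meant to provide.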


How do you initialize an array with only one element, without hardcoding the first index?

I am trying to pass an array of Unbounded_String to a function, and I don't care about the range of the index, as the function is going to loop over each element.
The (element1, element2) syntax automatically starts at the first index value in the range, then increments for the second value given, which works fine for more than one value. However, for a single value, this cannot be used as the parentheses are considered superfluous.
This code shows the error messages for each of the attempts I have made. (1) works, but (2), the preferable syntax for passing a single-element array, does not. (3) works, and is given as an answer to this similar question. However, this hardcodes the first index of the range into the calling side; if the String_Array implementation changes, all the call-sites have to be changed, even though they don't care about the index values used.
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
procedure Main is
function "+"(S: String) return Ada.Strings.Unbounded.Unbounded_String
renames Ada.Strings.Unbounded.To_Unbounded_String;
type String_Array is array (Positive range <>) of Unbounded_String;
procedure Foo(input : in String_Array) is
begin
null;
end Foo;
begin
Foo((+"one", +"two")); --(1)
--Foo((+"only")); --(2) positional aggregate cannot have one component
Foo((1 => +"only")); --(3)
--Foo((String_Array'First => +"only")); --(4) prefix for "First" attribute must be constrained array
--Foo((String_Array'Range => +"only")); --(5) prefix for "Range" attribute must be constrained array
--Foo((String_Array'Range'First => +"only")); --(6) range attribute cannot be used in expression
--Foo((String_Array'Range'Type_Class'First => +"only")); --(7) range attribute cannot be used in expression
end Main;
What you want (2) is indeed impossible as it could be mistaken for a parenthesized expression (see http://www.adaic.org/resources/add_content/standards/12aarm/html/AA-4-3-3.html note 10).
If you really want to avoid expression (3) for the reasons you stated, as a workaround, you could define a function to handle the one-element array case:
function Singleton_String_Array (S: String) return String_Array is ((1 => + S));
-- one element call
Foo(Singleton_String_Array ("only"));
It reuses your expression (3), but the first index is no longer hardcoded at the call site.
You can also overload your Foo procedure to handle the special one-element case:
procedure Process_String (input : in Ada.Strings.Unbounded.Unbounded_String) is
begin
null;
end Process_String;
procedure Foo(input : in String_Array) is
begin
for string of input loop
Process_String (string);
end loop;
end Foo;
procedure Foo(input : in Ada.Strings.Unbounded.Unbounded_String) is
begin
Process_String (input);
end Foo;
-- One element call
Foo(+"only");
The short answer is that all array objects must be constrained, which means callers usually have to decide on the array bounds.
However, you know the index type, and could do
Foo((Positive'First => +"only"));
which doesn't really answer your question, since someone may still fiddle with the array range, and there's not really any guard against that.
Adding a new subtype as the range may be a viable solution, though:
subtype String_Array_Range is Positive;
type String_Array is array (String_Array_Range range <>) of Unbounded_String;
...
Foo((String_Array_Range'First => +"only"));
Any fiddling can now be done on the String_Array_Range subtype without affecting any callers. But there's still no guarantee against evil programmers changing the index type of the array itself...
type String_Array is array (Positive range <>) of Unbounded_String;
This declares an array type but doesn't provide its bounds.
Remember that an array object has a fixed size.
So String_Array'First and String_Array'Range don't refer to anything.
If you declared
subtype my_String_Array is String_Array (1 .. 35);
my_arr : my_String_Array;
then my_arr'First denotes 1 and my_arr'Range denotes 1 .. 35.
As long as you don't put a constraint on the type, you won't have access to these attributes.

Custom searchsortedfirst method

I'm kinda new to Julia, so I'm still struggling with reading the Julia documentation. Here is a piece of it, and I am looking for an explanation, specifically of the bolded part.
Base.Sort.searchsortedfirst — Function.
searchsortedfirst(a, x, [by=,] [lt=,] [rev=false])
Returns the index of the first value in a greater than or equal to x,
according to the specified order. Returns length(a)+1 if x is greater
than all values in a. a is assumed to be sorted.
My array looks like this:
A = Vector{Record}()
where
type Record
y::Int64
value::Float64
end
Now here is my problem. I would like to call the above-mentioned method on my array and obtain the Record whose y equals a given x (Record.y == x). I guess I have to write a 'by' transform or an 'lt' comparator? Or both?
Any help would be appreciated :)
@crstnbr has provided a perfectly good answer for the case of one-off uses of searchsortedfirst. I thought it worth adding that there is also a more permanent solution. If your type Record exhibits a natural ordering, then just extend Base.isless and Base.isequal to your new type. The following example code shows how this works for some new type you might define:
struct MyType ; x::Float64 ; end #Define some type of my own
yvec = MyType.(sort!(randn(10))) #Build a random vector of my type
yval = MyType(0.0) #Build a value of my type
searchsortedfirst(yvec, yval) #ERROR: this use of searchsortedfirst will throw a MethodError since julia doesn't know how to order MyType
Base.isless(y1::MyType, y2::MyType)::Bool = y1.x < y2.x #Extend (aka overload) isless so it is defined for the new type
Base.isequal(y1::MyType, y2::MyType)::Bool = y1.x == y2.x #Ditto for isequal
searchsortedfirst(yvec, yval) #Now this line works
Some points worth noting:
1) In the step where I overload isless and isequal, I preface the method definition with Base.. This is because the isless and isequal functions are originally defined in Base, where Base refers to the core julia package that is automatically loaded every time you start julia. By prefacing with Base., I ensure that my new methods are added to the current set of methods for these two functions, rather than replacing them. Note, I could also achieve this by omitting Base. but including a line beforehand of import Base: isless, isequal. Personally, I prefer the way I've done it above (for the overly pedantic, you can also do both).
2) I can define isless and isequal however I want. It is my type and my method extensions. So you can choose whatever you think is the natural ordering for your new type.
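3) For a new type, the operators < and > fall back to isless, so they will now work with your type, e.g. MyType(1.0) > MyType(2.0) returns false. Note that == does not fall back to isequal (it is the other way around: isequal defaults to ==), so <= and >=, which combine < and ==, rely on whatever == does for your type.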
4) Any julia function that uses the comparative operators above will now work with your new type, as long as the function is defined parametrically (which almost everything in Base is).
You can just define a custom less-than operation and give it to searchsortedfirst via lt keyword argument:
julia> type Record
y::Int64
value::Float64
end
julia> A = Vector{Record}()
0-element Array{Record,1}
julia> push!(A, Record(3,3.0))
1-element Array{Record,1}:
Record(3, 3.0)
julia> push!(A, Record(4,3.0))
2-element Array{Record,1}:
Record(3, 3.0)
Record(4, 3.0)
julia> push!(A, Record(5,3.0))
3-element Array{Record,1}:
Record(3, 3.0)
Record(4, 3.0)
Record(5, 3.0)
julia> searchsortedfirst(A, 4, lt=(r,x)->r.y<x)
2
Here, (r,x)->r.y<x is an anonymous function defining your custom less-than. It takes two arguments (the elements to be compared). The first will be the elements from A, the second is the fixed element to compare to.
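To actually retrieve the matching Record rather than just an index, the result of searchsortedfirst still has to be checked, because the function returns the first position whose key is greater than or equal to x, or length(A)+1 if there is none. The following is a minimal sketch, assuming Julia 1.x syntax (struct instead of type) and a vector already sorted by y. find_record is a hypothetical helper name, and the 0.0 in the probe record is a dummy value that the by transform ignores:

```julia
struct Record
    y::Int64
    value::Float64
end

# Must be sorted by the `y` field for the binary search to be valid.
A = [Record(3, 3.0), Record(4, 3.5), Record(5, 4.0)]

# Hypothetical helper: return the Record with `y == x`, or `nothing`.
function find_record(A, x)
    # `by` is applied to both the elements and the probe, so the probe
    # is wrapped in a Record; its `value` field is never inspected.
    i = searchsortedfirst(A, Record(x, 0.0), by = r -> r.y)
    return (i <= length(A) && A[i].y == x) ? A[i] : nothing
end

println(find_record(A, 4))   # Record(4, 3.5)
println(find_record(A, 6))   # nothing
```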

Converting Array{Array{Float64,1},1} to Array{Float64,2} in Julia

My problem is similar to the problem described earlier,
with the difference that I don't input numbers manually. Thus the accepted answer there does not work for me.
I want to convert the vector of cartesian coordinates to polars:
function cart2pol(x0,
x1)
rho = sqrt(x0^2 + x1^2)
phi = atan2(x1, x0)
return [rho, phi]
end
@vectorize_2arg Number cart2pol
function cart2pol(x)
x1 = view(x,:,1)
x2 = view(x,:,2)
return cart2pol(x1, x2)
end
x = rand(5,2)
vcat(cart2pol(x))
The last command does not collect Arrays for some reason, returning the output of type 5-element Array{Array{Float64,1},1}. Any idea how to cast it to Array{Float64,2}?
If you look at the definition of cat (which is the underlying function for hcat and vcat), you see that you can collect several arrays into one single array of dimension 2:
cat(2, [1,2], [3,4], [5,6])
2×3 Array{Int64,2}:
1 3 5
2 4 6
This is basically what you want. The problem is that you have all your output polar points in an array itself. cat expects you to provide them as several arguments. This is where ... comes in.
... is used to cause a single function argument to be split apart into many different arguments when used in the context of a function call.
Therefore, you can write
cat(2, [[1,2], [3,4], [5,6]]...)
2×3 Array{Int64,2}:
1 3 5
2 4 6
In your situation, it works in exactly the same way (I changed your x to have the points in columns):
x=rand(2,5)
cat(2, cart2pol.(view(x,1,:),view(x,2,:))...)
2×5 Array{Float64,2}:
0.587301 0.622 0.928159 0.579749 0.227605
1.30672 1.52956 0.352177 0.710973 0.909746
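On Julia 1.x the same idea still applies, but the spelling changes: atan2(y, x) became the two-argument atan(y, x), and cat takes the dimension as a keyword, as in cat(a, b; dims=2). The following is a minimal sketch under that assumption, using reduce(hcat, ...) instead of splatting; the sample points are made up for illustration:

```julia
# Convert one Cartesian point to polar form [rho, phi].
cart2pol(x0, x1) = [hypot(x0, x1), atan(x1, x0)]

# Points stored as columns, as in the example above.
x = [1.0 0.0 3.0;
     0.0 2.0 4.0]

# Broadcasting yields a Vector of 2-element Vectors...
polars = cart2pol.(x[1, :], x[2, :])

# ...and reduce(hcat, ...) concatenates them into a 2×3 Matrix{Float64}.
P = reduce(hcat, polars)
println(size(P))   # (2, 3)
```

reduce(hcat, ...) avoids passing a very large number of arguments to one call, which can matter when the outer vector is long.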
The function mapslices can also do this, essentially transforming the rows of the input:
julia> x = rand(5,2)
5×2 Array{Float64,2}:
0.458583 0.205246
0.285189 0.992547
0.947025 0.0853141
0.79599 0.67265
0.0273176 0.381066
julia> mapslices(row->cart2pol(row[1],row[2]), x, [2])
5×2 Array{Float64,2}:
0.502419 0.420827
1.03271 1.291
0.95086 0.0898439
1.04214 0.701612
0.382044 1.49923
The last argument specifies dimensions to operate over; e.g. passing [1] would transform columns.
As an aside, I would encourage one or two stylistic changes. First, it's good to map like to like, so if we stick with the row representation then cart2pol should accept a 2-element array (since that's what it returns). Then this call would just be mapslices(cart2pol, x, [2]). Or, if what we're really trying to represent is an array of coordinates, then the data could be an array of tuples [(x1,y1), (x2,y2), ...], and cart2pol could accept and return a tuple. In either case cart2pol would not need to be able to operate on arrays, and it's partly for this reason that we've deprecated the @vectorize_ macros.

How do I change the data type of a Julia array from "Any" to "Float64"?

Is there a function in Julia that returns a copy of an array in a desired type, i.e., an equivalent of NumPy's astype function? I have an "Any" type array, and want to convert it to a Float array. I tried:
new_array = Float64(array)
but I get the following error
LoadError: MethodError: `convert` has no method matching
convert(::Type{Float64}, ::Array{Any,2})
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Closest candidates are:
call{T}(::Type{T}, ::Any)
convert(::Type{Float64}, !Matched::Int8)
convert(::Type{Float64}, !Matched::Int16)
...
while loading In[140], in expression starting on line 1
in call at essentials.jl:56
I can just write a function that goes through the array and returns a float value of each element, but I find it a little odd if there's no built-in method to do this.
Use convert. Note the syntax I used for the first array; if you know what you want before the array is created, you can declare the type in front of the square brackets. Any just as easily could've been replaced with Float64 and eliminated the need for the convert function.
julia> a = Any[1.2, 3, 7]
3-element Array{Any,1}:
1.2
3
7
julia> convert(Array{Float64,1}, a)
3-element Array{Float64,1}:
1.2
3.0
7.0
You can also use the broadcast operator .:
a = Any[1.2, 3, 7]
Float64.(a)
You can use:
new_array = Array{Float64}(array)
Daniel and Randy's answers are solid; I'll just add another way here that I like because it can make more complicated iterative cases relatively succinct. That being said, it's not as efficient as the other answers, which are more specifically related to conversion / type declaration. But since the syntax can be pretty easily extended to other use cases, it's worth adding:
a = Array{Any,1}(rand(1000))
f = [float(a[i]) for i = 1:size(a,1)]
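The approaches from the answers above can be compared side by side. The following is a small sketch, assuming Julia 1.x; the variable names are only illustrative:

```julia
a = Any[1.2, 3, 7]

b = convert(Vector{Float64}, a)   # explicit convert
c = Float64.(a)                   # broadcast the Float64 constructor
d = Array{Float64}(a)             # array constructor

# The same works for a 2-dimensional Any array, as in the question.
e = Any[1 2.5; 3 4]
m = Matrix{Float64}(e)

println(eltype(b), " ", eltype(c), " ", eltype(d), " ", eltype(m))
```

All of these raise an error if an element cannot be converted to Float64 (a string, for example), so the input must contain only numeric values.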

Basic operations combining two SharedArrays

I've spent the last month or so learning julia and I'm very impressed. In particular I'm analysing large amounts of climate model output: I put all this into SharedArrays and adjust and plot it all in parallel. So far it's very quick and efficient and I've got quite a library of code. My current problem is in creating a function that can do basic operations on two shared arrays. I've successfully written a function that takes two arrays and how you want to process them. The code is based around the example in the parallel section of the julia doc and uses the myrange function as shown there:
function myrange(q::SharedArray)
idx = indexpids(q)
#@show (idx)
if idx == 0
# This worker is not assigned a piece
print("NO WORKERS ASSIGNED")
return 1:0 # empty linear range, matching the single range returned below
end
nchunks = length(procs(q))
splits = [round(Int, s) for s in linspace(0,length(q),nchunks+1)]
splits[idx]+1:splits[idx+1]
end
function combine_arrays_chunk!(array_1,array_2,output_array,func, length_range);
#@show (length_range)
for i in length_range
output_array[i] = func(array_1[i], array_2[i]);
#hardwired example for func = +
#output_array[i] = +(array_1[i], array_2[i]);
end
output_array
end
combine_arrays_shared_chunk!(array_1,array_2,output_array,func) = combine_arrays_chunk!(array_1,array_2,output_array,func, myrange(array_1));
function combine_arrays_shared(array_1::SharedArray,array_2::SharedArray,func)
if size(array_1)!=size(array_2)
return print("inputs not of the same size")
end
output_array=SharedArray(Float64,size(array_1));
@sync begin
for p in procs(array_1)
@async remotecall_wait(p, combine_arrays_shared_chunk!, array_1,array_2,output_array,func)
end
end
output_array
end
This works, so one can do
strain_div = combine_arrays_shared(eps_1,eps_2,+);
strain_tot = combine_arrays_shared(eps_1,eps_2,hypot);
with the correct results and the output as a shared array as required. But ... it's quite slow. It's actually quicker to combine the sharedarray as a normal array on one processor, calculate and then convert back to a sharedarray (for my test cases anyway, with each array approx 200MB; when I move up to GBs I guess not). I can hardwire the combine_arrays_shared function to only do addition (or some other function), and then you get the speed increase, but with the function being passed as an argument to combine_arrays_shared the whole thing is slow (10 times slower than the hardwired addition).
I've looked at the FastAnonymous.jl package but I can't see how it would work in this case. I tried, and failed. Any ideas?
I might just resort to writing a different combine_arrays_... function for each basic function I use, or having the func argument as an option and calling different functions from within combine_arrays_shared, but I want it to be more elegant! Also this is a good way to learn more about Julia.
Harry
This question actually has nothing to do with SharedArrays, and is just "how do I pass functions-as-arguments and get better performance?"
The way FastAnonymous works---and similar to the way closures will work in julia soon---is to create a type with a call method. If you're having trouble with FastAnonymous for some reason, you can always do it manually:
julia> immutable Foo end
julia> Base.call(f::Foo, x, y) = x*y
call (generic function with 1036 methods)
julia> function applyf(f, X)
s = zero(eltype(X))
for x in X
s += f(x, x)
end
s
end
applyf (generic function with 1 method)
julia> X = rand(10^6);
julia> f = Foo()
Foo()
# Run the function once with each type of argument to JIT-compile
julia> applyf(f, X)
333375.63216645207
julia> applyf(*, X)
333375.63216645207
# Compile anything used by @time
julia> @time 1
0.000004 seconds (148 allocations: 10.151 KB)
1
# Now let's benchmark
julia> @time applyf(f, X)
0.002860 seconds (5 allocations: 176 bytes)
333433.439233112
julia> @time applyf(*, X)
0.142411 seconds (4.00 M allocations: 61.035 MB, 19.24% gc time)
333433.439233112
Note the big increase in speed and greatly-reduced memory consumption.
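In Julia 1.x the manual workaround above is spelled differently: immutable became struct, and Base.call was replaced by defining a method on the object itself. Every function also has its own type there, so passing * or hypot directly is specialized in the same way as the callable object. The following is a minimal sketch under those assumptions; Multiply is a hypothetical name, and no timings are claimed here:

```julia
# A callable type: instances of Multiply can be called like a function.
struct Multiply end
(m::Multiply)(x, y) = x * y

# Same accumulation loop as applyf above.
function applyf(f, X)
    s = zero(eltype(X))
    for x in X
        s += f(x, x)
    end
    return s
end

X = [1.0, 2.0, 3.0]
println(applyf(Multiply(), X))   # 14.0
println(applyf(*, X))            # 14.0
```

To compare the two calls on a given Julia version, wrap each one in @time after a first warm-up call, as in the benchmark above.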
