Sorted version of in - arrays

I have an array of times event_times and I want to check if t in event_times. However, I know that event_times is sorted. Is there a way to make use of that to make the search faster?

An idiomatic Julian way would be an elaboration of:
struct SortedVector{T,V<:AbstractVector} <: AbstractVector{T}
    v::V
    SortedVector{T,V}(v::AbstractVector{T}) where {T, V} = new(v)
    # check sorted in inner constructor??
end
SortedVector(v::AbstractVector{T}) where T = SortedVector{T,typeof(v)}(v)
@inline Base.size(sv::SortedVector) = size(sv.v)
@inline Base.getindex(sv::SortedVector,i) = sv.v[i]
@inline Base.in(e::T,sv::SortedVector{T}) where T = !isempty(searchsorted(sv.v,e))
And then:
julia> v = SortedVector(sort(rand(1:10,10)))
10-element SortedVector{Int64,Array{Int64,1}}:
1
4
5
5
6
6
6
7
7
10
julia> 3 in v
false
julia> 1 in v
true
If I recall correctly David Sanders had an implementation with this name. Perhaps looking at https://github.com/JuliaIntervals/IntervalOptimisation.jl/blob/889bf43e8a514e696869baaa6af1300ace87b90b/src/SortedVectors.jl would promote reuse.

Following @ColinTBowers's hint, you can use the fact that searchsorted returns a range which is empty iff t is not in event_times. Thus !isempty(searchsorted(event_times,t)) is a fast method to get the answer.

Related

Is there a way to turn an array into an integer or float?

I'm trying to turn an array of ints into a single int in Julia 1.5.4, like this:
x = [1,2,3]
Here I would like to use some function or command (called example() here):
x_new = example(x)
println(x_new)
typeof(x_new)
Ideal output would be something like this :
123
Int32
I have already tried to solve this problem with parse(), push!() and similar functions, but nothing worked well.
I couldn't find a similar problem...
You can find an issue about adding this functionality to Julia here: https://github.com/JuliaLang/julia/issues/40393
Bottom line, you don't want to use strings, and you should avoid unnecessary exponentiation, both of which will be really slow.
A very brief solution is
evalpoly(10, reverse([1,2,3]))
Spelling it out a bit more, you can do this
function joindigits(xs)
    val = 0
    for x in xs
        val = 10*val + x
    end
    return val
end
Is this what you need?
julia> x = [1,2,3]
3-element Vector{Int64}:
1
2
3
julia> list2int(x) = sum(10 .^ (length(x)-1:-1:0) .* x)
list2int (generic function with 1 method)
julia> list2int(x)
123
You are looking for string concatenation and then parsing:
x_new = parse(Int64, string(x...))
Another interesting way to convert many small numbers to a bigger one is to combine raw bytes:
julia> reinterpret(Int16, [Int8(2),Int8(3)])
1-element reinterpret(Int16, ::Vector{Int8}):
770
Note that 770 = 256*3 + 2
Or for actual Ints:
julia> reinterpret(Int128, [10,1])
1-element reinterpret(Int128, ::Vector{Int64}):
18446744073709551626
(note that result is exactly Int128(2)^64+10)

How to exchange one specific value in an array in Julia?

I'm pretty new to Julia, so this is probably a pretty easy question. I want to create a vector and exchange a given value with a new given value.
This is how it would work in Java, but I can't find a solution for Julia. Do I have to copy the array first? I'm pretty clueless..
function sorted_exchange(v::Array{Int64,1}, in::Int64, out::Int64)
    i=1
    while v[i]!=out
        i+=1
    end
    v[i]=in
    return v
end
The program runs but just returns the "old" vector.
Example: sorted_exchange([1,2,3],4,3), expected:[1,2,4], actual:[1,2,3]
There's a nice built-in function for this: replace or its in-place version: replace!:
julia> v = [1,2,3];
julia> replace!(v, 3=>4);
julia> v
3-element Array{Int64,1}:
1
2
4
The code you have posted seems to work fine, though it does something slightly different. Your code only replaces the first instance of 3, while replace! replaces every instance. If you just want the first instance to be replaced you can write:
julia> v = [1,2,3,5,3,5];
julia> replace!(v, 3=>4; count=1)
6-element Array{Int64,1}:
1
2
4
5
3
5
You can find the indices of the value you want to replace using findall:
a = [1, 2, 5]
findall(isequal(5), a) # returns [3], the indices at which a equals 5
and use that to replace the value
a[findall(isequal(5), a)] .= 6
a # returns [1, 2, 6]

Reverse lookup with non-unique values

What I'm trying to do
I have an array of numbers:
>> A = [2 2 2 2 1 3 4 4];
And I want to find the array indices where each number can be found:
>> B = arrayfun(@(x) {find(A==x)}, 1:4);
In other words, this B should tell me:
>> for ii=1:4, fprintf('Item %d in location %s\n',ii,num2str(B{ii})); end
Item 1 in location 5
Item 2 in location 1 2 3 4
Item 3 in location 6
Item 4 in location 7 8
It's like the 2nd output argument of unique, but instead of the first (or last) occurrence, I want all the occurrences. I think this is called a reverse lookup (where the original key is the array index), but please correct me if I'm wrong.
How can I do it faster?
What I have above gives the correct answer, but it scales terribly with the number of unique values. For a real problem (where A has 10M elements with 100k unique values), even this stupid for loop is 100x faster:
>> B = cell(max(A),1);
>> for ii=1:numel(A), B{A(ii)}(end+1)=ii; end
But I feel like this can't possibly be the best way to do it.
We can assume that A contains only integers from 1 to the max (because if it doesn't, I can always pass it through unique to make it so).
That's a simple task for accumarray:
out = accumarray(A(:),(1:numel(A)).',[],@(x) {x}) %'
out{1} = 5
out{2} = 3 4 2 1
out{3} = 6
out{4} = 8 7
However accumarray suffers from not being stable (in the sense of unique's feature), so you might want to have a look here for a stable version of accumarray, if that's a problem.
Above solution also assumes A to be filled with integers, preferably with no gaps in between. If that is not the case, there is no way around a call of unique in advance:
A = [2.1 2.1 2.1 2.1 1.1 3.1 4.1 4.1];
[~,~,subs] = unique(A)
out = accumarray(subs(:),(1:numel(A)).',[],@(x) {x})
To sum up, the most generic solution, working with floats and returning a sorted output could be:
[~,~,subs] = unique(A)
[subs(:,end:-1:1), I] = sortrows(subs(:,end:-1:1)); %// optional
vals = 1:numel(A);
vals = vals(I); %// optional
out = accumarray(subs, vals , [],@(x) {x});
out{1} = 5
out{2} = 1 2 3 4
out{3} = 6
out{4} = 7 8
Benchmark
function [t] = bench()
    %// data
    a = rand(100);
    b = repmat(a,100);
    A = b(randperm(10000));
    %// functions to compare
    fcns = {
        @() thewaywewalk(A(:).');
        @() cst(A(:).');
    };
    % timeit
    t = zeros(2,1);
    for ii = 1:100;
        t = t + cellfun(@timeit, fcns);
    end
    format long
end
function out = thewaywewalk(A)
    [~,~,subs] = unique(A);
    [subs(:,end:-1:1), I] = sortrows(subs(:,end:-1:1));
    idx = 1:numel(A);
    out = accumarray(subs, idx(I), [],@(x) {x});
end
function out = cst(A)
    [B, IX] = sort(A);
    out = mat2cell(IX, 1, diff(find(diff([-Inf,B,Inf])~=0)));
end
0.444075509687511 %// thewaywewalk
0.221888202987325 %// CST-Link
Surprisingly the version with stable accumarray is faster than the unstable one, due to the fact that Matlab prefers sorted arrays to work on.
This solution should run in O(N*log(N)) because of the sorting, but it is quite memory intensive (it requires 3x the amount of input memory):
[U, X] = sort(A);
B = mat2cell(X, 1, diff(find(diff([Inf,U,-Inf])~=0)));
I am curious about the performance though.

How to check whether two arrays have the same element? [duplicate]

For example:
a = [1,2,3,4,5,6,7,8]
b = [1,9,10,11,12,13,14,15]
Array a contains 1 and array b contains 1 too, so they have an element in common.
How can I compare them and return true or false in Ruby?
Check if a & b is empty:
a & b
# => [1]
(a & b).empty?
# => false
If you have many elements per Array, doing an intersection (&) can be an expensive operation. I assume that it would be quicker to go 'by hand':
def have_same_element?(array1, array2)
  # Return true on first element found that is in both array1 and array2
  # Return false if no such element found
  array1.each do |elem|
    return true if array2.include?(elem)
  end
  return false
end
a = [*1..100] # [1, 2, 3, ... , 100]
b = a.reverse.to_a # [100, 99, 98, ... , 1]
puts have_same_element?(a, b)
If you know more beforehand (e.g. "array1 contains many duplicates") you can further optimize the operation (e.g. by calling uniq or compact first, depending on your data).
Would be interesting to see actual benchmarks.
Edit
require 'benchmark'
Benchmark.bmbm(10) do |bm|
bm.report("by hand") {have_same_element?(a, b)}
bm.report("set operation") { (a & b).empty? }
end
Rehearsal -------------------------------------------------
by hand 0.000000 0.000000 0.000000 ( 0.000014)
set operation 0.000000 0.000000 0.000000 ( 0.000095)
---------------------------------------- total: 0.000000sec
user system total real
by hand 0.000000 0.000000 0.000000 ( 0.000012)
set operation 0.000000 0.000000 0.000000 ( 0.000131)
So, in this case it looks as if the "by hand" method really is faster, but this is quite a sloppy way of benchmarking, and the result says little beyond this one input.
Also, see @CarySwoveland's excellent comments about using sets, proper benchmarking, and a snappier expression using find. (detect would do the same and is more expressive in my opinion, but be careful: it returns the value found, which is a problem if your arrays contain falsey values like nil or false. You generally want to use any? { } here.)
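The suggestion above about sets and any? can be sketched as follows. This is only a minimal illustration, not the code from the comments it refers to: the helper name share_element? is made up here, and it assumes the elements are hashable so that a Set lookup applies. Because any? returns a boolean rather than the matched value, a shared nil or false element is still reported correctly.

```ruby
require 'set'

# Hypothetical helper (name not from the original thread).
# Builds a hash-based lookup from array2 once, then stops at the
# first element of array1 that is also present in array2.
def share_element?(array1, array2)
  lookup = array2.to_set
  array1.any? { |elem| lookup.include?(elem) }
end

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 9, 10, 11, 12, 13, 14, 15]
puts share_element?(a, b)             # true
puts share_element?([1, 2], [3, 4])   # false
puts share_element?([nil, 2], [nil])  # true, even though the shared value is nil
```

Building the set costs one pass over array2, so this is most likely to pay off when both arrays are large; for tiny inputs like the ones benchmarked above, the plain include? loop may well be just as fast.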
The intersection of two arrays can be obtained with the & operator. If you need the elements that two arrays have in common, take arrays such as
a = [1,2,3,4,5,6,7,8]
b = [1,9,10,11,12,13,14,15]
and taking intersection
u = a & b
p u
# [1]
u.empty?
# false

Multiple assignment in Scala without using Array?

I have an input something like this: "1 2 3 4 5".
What I would like to do, is to create a set of new variables, let a be the first one of the sequence, b the second, and xs the rest as a sequence (obviously I can do it in 3 different lines, but I would like to use multiple assignment).
A bit of search helped me by finding the right-ignoring sequence patterns, which I was able to use:
val Array(a, b, xs @ _*) = "1 2 3 4 5".split(" ")
What I do not understand is that why doesn't it work if I try it with a tuple? I get an error for this:
val (a, b, xs @ _*) = "1 2 3 4 5".split(" ")
The error message is:
<console>:1: error: illegal start of simple pattern
Are there any alternatives for multiple-assignment without using Array?
I have just started playing with Scala a few days ago, so please bear with me :-) Thanks in advance!
Other answers tell you why you can't use tuples, but arrays are awkward for this purpose. I prefer lists:
val a :: b :: xs = "1 2 3 4 5".split(" ").toList
Simple answer
val Array(a, b, xs @ _*) = "1 2 3 4 5".split(" ")
The syntax you are seeing here is a simple pattern-match. It works because "1 2 3 4 5".split(" ") evaluates to an Array:
scala> "1 2 3 4 5".split(" ")
res0: Array[java.lang.String] = Array(1, 2, 3, 4, 5)
Since the right-hand side is an Array, the pattern on the left-hand side must also be an Array.
The left-hand side can be a tuple only if the right-hand side evaluates to a tuple as well:
val (a, b, xs) = (1, 2, Seq(3,4,5))
More complex answer
Technically what's happening here is that the pattern match syntax is invoking the unapplySeq method on the Array object, which looks like this:
def unapplySeq[T](x: Array[T]): Option[IndexedSeq[T]] =
if (x == null) None else Some(x.toIndexedSeq)
Note that the method accepts an Array. This is what Scala must see on the right-hand side of the assignment. And it returns a Seq, which allows for the @ _* syntax you used.
Your version with the tuple doesn't work because Tuple3's unapply is defined with a Product3 as its parameter, not an Array:
def unapply[T1, T2, T3](x: Product3[T1, T2, T3]): Option[Product3[T1, T2, T3]] =
Some(x)
You can actually write your own "extractors" like this that do whatever you want by simply creating an object and writing an unapply or unapplySeq method.
The answer is:
val a :: b :: c = "1 2 3 4 5".split(" ").toList
Should clarify that in some cases one may want to bind just the first n elements in a list, ignoring the non-matched elements. To do that, just add a trailing underscore:
val a :: b :: c :: _ = "1 2 3 4 5".split(" ").toList
That way:
c = "3" vs. c = List("3","4","5")
I'm not an expert in Scala by any means, but I think this might have to do with the fact that Tuples in Scala are just syntactic sugar for classes ranging from Tuple2 to Tuple22.
Meaning, Tuples in Scala aren't flexible structures like in Python or other languages of the sort, so it can't really create a Tuple with an unknown a priori size.
We can use pattern matching to extract the values from string and assign it to multiple variables. This requires two lines though.
The pattern says that there are 3 single digits ([0-9]) separated by spaces. After the 3rd digit there may or may not be more text, which we don't care about (.*).
val pat = "([0-9]) ([0-9]) ([0-9]).*".r
val (a,b,c) = "1 2 3 4 5" match { case pat(a,b,c) => (a,b,c) }
Output
a: String = 1
b: String = 2
c: String = 3

Resources