How can I refactor this Ruby method to run faster? - arrays

The method below is supposed to take an array a and return the duplicated integer whose second index value is the lowest. The array will only include integers between 1 and a.length. With this example,
firstDuplicate([1,2,3,2,4,5,1])
the method returns 2.
def firstDuplicate(a)
num = 1
big_num_array = []
a.length.times do
num_array = []
if a.include?(num)
num_array.push(a.index(num))
a[a.index(num)] = "x"
if a.include?(num)
num_array.unshift(a.index(num))
num_array.push(num)
end
big_num_array.push(num_array) if num_array.length == 3
end
num += 1
end
if big_num_array.length > 0
big_num_array.sort![0][2]
else
-1
end
end
The code works, but seems longer than necessary and doesn't run fast enough. I am looking for ways to refactor this.

You could count the entries as you go and use Enumerable#find to stop iterating as soon as you find something again:
h = { }
a.find do |e|
h[e] = h[e].to_i + 1 # The `to_i` converts `nil` to zero without a bunch of noise.
h[e] == 2
end
You could also say:
h = Hash.new(0) # to auto-vivify with zeros
a.find do |e|
h[e] += 1
h[e] == 2
end
or use Hash#fetch with a default value:
h = { }
a.find do |e|
h[e] = h.fetch(e, 0) + 1
h[e] == 2
end
find will stop as soon as it finds an element that makes that block true so this should be reasonably efficient.
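Wrapped in a method, the counting approach might look like the sketch below. The name first_duplicate and the -1 fallback (mirroring the question's code) are assumptions made for illustration; the snippets above leave both open.

```ruby
# Sketch: return the first value that is seen a second time, or -1 if none is.
# The method name and the -1 fallback are assumptions taken from the question.
def first_duplicate(a)
  seen = Hash.new(0)              # counts default to zero
  dup = a.find do |e|
    seen[e] += 1
    seen[e] == 2                  # stop at the first element seen twice
  end
  dup.nil? ? -1 : dup
end

first_duplicate([1, 2, 3, 2, 4, 5, 1]) #=> 2
first_duplicate([1, 2, 3])             #=> -1
```

This makes a single pass over the array with one hash lookup per element, instead of the repeated include? and index scans in the question.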

Here are two ways that could be done quite simply.
Use a set
require 'set'
def first_dup(arr)
st = Set.new
arr.find { |e| st.add?(e).nil? }
end
first_dup [1,2,3,2,4,5,4,1,4]
#=> 2
first_dup [1,2,3,4,5]
#=> nil
See Set#add?.
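To see why the block in first_dup works, here is a small sketch of what Set#add? returns in each case (the variable names are only illustrative):

```ruby
require 'set'

seen = Set.new
first_time  = seen.add?(:x)   # :x was new, so the set itself is returned (truthy)
second_time = seen.add?(:x)   # :x was already present, so nil is returned

first_time.equal?(seen) #=> true
second_time.nil?        #=> true
```

So st.add?(e).nil? is true exactly when e has been seen before, which is the condition find is waiting for.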
Use Array#difference
def first_dup(arr)
arr.difference(arr.uniq).first
end
first_dup [1,2,3,2,4,5,4,1,4]
#=> 2
first_dup [1,2,3,4,5]
#=> nil
I have found Array#difference to be sufficiently useful that I proposed it be added to the Ruby core (but it doesn't seem to be gaining traction). It is as follows:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
As explained at the link, it differs from Array#- as follows:
a = [1,2,2,3,3,2,2]
b = [2,2,3]
a - b
#=> [1]
a.difference(b)
#=> [1,3,2,2]
That is, difference "removes" one 2 in a for each 2 in b (similarly for 3), preserving the order of what's left of a. a is not mutated, however.
The steps in the example given above for the present problem are as follows.
arr = [1,2,3,2,4,5,4,1,4]
a = arr.uniq
#=> [1,2,3,4,5]
b = arr.difference(a)
#=> [2, 4, 1, 4]
b.first
#=> 2

If you are looking for top performance, Ruby is probably not the best language to choose. If you are looking for readability, here you go:
[1,2,3,2,4,5,1].
map. # or each (less readable, probably faster)
with_index.
group_by(&:shift). # or group_by(&:first)
min_by { |v, a| a[1] && a[1].last || Float::INFINITY }.
first
#⇒ 2

Related

How do I find the unique number in an array and return only that number in ruby?

There is an array with some numbers. All numbers are equal except for one. I'm trying to get this type of thing:
find_uniq([ 1, 1, 1, 2, 1, 1 ]) == 2
find_uniq([ 0, 0, 0.55, 0, 0 ]) == 0.55
I tried this:
def find_uniq(arr)
arr.uniq.each{|e| arr.count(e)}
end
It gives me the two different values in the array, but I'm not sure how to pick the one that's unique. Can I use some sort of count or not? Thanks!
This worked:
def find_uniq(arr)
return nil if arr.size < 3
if arr[0] != arr[1]
return arr[1] == arr[2] ? arr[0] : arr[1]
end
arr.each_cons(2) { |x, y| return y if y != x }
end
Thanks pjs and Cary Swoveland.
I would do this:
[ 1, 1, 1, 2, 1, 1 ]
.tally # { 1=>5, 2=>1 }
.find { |_, v| v == 1 } # [2, 1]
.first # 2
Or as 3limin4t0r suggested:
[ 1, 1, 1, 2, 1, 1 ]
.tally # { 1=>5, 2=>1 }
.invert[1] # { 5=>1, 1=>2 } => 2
The following doesn't use tallies and will short circuit the search when a unique item is found. First, it returns nil if the array has fewer than 3 elements, since there's no way to answer the question in that case. If that check is passed, it works by comparing adjacent values. It performs an up-front check that the first two elements are equal—if not, it checks against the third element to see which one is different. Otherwise, it iterates through the array and returns the first value it finds which is unequal. It returns nil if there is not a distinguished element in the array.
def find_uniq(arr)
return nil if arr.size < 3
if arr[0] == arr[1]
arr.each.with_index do |x, i|
i += 1
return arr[i] if arr[i] != x
end
elsif arr[1] == arr[2]
arr[0]
else
arr[1]
end
end
This also works with non-numeric arrays such as find_uniq(%w(c c c d c c c c)).
Thanks to Cary Swoveland for reminding me about each_cons. That can tighten up the solution considerably:
def find_uniq(arr)
return nil if arr.size < 3
if arr[0] != arr[1]
return arr[1] == arr[2] ? arr[0] : arr[1]
end
arr.each_cons(2) { |x, y| return y if y != x }
end
For all but tiny arrays this method effectively has the speed of Enumerable#find.
def find_uniq(arr)
multi = arr[0,3].partition { |e| e == arr.first }
.sort_by { |e| -e.size }.first.first
arr.find { |e| e != multi }
end
find_uniq [1, 1, 1, 2, 1, 1] #=> 2
find_uniq [0, 0, 0.55, 0, 0] #=> 0.55
find_uniq [:pig, :pig, :cow, :pig] #=> :cow
The wording of the question implies the array contains at least three elements. It certainly cannot be empty or have two elements. (If it could contain one element add the guard clause return arr.first if arr.size == 1.)
I examine the first three elements to determine the object that has duplicates, which I assign to the variable multi. I then am able to use find. find is quite fast, in part because it short-circuits (stops enumerating the array when it achieves a match).
If
arr = [1, 1, 1, 2, 1, 1]
then
a = arr[0,3].partition { |e| e == arr.first }.sort_by { |e| -e.size }
#=> [[1, 1, 1], []]
multi = a.first.first
#=> 1
If any of these:
arr = [2, 1, 1, 1, 1, 1]
arr = [1, 2, 1, 1, 1, 1]
arr = [1, 1, 2, 1, 1, 1]
apply then
a = arr[0,3].partition { |e| e == arr.first }.sort_by { |e| -e.size }
#=> [[1, 1], [2]]
multi = a.first.first
#=> 1
Let's compare the computational performance of the solutions that have been offered.
def spickermann1(arr)
arr.tally.find { |_, v| v == 1 }.first
end
def spickermann2(arr)
arr.tally.invert[1]
end
def spickermann3(arr)
arr.tally.min_by(&:last).first
end
def pjs(arr)
if arr[0] == arr[1]
arr.each.with_index do |x, i|
i += 1
return arr[i] if arr[i] != x
end
elsif arr[1] == arr[2]
arr[0]
else
arr[1]
end
end
I did not include @3limin4t0r's solution because of the author's admission that it is relatively inefficient. I did, however, include two variants of @spickermann's answer, one ("spickermann2") having been proposed by @3limin4t0r in a comment.
require 'benchmark'
def test(n)
puts "\nArray size = #{n}"
arr = Array.new(n-1,0) << 1
Benchmark.bm do |x|
x.report("Cary") { find_uniq(arr) }
x.report("spickermann1") { spickermann1(arr) }
x.report("spickermann2") { spickermann2(arr) }
x.report("spickermann3") { spickermann3(arr) }
x.report("PJS") { pjs(arr) }
end
end
test 100
Array size = 100
                   user     system      total        real
Cary           0.000032   0.000009   0.000041 (  0.000029)
spickermann1   0.000022   0.000015   0.000037 (  0.000019)
spickermann2   0.000017   0.000002   0.000019 (  0.000016)
spickermann3   0.000019   0.000002   0.000021 (  0.000018)
PJS            0.000042   0.000025   0.000067 (  0.000034)
test 10_000
Array size = 10_000
                   user     system      total        real
Cary           0.001101   0.000091   0.001192 (  0.001119)
spickermann1   0.000699   0.000096   0.000795 (  0.000716)
spickermann2   0.000794   0.000071   0.000865 (  0.000896)
spickermann3   0.000776   0.000081   0.000857 (  0.000781)
PJS            0.001140   0.000113   0.001253 (  0.001300)
test 1_000_000
Array size = 1_000_000
                   user     system      total        real
Cary           0.061148   0.000787   0.061935 (  0.063022)
spickermann1   0.043598   0.000474   0.044072 (  0.044590)
spickermann2   0.044909   0.000663   0.045572 (  0.046371)
spickermann3   0.042907   0.000210   0.043117 (  0.043162)
PJS            0.072766   0.000226   0.072992 (  0.073168)
I attribute the apparent superiority of @spickermann's answer to the fact that Enumerable#tally has no block to evaluate (unlike, for example, Enumerable#find in my answer).
Your code can be fixed by using find instead of each:
def find_uniq(arr)
arr.uniq.find { |e| arr.count(e) == 1 }
end
However this is quite inefficient since uniq needs to iterate the full collection. After finding the unique values the arr collection is iterated 1 or 2 more times by count (assuming there are only two unique values), depending on the position of the values in the uniq result.
For a simple solution I suggest looking at the answer of spickermann, which only iterates the full collection once.
For your specific scenario you could technically increase performance by short-circuiting the tally. This is done by manually tallying and breaking the loop if the tally contains 2 distinct values and at least 3 items are tallied.
def find_uniq(arr)
tally = Hash.new(0)
arr.each_with_index do |item, index|
break if tally.size == 2 && index >= 3
tally[item] += 1
end
tally.invert[1]
end
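To make the short-circuit visible, the sketch below repeats the same loop but also counts how many elements were actually tallied. The method name find_uniq_with_visits and the visited counter are illustrative additions only.

```ruby
# Sketch: the short-circuiting tally, plus a counter of visited elements.
# `find_uniq_with_visits` and `visited` are hypothetical names for illustration.
def find_uniq_with_visits(arr)
  tally = Hash.new(0)
  visited = 0
  arr.each_with_index do |item, index|
    # Two distinct values and at least three items tallied: the odd one is known.
    break if tally.size == 2 && index >= 3
    tally[item] += 1
    visited += 1
  end
  [tally.invert[1], visited]
end

arr = Array.new(1_000, 0)
arr[1] = 7
find_uniq_with_visits(arr) #=> [7, 3]
```

When the unique value sits near the front, only a handful of elements are tallied; when it is the last element, the whole array is still visited.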

If there's two maximum elements of an array?

In this code, if the user types 2 twice and 1 twice, there are two maximum elements, and both Kinder and Twix should be printed. But how? I could probably do this with more if statements, but that would make my code even longer. Is there a neater way? Can I do this with just one if?
a = [0, 0, 0,]
b = ["Kinder", "Twix", "Mars"]
while true
input = gets.chomp.to_i
if input == 1
a[0] += 1
elsif input == 2
a[1] += 1
elsif input == 3
a[2] += 1
elsif input == 0
break
end
end
index = a.index(a.max)
chocolate = b[index] if index
print a.max,chocolate
The question really has nothing to do with how the array a is constructed.
def select_all_max(a, b)
mx = a.max
b.values_at(*a.each_index.select { |i| a[i] == mx })
end
b = ["Kinder", "Twix", "Mars"]
p select_all_max [0, 2, 1], b
["Twix"]
p select_all_max [2, 2, 1], b
["Kinder", "Twix"]
See Array#values_at.
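Broken into steps, the method above first collects the indices of the maximum count and then splats them into values_at. A minimal sketch with illustrative variable names:

```ruby
bars   = ["Kinder", "Twix", "Mars"]
counts = [2, 2, 1]

mx = counts.max                                              # 2
max_indices = counts.each_index.select { |i| counts[i] == mx }
max_indices                   #=> [0, 1]
bars.values_at(*max_indices)  #=> ["Kinder", "Twix"]
```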
This could alternatively be done in a single pass.
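def select_all_max(a, b)
  max_indices = (1...a.size).each_with_object([0]) do |i, arr|
    # Compare the value at i with the value at the last recorded index.
    case a[i] <=> a[arr.last]
    when 0
      arr << i
    when 1
      # Mutate the memo in place; reassigning arr would not change the result.
      arr.replace([i])
    end
  end
  b.values_at(*max_indices)
end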
p select_all_max [0, 2, 1], b
["Twix"]
p select_all_max [2, 2, 1], b
["Kinder", "Twix"]
p select_all_max [1, 1, 1], b
["Kinder", "Twix", "Mars"]
One way would be as follows:
First, just separate the input-gathering from the counting, so we'll just gather input in this step:
inputs = []
loop do
input = gets.chomp.to_i
break if input.zero?
inputs << input
end
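Now we can tally up the inputs. If you have Ruby 2.7 or later you can simply do counts_by_input = inputs.tally to get { "Twix" => 2, "Kinder" => 2 }. (The inputs gathered above are integers, so the literal result would be { 2 => 2, 1 => 2 }; the hashes here and in what follows show the bar names instead, as if each number had first been mapped to its name, to make the steps easier to follow.) Otherwise, my preferred approach is to use group_by with transform_values: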
counts_by_input = inputs.group_by(&:itself).transform_values(&:count)
# => { "Twix" => 2, "Kinder" => 2 }
Now, since we're going to be extracting values based on their count, we want to have the counts as keys. Normally we might invert the hash, but that won't work in this case because it will only give us one value per key, and we need multiple:
inputs_by_count = counts_by_input.invert
# => { 2 => "Kinder" }
# This doesn't work, it removed one of the values
Instead, we can use another group_by and transform_values (the reason I like these methods is because they're very versatile ...):
inputs_by_count = counts_by_input.
group_by { |input, count| count }.
transform_values { |keyvals| keyvals.map(&:first) }
# => { 2 => ["Twix", "Kinder"] }
The transform_values code here is probably a bit confusing, but one important thing to understand is that often times, calling Enumerable methods on hashes converts them to [[key1, val1], [key2, val2]] arrays:
counts_by_input.group_by { |input, count| count }
# => { 2 => [["Twix", 2], ["Kinder", 2]] }
Which is why we call transform_values { |keyvals| keyvals.map(&:first) } afterwards to get our desired format { 2 => ["Twix", "Kinder"] }
Anyway, at this point getting our result is very easy:
inputs_by_count[inputs_by_count.keys.max]
# => ["Twix", "Kinder"]
I know this probably all seems a little insane, but when you get familiar with Enumerable methods you will be able to do this kind of data transformation pretty fluently.
Tl;dr, give me the codez
inputs = []
loop do
input = gets.chomp.to_i
break if input.zero?
inputs << input
end
inputs_by_count = inputs.
group_by(&:itself).
transform_values(&:count).
group_by { |keyvals, count| count }.
transform_values { |keyvals| keyvals.map(&:first) }
top_count = inputs_by_count.keys.max
inputs_by_count[top_count]
# => ["Twix", "Kinder"]
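Since the loop collects integers while the results above are shown as bar names, here is a sketch that makes the mapping explicit. The BARS constant and the top_bars method are hypothetical names, and the sketch assumes every input is between 1 and the number of bars.

```ruby
# Sketch: map numeric inputs to bar names, then find all top-selling bars.
# BARS and top_bars are illustrative names; inputs are assumed to be 1..BARS.size.
BARS = ["Kinder", "Twix", "Mars"].freeze

def top_bars(inputs, names = BARS)
  counts = inputs.map { |n| names[n - 1] }.tally        # Ruby 2.7+
  by_count = counts
    .group_by { |_name, count| count }
    .transform_values { |pairs| pairs.map(&:first) }
  by_count[by_count.keys.max] || []                     # [] when nothing was entered
end

top_bars([2, 1, 2, 1, 3]) #=> ["Twix", "Kinder"]
top_bars([3])             #=> ["Mars"]
top_bars([])              #=> []
```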
How about something like this:
maximum = a.max # => 2
top_selling_bars = a.map.with_index { |e, i| b[i] if e == maximum }.compact # => ['Kinder', 'Twix']
p top_selling_bars # => ['Kinder', 'Twix']
If you have
a = [2, 2, 0,]
b = ['Kinder', 'Twix', 'Mars']
You can calculate the maximum value in a via:
max = a.max #=> 2
and find all elements corresponding to that value via:
b.select.with_index { |_, i| a[i] == max }
#=> ["Kinder", "Twix"]

Find difference between two arrays considering duplicates [duplicate]

[1,2,3,3] - [1,2,3] produces the empty array []. Is it possible to retain duplicates so it returns [3]?
I am so glad you asked. I would like to see such a method added to the class Array in some future version of Ruby, as I have found many uses for it:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
A description of the method and links to some of its applications are given here.
By way of example:
a = [1,2,3,4,3,2,4,2]
b = [2,3,4,4,4]
a - b #=> [1]
a.difference b #=> [1,3,2,2]
Ruby v2.7 gave us the method Enumerable#tally, allowing us to replace the first line of the method with
h = other.tally
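One caveat with that substitution, as far as I can tell: tally returns a plain hash with no default value, so h[e] is nil for any element of the receiver that is missing from other, and h[e] > 0 would then raise NoMethodError. The sketch below sets the default explicitly. It is written as a standalone method under a made-up name, multiset_difference, so that it does not override the Array#difference that Ruby 2.6+ already provides (which behaves like Array#-).

```ruby
# Sketch: multiset-style difference using tally (Ruby 2.7+).
# `multiset_difference` is a hypothetical name, not a core Ruby method.
def multiset_difference(arr, other)
  remaining = other.tally
  remaining.default = 0            # elements absent from `other` count as zero
  arr.reject do |e|
    next false if remaining[e].zero?
    remaining[e] -= 1              # use up one occurrence from `other`
    true
  end
end

multiset_difference([1, 2, 3, 3], [1, 2, 3])              #=> [3]
multiset_difference([1, 2, 2, 3, 3, 2, 2], [2, 2, 3])     #=> [1, 3, 2, 2]
```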
As far as I know, you can't do this with a built-in operation. Can't see anything in the ruby docs either. Simplest way to do this would be to extend the array class like this:
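class Array
  # Note: Ruby 2.6+ already defines Array#difference (with the same
  # semantics as Array#-); this definition overrides it.
  def difference(array2)
    remaining = array2.dup   # work on a copy so the caller's array is not mutated
    final_array = []
    each do |item|
      if (i = remaining.index(item))
        remaining.delete_at(i)
      else
        final_array << item
      end
    end
    final_array              # return the result rather than each's return value
  end
end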
For all I know, there's a more efficient way to do this, too.
EDIT:
As suggested by user2864740 in question comments, using Array#slice! is a much more elegant solution
def arr_sub(a,b)
a = a.dup #if you want to preserve the original array
b.each {|del| a.slice!(a.index(del)) if a.include?(del) }
return a
end
Credit:
My original answer
def arr_sub(a,b)
b = b.each_with_object(Hash.new(0)){ |v,h| h[v] += 1 }
a = a.each_with_object([]) do |v, arr|
arr << v if b[v] < 1
b[v] -= 1
end
end
arr_sub([1,2,3,3],[1,2,3]) # => [3]
arr_sub([1,2,3,3,4,4,4],[1,2,3,4,4]) # => [3, 4]
arr_sub([4,4,4,5,5,5,5],[4,4,5,5,5,5,6,6]) # => [4]

How to check an array that it contains equal number of characters or not using Ruby

I have an array like this ['n','n','n','s','n','s','n','s','n','s'] and I want to check whether each character occurs the same number of times. The array above has 6 ns and 4 ss, so the counts are not equal. I tried several things, but nothing worked. How can I do this using Ruby?
Given array:
a = ['n','n','n','s','n','s','n','s','n','s']
Group the array by its elements and take only the values of the groups:
(f,s) = a.group_by{|e| e}.values
Compare sizes:
f.size == s.size
Result: false
Or you can try this (it relies on there being exactly two distinct characters):
x = ['n','n','n','s','n','s','n','s','n','s']
x.group_by {|c| c}.values.map(&:size).inject(:==)
You can go for something like this:
def eq_num? arr
return false if arr.size == 1
arr.uniq.map {|i| arr.count(i)}.uniq.size == 1
end
arr = ['n','n','n','s','n','s','n','s','n','s']
eq_num? arr #=> false
arr = ['n','n','n','s','n','s','s','s']
eq_num? arr #=> true
Works for more than two kinds of letters too:
arr = ['n','n','t','s','n','t','s','s','t']
eq_num? arr #=> true
Using Array#count is relatively inefficient as it requires a full pass through the array for each element whose instances are being counted. Instead use Enumerable#group_by, as others have done, or use a counting hash, as below (see Hash::new):
Code
def equal_counts?(arr)
arr.each_with_object(Hash.new(0)) { |s,h| h[s] += 1 }.values.uniq.size == 1
end
Examples
equal_counts? ['n','n','n','s','n','s','n','s','n','s']
#=> false
equal_counts? ['n','r','r','n','s','s','n','s','r']
#=> true
Explanation
For
arr = ['n','n','n','s','n','s','n','s','n','s']
the steps are as follows.
h = arr.each_with_object(Hash.new(0)) { |s,h| h[s] += 1 }
#=> {"n"=>6, "s"=>4}
a = h.values
#=> [6, 4]
b = a.uniq
#=> [6, 4]
b.size == 1
#=> false
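On Ruby 2.7 or later, the counting hash can be replaced by Enumerable#tally. A minimal sketch; the method name equal_counts_tally? is only illustrative:

```ruby
# Sketch: same check as equal_counts?, using tally (Ruby 2.7+).
# `equal_counts_tally?` is a hypothetical name for illustration.
def equal_counts_tally?(arr)
  arr.tally.values.uniq.size == 1
end

equal_counts_tally? %w[n n n s n s n s n s]  #=> false
equal_counts_tally? %w[n r r n s s n s r]    #=> true
```

Like the version above, this returns false for an empty array, because there are no counts at all to compare.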

Ruby: Find index of next match in array, or find with offset

I want to find further matches after Array#find_index { |item| block } matches for the first time. How can I search for the index of the second match, third match, and so on?
In other words, I want the equivalent of the pos argument to Regexp#match(str, pos) for Array#find_index. Then I can maintain a current-position index to continue the search.
I cannot use Enumerable#find_all because I might modify the array between calls (in which case, I will also adjust my current-position index to reflect the modifications). I do not want to copy part of the array, as that would increase the computational complexity of my algorithm. I want to do this without copying the array:
new_pos = pos + array[pos..-1].find_index do |elem|
elem.matches_condition?
end
The following are different questions. They only ask the first match in the array, plus one:
https://stackoverflow.com/questions/11300886/ruby-how-to-find-the-next-match-in-an-array
https://stackoverflow.com/questions/4596517/ruby-find-next-in-array
The following question is closer, but still does not help me, because I need to process the first match before continuing to the next (and this way also conflicts with modification):
https://stackoverflow.com/questions/9925654/ruby-find-in-array-with-offset
A simpler way to do it is just:
new_pos = pos
while new_pos < array.size and not array[new_pos].matches_condition?
new_pos += 1
end
new_pos = nil if new_pos == array.size
In fact, I think this is probably better than my other answer, because it's harder to get wrong, and there's no chance of future shadowing problems being introduced from the surrounding code. However, it's still clumsy.
And if the condition is more complex, then you end up needing to do something like this:
new_pos = pos
# this check is only necessary if pos may be == array.size
if new_pos < array.size
prepare_for_condition
end
while new_pos < array.size and not array[new_pos].matches_condition?
new_pos += 1
if new_pos < array.size
prepare_for_condition
end
end
new_pos = nil if new_pos == array.size
Or, God forbid, a begin ... end while loop (although then you run into trouble with the initial value of new_pos):
new_pos = pos - 1
begin
new_pos += 1
if new_pos < array.size
prepare_for_condition
end
end while new_pos < array.size and not array[new_pos].matches_condition?
new_pos = nil if new_pos == array.size
This may seem horrible. However, suppose prepare_for_condition is something that keeps being tweaked in small ways. Those tweaks will eventually get refactored; however, by that time, the output of the refactored code will also end up getting tweaked in small ways that don't belong with the old refactored code, but do not yet seem to justify refactoring of their own, and so on. Occasionally, someone will forget to change both places. This may seem pathological; however, in programming, as we all know, the pathological case has a habit of occurring only too often.
Here is one way this can be done. We can define a new method in Array class that will allow us to find indexes that match a given condition. The condition can be specified as block that returns boolean.
The new method returns an Enumerator so that we get the benefit of many of the Enumerator methods, such as next, to_a, etc.
ary = [1,2,3,4,5,6]
class Array
def find_index_r(&block)
Enumerator.new do |yielder|
self.each_with_index{|i, j| yielder.yield j if block.call(i)}
end
end
end
e = ary.find_index_r { |r| r % 2 == 0 }
p e.to_a #=> [1, 3, 5]
p e.next
#=> 1
p e.next
#=> 3
ary[2]=10
p ary
#=> [1, 2, 10, 4, 5, 6]
p e.next
#=> 5
e.rewind
p e.next
#=> 1
p e.next
#=> 2
Note: I added a new method to the Array class for demonstration purposes. The solution can easily be adapted to work without the monkey-patching.
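One possible adaptation without monkey-patching is a plain method that takes the array as an argument. This is only a sketch, and the name matching_indices is made up here:

```ruby
# Sketch: enumerator of indices whose elements satisfy the block,
# without reopening Array. `matching_indices` is a hypothetical name.
def matching_indices(array, &condition)
  Enumerator.new do |yielder|
    array.each_with_index do |elem, index|
      yielder << index if condition.call(elem)
    end
  end
end

evens = matching_indices([1, 2, 3, 4, 5, 6]) { |n| n.even? }
evens.to_a  #=> [1, 3, 5]
evens.next  #=> 1
evens.next  #=> 3
```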
Of course, one way to do it would be:
new_pos = pos + (pos...array.size).find_index do |index|
elem = array[index]
elem.matches_condition?
end
However, this is clumsy and easy to get wrong. For example, you may forget to add pos. Also, you have to make sure elem isn't shadowing something. Both of these can lead to hard-to-trace bugs.
I find it hard to believe that an index argument to Array#find_index and Array#index still hasn't made it into the language. However, I notice Regexp#match(str,pos) wasn't there until version 1.9, which is equally surprising.
Suppose
arr = [9,1,4,1,9,36,25]
findees = [1,6,3,6,3,7]
proc = ->(n) { n**2 }
and for each element n in findees we want the index of the first unmatched element m of arr for which proc[n] == m. For example, if n=3, then proc[3] #==> 9, so the first matching index in arr would be 0. For the next n=3 in findees, the first unmatched match in arr is at index 4.
We can do this like so:
arr = [9,1,4,1,9,36,25]
findees = [1,6,3,6,3,7]
proc = ->(n) { n**2 }
h = arr.each_with_index.with_object(Hash.new { |h,k| h[k] = [] }) { |(n,i),h| h[n] << i }
#=> {9=>[0, 4], 1=>[1, 3], 4=>[2], 36=>[5], 25=>[6]}
findees.each_with_object([]) { |n,a| v=h[proc[n]]; a << v.shift if v }
#=> [1, 5, 0, nil, 4, nil]
We can generalize this into a handy Array method as follows:
class Array
def find_indices(*args)
h = each_with_index.with_object(Hash.new {|h,k| h[k] = []}) { |(n,i),h| h[n] << i }
args.each_with_object([]) { |n,a| v=h[yield n]; a << v.shift if v }
end
end
arr.find_indices(*findees) { |n| n**2 }
#=> [1, 5, 0, nil, 4, nil]
arr = [3,1,2,1,3,6,5]
findees = [1,6,3,6,3,7]
arr.find_indices(*findees, &:itself)
#=> [1, 5, 0, nil, 4, nil]
My approach is not much different from the others, but is perhaps packaged more cleanly to be syntactically similar to Array#find_index. Here's the compact form.
def find_next_index(a,prior=nil)
(((prior||-1)+1)...a.length).find{|i| yield a[i]}
end
Here's a simple test case.
test_arr = %w(aa ab ac ad)
puts find_next_index(test_arr){|v| v.include?('a')}
puts find_next_index(test_arr,1){|v| v.include?('a')}
puts find_next_index(test_arr,3){|v| v.include?('a')}
# evaluates to:
# 0
# 2
# nil
And of course, with a slight rewrite you could monkey-patch it into the Array class.
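A sketch of what that rewrite might look like, together with a loop that handles each match before asking for the next one. The method name find_next_index on Array is an assumption carried over from the function above; it is not part of Ruby's core API.

```ruby
class Array
  # Hypothetical method, not part of core Ruby: index of the first element
  # after position `prior` (or from the start when prior is nil) that
  # satisfies the block, or nil when there is none.
  def find_next_index(prior = nil)
    start = prior.nil? ? 0 : prior + 1
    (start...length).find { |i| yield self[i] }
  end
end

words = %w[aa ab ac ad]
found = []
pos = nil
# Each match can be processed (here: recorded) before searching on.
while (pos = words.find_next_index(pos) { |w| w > 'ab' })
  found << pos
end
found #=> [2, 3]
```

Because the search always restarts from a position rather than from a copied slice, the caller remains free to modify the array between calls, as long as pos is adjusted to match.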
