Can't get updated values in array after using .map method

I need to implement a method, which works that way:
# do_magic("abcd") # "Aaaa-Bbb-Cc-D"
# do_magic("a") # "A"
# do_magic("ab") # "Aa-B"
# do_magic("teSt") # "Tttt-Eee-Ss-T"
My approach was to split the string into an array, iterate over that array, and save the result. The code works properly inside the block, but I can't get the array with the updated values out of it: the method returns the original characters joined by dashes (for example "t-e-S-t" when ".map" is used, or "3-2-1-0" when ".map!" is used):
def do_magic(str)
letters = str.split ''
counter = letters.length
while counter > 0
letters.map! do |letter|
(letter * counter).capitalize
counter -= 1
end
end
puts letters.join('-')
end
Where is the mistake?

You're so close. When you pass a block (here to letters.map!), the block's return value is the value of its last evaluated expression. In this case that is counter -= 1, so the decremented counter is what gets mapped into letters.
Try
l = (letter * counter).capitalize
counter -= 1
l
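Putting that fix into the original method, a minimal sketch might look like the following. Two things here are my own assumptions rather than part of the answer: the outer while loop is dropped (a single map! pass already brings counter down to zero), and the method returns the string instead of printing it so the result can be checked.

```ruby
def do_magic(str)
  letters = str.split('')
  counter = letters.length
  letters.map! do |letter|
    repeated = (letter * counter).capitalize
    counter -= 1
    repeated # the block's last expression is what map! stores
  end
  letters.join('-')
end

puts do_magic("teSt")
```

The local variable repeated plays the role of l in the snippet above.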

You can try something like this using each_with_index
def do_magic(str)
letters = str.split("")
length = letters.length
new_letters = []
letters.each_with_index do |letter, i|
new_letters << (letter * (length - i)).capitalize
end
new_letters.join("-")
end
Or use each_with_index.map, which is Ruby's equivalent of a map_with_index:
def do_magic(str)
letters = str.split("")
length = letters.length
letters.each_with_index.map { |letter, i|
(letter * (length - i)).capitalize
}.join("-")
end

I suggest the following.
def do_magic(letters)
length = letters.size
letters.downcase.each_char.with_index.with_object([]) { |(letter, i), new_letters|
new_letters << (letter * (length - i)).capitalize }.join
end
do_magic 'teSt'
# => "TtttEeeSsT"
Let's go through the steps.
letters = 'teSt'
length = letters.size
#=> 4
str = letters.downcase
#=> "test"
enum0 = str.each_char
#=> #<Enumerator: "test":each_char>
enum1 = enum0.with_index
#=> #<Enumerator: #<Enumerator: "test":each_char>:with_index>
enum2 = enum1.with_object([])
#=> #<Enumerator: #<Enumerator: #<Enumerator: "test":each_char>:
# with_index>:with_object([])>
Carefully examine the return values from the creation of the enumerators enum0, enum1 and enum2. The latter two may be thought of as compound enumerators.
The first element is generated by enum2 (the value of enum2.next) and the block variables are assigned values using array decomposition (also called destructuring).
(letter, i), new_letters = enum2.next
#=> [["t", 0], []]
letter
#=> "t"
i #=> 0
new_letters
#=> []
The block calculation is then performed.
m = letter * (length - i)
#=> "tttt"
n = m.capitalize
#=> "Tttt"
new_letters << n
#=> ["Tttt"]
The next element is generated by enum2, passed to the block and the block calculations are performed.
(letter, i), new_letters = enum2.next
#=> [["e", 1], ["Tttt"]]
letter
#=> "e"
i #=> 1
new_letters
#=> ["Tttt"]
Notice how new_letters has been updated. The block calculation is as follows.
m = letter * (length - i)
#=> "eee"
n = m.capitalize
#=> "Eee"
new_letters << n
#=> ["Tttt", "Eee"]
After the last two elements of enum2 are generated we have
new_letters
#=> ["Tttt", "Eee", "Ss", "T"]
The last step is to combine the elements of new_letters to form a single string.
new_letters.join
#=> "TtttEeeSsT"
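The question's expected output separates the groups with dashes. A variant that passes '-' to join could look like this; do_magic_dashed is a hypothetical name, used only to keep it apart from the method above.

```ruby
def do_magic_dashed(str)
  length = str.size
  str.downcase.each_char.with_index.with_object([]) { |(letter, i), acc|
    # repeat each letter (length - i) times, then upcase only the first one
    acc << (letter * (length - i)).capitalize
  }.join('-')
end

puts do_magic_dashed('teSt')
```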


If there's two maximum elements of an array?

In this code, if the user types 2 twice and 1 twice, there are two maximum elements, and both Kinder and Twix should be printed. But how? I could probably do this with more if statements, but that would make my code even longer. Is there a neater way? Can I do it with just one if?
a = [0, 0, 0,]
b = ["Kinder", "Twix", "Mars"]
while true
input = gets.chomp.to_i
if input == 1
a[0] += 1
elsif input == 2
a[1] += 1
elsif input == 3
a[2] += 1
elsif input == 0
break
end
end
index = a.index(a.max)
chocolate = b[index] if index
print a.max,chocolate
The question really has nothing to do with how the array a is constructed.
def select_all_max(a, b)
mx = a.max
b.values_at(*a.each_index.select { |i| a[i] == mx })
end
b = ["Kinder", "Twix", "Mars"]
p select_all_max [0, 2, 1], b
["Twix"]
p select_all_max [2, 2, 1], b
["Kinder", "Twix"]
See Array#values_at.
This could alternatively be done in a single pass.
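def select_all_max(a, b)
b.values_at(
*(1..a.size-1).each_with_object([0]) do |i,arr|
# arr holds the indices of the largest value seen so far
case a[i] <=> a[arr.last]
when 0
arr << i
when 1
# replace mutates the memo; assigning to arr would only rebind the block variable
arr.replace([i])
end
end
)
end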
p select_all_max [0, 2, 1], b
["Twix"]
p select_all_max [2, 2, 1], b
["Kinder", "Twix"]
p select_all_max [1, 1, 1], b
["Kinder", "Twix", "Mars"]
One way would be as follows:
First, just separate the input-gathering from the counting, so we'll just gather input in this step:
inputs = []
loop do
input = gets.chomp.to_i
break if input.zero?
inputs << input
end
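Now we can tally up the inputs. If you have Ruby 2.7 or later you can simply do counts_by_input = inputs.tally to get { "Twix" => 2, "Kinder" => 2 } (the examples here show chocolate names for readability; because the inputs were converted with to_i, the actual keys are the integers 1 to 3, which map to names via b[input - 1]). Otherwise, my preferred approach is to use group_by with transform_values: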
counts_by_input = inputs.group_by(&:itself).transform_values(&:count)
# => { "Twix" => 2, "Kinder" => 2 }
Now, since we're going to be extracting values based on their count, we want to have the counts as keys. Normally we might invert the hash, but that won't work in this case because it will only give us one value per key, and we need multiple:
inputs_by_count = counts_by_input.invert
# => { 2 => "Kinder" }
# This doesn't work, it removed one of the values
Instead, we can use another group_by and transform_values (the reason I like these methods is because they're very versatile ...):
inputs_by_count = counts_by_input.
group_by { |input, count| count }.
transform_values { |keyvals| keyvals.map(&:first) }
# => { 2 => ["Twix", "Kinder"] }
The transform_values code here is probably a bit confusing, but one important thing to understand is that often times, calling Enumerable methods on hashes converts them to [[key1, val1], [key2, val2]] arrays:
counts_by_input.group_by { |input, count| count }
# => { 2 => [["Twix", 2], ["Kinder", 2]] }
Which is why we call transform_values { |keyvals| keyvals.map(&:first) } afterwards to get our desired format { 2 => ["Twix", "Kinder"] }
Anyway, at this point getting our result is very easy:
inputs_by_count[inputs_by_count.keys.max]
# => ["Twix", "Kinder"]
I know this probably all seems a little insane, but when you get familiar with Enumerable methods you will be able to do this kind of data transformation pretty fluently.
Tl;dr, give me the codez
inputs = []
loop do
input = gets.chomp.to_i
break if input.zero?
inputs << input
end
inputs_by_count = inputs.
group_by(&:itself).
transform_values(&:count).
group_by { |input, count| count }.
transform_values { |keyvals| keyvals.map(&:first) }
top_count = inputs_by_count.keys.max
inputs_by_count[top_count]
# => ["Twix", "Kinder"]
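Since the inputs gathered above are integers, here is a rough sketch of the same idea using tally (Ruby 2.7 or later) that maps the winning numbers back to names. The NAMES constant, the top_choices method name, and the assumption that every input lies between 1 and NAMES.size are mine, carried over from the question's b array.

```ruby
NAMES = ["Kinder", "Twix", "Mars"]

# Returns the names of all choices that share the highest count.
# Assumes every input is an integer between 1 and names.size.
def top_choices(inputs, names = NAMES)
  counts = inputs.tally
  return [] if counts.empty?
  top = counts.values.max
  counts.select { |_, count| count == top }.keys.sort.map { |n| names[n - 1] }
end

p top_choices([2, 2, 1, 1])
```

Sorting the keys just makes the result follow the order of the names array rather than the order in which the inputs were typed.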
How about something like this:
maximum = a.max # => 2
top_selling_bars = a.map.with_index { |e, i| b[i] if e == maximum }.compact # => ['Kinder', 'Twix']
p top_selling_bars # => ['Kinder', 'Twix']
If you have
a = [2, 2, 0,]
b = ['Kinder', 'Twix', 'Mars']
You can calculate the maximum value in a via:
max = a.max #=> 2
and find all elements corresponding to that value via:
b.select.with_index { |_, i| a[i] == max }
#=> ["Kinder", "Twix"]

Get the longest prefix in arrays Ruby

I have an array of arrays. Within each subarray, if two or more elements share a prefix whose length equals to or is greater than eight, then I want to replace those elements by their longest prefix. For this array:
m = [
["A", "97455589955", "97455589920", "97455589921"],
["B", "2348045101518", "2348090001559"]
]
I expect an output like this:
n = [
["A", "974555899"],
["B", "2348045101518", "2348090001559"]
]
For first subarray in m, the longest prefix is "974555899" of length nine.
974555899-55
974555899-20
974555899-21
For the second subarray, the longest prefix is "23480" of length five, and that is shorter than eight. In this case, the second subarray is left as is.
23480-45101518
23480-90001559
For this input:
m = [
["A", "2491250873330", "249111222333", "2491250872214", "2491250872213"],
["B", "221709900000"],
["C", "6590247968", "6590247969", "6598540040", "65985400217"]
]
The output should be like this:
[
["A", "2491250873330", "249111222333", "249125087221"],
["B", "221709900000"],
["C", "659024796", "65985400"]
]
For array m[0], there is no prefix long enough shared by all four of its numbers, but there is a prefix 249125087221 of length twelve shared by m[0][3] and m[0][4]. For array m[2], there is a prefix "659024796" of length nine shared by m[2][1] and m[2][2], and another prefix "65985400" of length eight shared by m[2][3] and m[2][4].
I constructed the code below:
m.map{|x, *y|
[x, y.map{|z| z[0..7]}.uniq].flatten
}
With my code with the first input, I get this output.
[
["A", "97455589"],
["B", "23480451", "23480900"]
]
I'm stuck on how to get dynamically the common prefix without setting a fixed length.
Code
def doit(arr, min_common_length)
arr.map do |label, *values|
[label, values.group_by { |s| s[0, min_common_length] }.
map { |_,a| a.first[0, nbr_common_digits(a, min_common_length)] }]
end
end
def nbr_common_digits(a, min_common_length)
max_digits = a.map(&:size).min
return max_digits if max_digits == min_common_length + 1
(min_common_length..max_digits).find { |i|
a.map { |s| s[i] }.uniq.size > 1 } || max_digits
end
Example
arr = [["A","2491250873330","249111222333","2491250872214","2491250872213"],
["B","221709900000"],
["C","6590247968","6590247969","6598540040","65985400217"]]
doit(arr, 8)
#=> [["A", ["249125087", "249111222333"]],
# ["B", ["221709900000"]],
# ["C", ["659024796", "65985400"]]]
Explanation
Let's first consider the helper method, nbr_common_digits. Suppose
a = ["123467", "12345", "1234789"]
min_common_length = 2
then the steps are as follows.
max_digits = a.map(&:size).min
#=> 5 (the length of "12345")
max_digits == min_common_length + 1
#=> 5 == 2 + 1
#=> false, so do not return max_digits
b = (min_common_length..max_digits).find { |i| a.map { |s| s[i] }.uniq.size > 1 }
#=> (2..5).find { |i| a.map { |s| s[i] }.uniq.size > 1 }
#=> 4
At this point we must consider the possibility that b will equal nil, which occurs when the first 5 characters of all strings are equal. In that case we should return max_digits, which is why we require the following.
b || max_digits
#=> 4
In doit the steps are as follows.
min_common_length = 8
Firstly, we use Enumerable#group_by to group values by their first min_common_length digits.
arr.map { |label, *values| [label,
values.group_by { |s| s[0, min_common_length] }] }
#=> [["A", {"24912508"=>["2491250873330", "2491250872214", "2491250872213"],
# "24911122"=>["249111222333"]}],
# ["B", {"22170990"=>["221709900000"]}],
# ["C", {"65902479"=>["6590247968", "6590247969"],
# "65985400"=>["6598540040", "65985400217"]}]]
The second step is to compute the longest common lengths and replace values as required.
arr.map do |label, *values| [label,
values.group_by { |s| s[0, min_common_length] }.
map { |_,a| a.first[0, nbr_common_digits(a, min_common_length)] }]
end
#=> [["A", ["249125087", "249111222333"]],
# ["B", ["221709900000"]],
# ["C", ["659024796", "65985400"]]]
The first block variable in the second map's block (whose value equals a string with min_common_length characters--group_by's grouping criterion) is represented by an underscore (a legitimate local variable) to signify that it is not used in the block calculation.
This is an interesting problem. Here's my solution:
def lcn(lim, *arr)
# compute all substrings of lengths >= lim and build a lookup by length
lookup = lcn_explode(lim, arr)
# first pass: look for largest common number among all elements
res, = lcn_filter(arr, lookup) { |size| size == arr.size }
return res unless res.empty?
# second pass: look for largest common number among some elements
res, rem = lcn_filter(arr, lookup) { |size| size > 1 }
# append remaining candidates with no matches
res.concat(rem)
end
def lcn_explode(lim, arr)
memo = Hash.new { |h, k| h[k] = Array.new }
arr.uniq.each do |n|
lim.upto([n.size, lim].max) do |i|
memo[i] << [n[0, i], n]
end
end
memo
end
def lcn_filter(arr, lookup)
memo = []
lookup.keys.sort!.reverse_each do |i|
break if arr.empty?
matches = Hash.new { |h, k| h[k] = Array.new }
lookup[i].each do |m, n|
matches[m] << n if arr.include?(n)
end
matches.each_pair do |m, v|
next unless yield v.size
memo << m
# remove elements from input array so they won't be reused
arr -= v
end
end
return memo, arr
end
You use it like so:
p lcn(8, "97455589955", "97455589920", "97455589921") # => ["974555899"]
Or:
m.each do |key, *arr|
p [key, *lcn(8, *arr)]
end
Which prints:
["A", "249125087221", "2491250873330", "249111222333"]
["B", "221709900000"]
["C", "659024796", "65985400"]
Your task can be split into two parts: calculating the Largest Common Number (LCN) and modifying the original array.
The Largest Common Number operates on arrays, so it could be a method of Array.
After calculating the LCN you can just compare its length with the limit (i.e. 8).
class Array
def lcn
first.length.times do |index|
numb = first[0..index]
# on the first mismatch, return the prefix that matched so far
return first[0, index] unless self[1..-1].all? { |n| n.start_with?(numb) }
end
first
end
end
def task(m, limit = 8)
m.map { |i,*n| [i, n.lcn.length >= limit ? n.lcn : n].flatten }
end
task(m) # => [["A", "974555899"], ["B", "2348045101518", "2348090001559"]]
In your solution you do not actually implement lcn finding and filtering output.
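If you would rather not reopen Array, the prefix computation can also be sketched as a plain method. This is a different technique from the loop above: it relies on the fact that the common prefix of a whole list equals the common prefix of its lexicographically smallest and largest strings. The name common_prefix is hypothetical.

```ruby
# Longest prefix shared by every string in the list.
def common_prefix(strings)
  return "" if strings.empty?
  shortest, longest = strings.minmax # lexicographic min and max
  i = 0
  i += 1 while i < shortest.size && shortest[i] == longest[i]
  shortest[0, i]
end

p common_prefix(["97455589955", "97455589920", "97455589921"])
p common_prefix(["2348045101518", "2348090001559"])
```

Comparing the length of the result with the limit (8 in the question) then decides whether the group is replaced or left as is.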

How can I refactor this Ruby method to run faster?

The method below is supposed to take an array a and return the duplicated integer whose second index value is the lowest. The array will only include integers between 1 and a.length. With this example,
firstDuplicate([1,2,3,2,4,5,1])
the method returns 2.
def firstDuplicate(a)
num = 1
big_num_array = []
a.length.times do
num_array = []
if a.include?(num)
num_array.push(a.index(num))
a[a.index(num)] = "x"
if a.include?(num)
num_array.unshift(a.index(num))
num_array.push(num)
end
big_num_array.push(num_array) if num_array.length == 3
end
num += 1
end
if big_num_array.length > 0
big_num_array.sort![0][2]
else
-1
end
end
The code works, but seems longer than necessary and doesn't run fast enough. I am looking for ways to refactor this.
You could count the entries as you go and use Enumerable#find to stop iterating as soon as you find something again:
h = { }
a.find do |e|
h[e] = h[e].to_i + 1 # The `to_i` converts `nil` to zero without a bunch of noise.
h[e] == 2
end
You could also say:
h = Hash.new(0) # to auto-vivify with zeros
a.find do |e|
h[e] += 1
h[e] == 2
end
or use Hash#fetch with a default value:
h = { }
a.find do |e|
h[e] = h.fetch(e, 0) + 1
h[e] == 2
end
find will stop as soon as it finds an element that makes that block true so this should be reasonably efficient.
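Wrapped up as a method with the question's -1 fallback, that could look roughly like this. The snake_case name first_duplicate is my own choice, and the sketch assumes the array holds only integers as the question states (find returns nil when nothing matches, which is what the || -1 relies on).

```ruby
def first_duplicate(a)
  seen = Hash.new(0)
  # find stops at the first element whose count reaches 2
  a.find { |e| (seen[e] += 1) == 2 } || -1
end

p first_duplicate([1, 2, 3, 2, 4, 5, 1])
p first_duplicate([1, 2, 3])
```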
Here are two ways that could be done quite simply.
Use a set
require 'set'
def first_dup(arr)
st = Set.new
arr.find { |e| st.add?(e).nil? }
end
first_dup [1,2,3,2,4,5,4,1,4]
#=> 2
first_dup [1,2,3,4,5]
#=> nil
See Set#add?.
Use Array#difference
def first_dup(arr)
arr.difference(arr.uniq).first
end
first_dup [1,2,3,2,4,5,4,1,4]
#=> 2
first_dup [1,2,3,4,5]
#=> nil
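I have found Array#difference to be sufficiently useful that I proposed it be added to the Ruby core (but it doesn't seem to be gaining traction). Note that Ruby 2.6 and later ship a built-in Array#difference that behaves like Array#- (it removes every occurrence), so the definition below overrides it. It is as follows: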
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
As explained at the link, it differs from Array#- as follows:
a = [1,2,2,3,3,2,2]
b = [2,2,3]
a - b
#=> [1]
a.difference(b)
#=> [1,3,2,2]
That is, difference "removes" one 2 in a for each 2 in b (and similarly for 3), preserving the order of what's left of a. a is not mutated, however.
The steps in the example given above for the present problem are as follows.
arr = [1,2,3,2,4,5,4,1,4]
a = arr.uniq
#=> [1,2,3,4,5]
b = arr.difference(a)
#=> [2, 4, 1, 4]
b.first
#=> 2
If you are looking for top performance, Ruby is probably not the best language of choice. If you are looking for readability, here you go:
[1,2,3,2,4,5,1].
map. # or each (less readable, probably faster)
with_index.
group_by(&:shift). # or group_by(&:first)
min_by { |v, a| a[1] && a[1].last || Float::INFINITY }.
first
#⇒ 2

How to check an array that it contains equal number of characters or not using Ruby

I have an array like this ['n','n','n','s','n','s','n','s','n','s'] and I want to check whether each character occurs the same number of times. In the array above there are 6 ns and 4 ss, so the counts are not equal. I tried several things, but nothing worked. How can I do this using Ruby?
Given array:
a = ['n','n','n','s','n','s','n','s','n','s']
Group the array by its elements and take only the values of the resulting hash:
(f,s) = a.group_by{|e| e}.values
Compare sizes:
f.size == s.size
Result: false
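Or you can try this (it only works when there are exactly two distinct characters, because inject(:==) compares the running true/false result with the next size):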
x = ['n','n','n','s','n','s','n','s','n','s']
x.group_by {|c| c}.values.map(&:size).inject(:==)
You can go for something like this:
def eq_num? arr
return false if arr.size == 1
arr.uniq.map {|i| arr.count(i)}.uniq.size == 1
end
arr = ['n','n','n','s','n','s','n','s','n','s']
eq_num? arr #=> false
arr = ['n','n','n','s','n','s','s','s']
eq_num? arr #=> true
Works for more than two kinds of letters too:
arr = ['n','n','t','s','n','t','s','s','t']
eq_num? arr #=> true
Using Array#count is relatively inefficient as it requires a full pass through the array for each element whose instances are being counted. Instead use Enumerable#group_by, as others have done, or use a counting hash, as below (see Hash::new):
Code
def equal_counts?(arr)
arr.each_with_object(Hash.new(0)) { |s,h| h[s] += 1 }.values.uniq.size == 1
end
Examples
equal_counts? ['n','n','n','s','n','s','n','s','n','s']
#=> false
equal_counts? ['n','r','r','n','s','s','n','s','r']
#=> true
Explanation
For
arr = ['n','n','n','s','n','s','n','s','n','s']
the steps are as follows.
h = arr.each_with_object(Hash.new(0)) { |s,h| h[s] += 1 }
#=> {"n"=>6, "s"=>4}
a = h.values
#=> [6, 4]
b = a.uniq
#=> [6, 4]
b.size == 1
#=> false
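On Ruby 2.7 or later the counting hash can also be built with Enumerable#tally. A short sketch, where equal_counts_tally? is a hypothetical name chosen to avoid clashing with the method above:

```ruby
def equal_counts_tally?(arr)
  # tally returns a hash of element => number of occurrences
  arr.tally.values.uniq.size == 1
end

p equal_counts_tally?(['n','n','n','s','n','s','n','s','n','s'])
p equal_counts_tally?(['n','r','r','n','s','s','n','s','r'])
```

Like the version above, this returns false for an empty array, since there are no counts to compare.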

Appending to an array value in a hash

I'm parsing multiple websites and trying to build a hash that looks something like:
"word" => [[01.html, 2], [02.html, 7], [03.html, 4]]
where word is a given word in the index, the first value in each sublist is the file it was found in, and the second value is the number of occurrences in that given file.
I'm running into an issue where, rather than appending ["02.html", 7] inside the values list, it creates a whole new entry for "word" and puts ["02.html", 7] at the end of the hash. This results in basically giving me individual indexes for all of my websites appended after each other rather than giving me one master index.
Here is my code:
for token in tokens
if !invindex.include?(token)
invindex[token] = [[doc_name, 1]] #adds the word to the hash with the doc name and occurrence of 1
else
for list in invindex[token]
if list[0] == doc_name
list[1] += 1 #adds one to the occurrence with the same doc_name
else
invindex[token].insert([doc_name, 1]) #this SHOULD append the doc name and initial occurrence inside the word's value list since the word is already in the hash
end
end
end
end
end
Hopefully it's something simple and I just missed something when I traced it on paper.
I'm running into an issue where, rather than appending ["02.html", 7]
inside the values list, it creates a whole new entry for "word" and
puts ["02.html", 7] at the end of the hash.
I'm not seeing that:
invindex = {
word1: [
['01.html', 2],
]
}
tokens = %i[
word1
word2
word3
]
doc_name = '02.html'
tokens.each do |token|
if !invindex.include?(token)
invindex[token] = [[doc_name, 1]] #adds the word to the hash with the doc name and occurrence of 1
else
invindex[token].each do |list|
if list[0] == doc_name
list[1] += 1 #adds one to the occurrence with the same doc_name
else
invindex[token].insert([doc_name, 1]) #this SHOULD append the doc name and initial occurrence inside the word's value list since the word is already in the hash
end
end
end
end
p invindex
--output:--
{:word1=>[["01.html", 2]], :word2=>[["02.html", 1]], :word3=>[["02.html", 1]]}
invindex[token].insert([doc_name, 1]) #this SHOULD append the doc name
Nope:
invindex = {
word: [
['01.html', 2],
]
}
token = :word
doc_name = '02.html'
invindex[token].insert([doc_name, 7])
p invindex
invindex[token].insert(-1, ["02.html", 7])
p invindex
--output:--
{:word=>[["01.html", 2]]}
{:word=>[["01.html", 2], ["02.html", 7]]}
Array#insert() requires that you specify an index as the first argument. Typically when you want to append something to the end, you use <<:
invindex = {
word: [
['01.html', 2],
]
}
token = :word
doc_name = '02.html'
invindex[token] << [doc_name, 7]
p invindex
--output:--
{:word=>[["01.html", 2], ["02.html", 7]]}
for token in tokens
Rubyists don't use for-in loops because for-in loops call each(), so rubyists call each() directly:
tokens.each do |token|
...
end
Finally, indenting in ruby is 2 spaces--not 3 spaces, not 1 space, not 4 spaces. It's 2 spaces.
Applying all that to your code:
invindex = {
word1: [
['01.html', 2],
]
}
tokens = %i[
word1
word2
word3
]
doc_name = '01.html'
tokens.each do |token|
if !invindex.include?(token)
invindex[token] = [[doc_name, 1]] #adds the word to the hash with the doc name and occurrence of 1
else
invindex[token].each do |list|
if list[0] == doc_name
list[1] += 1 #adds one to the occurrence with the same doc_name
else
invindex[token] << [doc_name, 1] #this SHOULD append the doc name and initial occurrence inside the word's value list since the word is already in the hash
end
end
end
end
p invindex
--output:--
{:word1=>[["01.html", 3]], :word2=>[["01.html", 1]], :word3=>[["01.html", 1]]}
However, there is still a problem, which is due to the fact that you are changing an Array that you are stepping through--a big no-no in computer programming:
invindex[token].each do |list|
if list[0] == doc_name
list[1] += 1 #adds one to the occurrence with the same doc_name
else
invindex[token] << [doc_name, 1] #***PROBLEM***
Look what happens:
invindex = {
word1: [
['01.html', 2],
]
}
tokens = %i[
word1
word2
word3
]
%w[ 01.html 02.html].each do |doc_name|
tokens.each do |token|
if !invindex.include?(token)
invindex[token] = [[doc_name, 1]] #adds the word to the hash with the doc name and occurrence of 1
else
invindex[token].each do |list|
if list[0] == doc_name
list[1] += 1 #adds one to the occurrence with the same doc_name
else
invindex[token] << [doc_name, 1] #this SHOULD append the doc name and initial occurrence inside the word's value list since the word is already in the hash
end
end
end
end
end
p invindex
--output:--
{:word1=>[["01.html", 3], ["02.html", 2]], :word2=>[["01.html", 1], ["02.html", 2]], :word3=>[["01.html", 1], ["02.html", 2]]}
Problem 1: You don't want to insert [doc_name, 1] every time the sub Array you are examining doesn't contain the doc_name--you only want to insert [doc_name, 1] after ALL the sub Arrays have been examined, and the doc_name wasn't found. If you run the example above with the starting hash:
invindex = {
word1: [
['01.html', 2],
['02.html', 7],
]
}
...you will see that the output is even worse.
Problem 2: Appending [doc_name, 1] to the Array while you are stepping through the Array means that [doc_name, 1] will be examined, too, when the loop gets to the end of the Array--and then your loop will increment its count to 2. The rule is: don't change an Array you are stepping through because bad things will happen.
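One way to avoid both problems, keeping the array-of-pairs format from the question, is to look for the matching entry first and only append once the search has finished. This is a sketch under that assumption; record_token is a hypothetical helper name, not something from the question.

```ruby
def record_token(invindex, token, doc_name)
  entries = (invindex[token] ||= [])
  # search first; nothing is appended while the array is being walked
  entry = entries.find { |name, _count| name == doc_name }
  if entry
    entry[1] += 1
  else
    entries << [doc_name, 1]
  end
  invindex
end

invindex = { word1: [['01.html', 2]] }
%w[01.html 02.html].each do |doc_name|
  %i[word1 word2].each { |token| record_token(invindex, token, doc_name) }
end
p invindex
```

With this version each document contributes exactly one pair per word, and a word seen once in 02.html ends up with a count of 1 rather than 2.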
Do you actually need to have a hash that contains an array of arrays?
This can be much better described with a nested hash
invindex = {
"word" => { '01.html' => 2, '02.html' => 7, '03.html' => 4 },
"other" => { '01.html' => 1, '02.html' => 17, '04.html' => 4 }
}
which can be easily populated by using a Hash factory like
invindex = Hash.new { |h,k| h[k] = Hash.new {|hh,kk| hh[kk] = 0} }
tokens.each do |token|
invindex[token][doc_name] += 1
end
Now, if you absolutely need to have the format you mention, you can get it from the invindex described above with a simple iteration:
result = {}
invindex.each {|k,v| result[k] = v.to_a }
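Put together for several documents, the nested-hash approach might be sketched like this. The build_index name and the shape of the docs argument (a hash of document name to token list) are assumptions made for illustration; Hash.new(0) is used for the inner hash as a shorter way to default counts to zero.

```ruby
# docs: hash of document name => array of tokens (an assumed input shape)
def build_index(docs)
  index = Hash.new { |h, word| h[word] = Hash.new(0) }
  docs.each do |doc_name, tokens|
    tokens.each { |token| index[token][doc_name] += 1 }
  end
  # convert each inner hash to the [[doc, count], ...] format from the question
  index.transform_values(&:to_a)
end

docs = { "01.html" => %w[word other word], "02.html" => %w[word] }
p build_index(docs)
```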
Suppose:
arr = %w| 01.html 02.html 03.html 02.html 03.html 03.html |
#=> ["01.html", "02.html", "03.html", "02.html", "03.html", "03.html"]
is an array of your files for a given word in the index. Then the value of that word in the hash is given by constructing the counting hash:
h = arr.each_with_object(Hash.new(0)) { |s,h| h[s] += 1 }
#=> {"01.html"=>1, "02.html"=>2, "03.html"=>3}
and then converting it to an array:
h.to_a
#=> [["01.html", 1], ["02.html", 2], ["03.html", 3]]
so you could write:
arr.each_with_object(Hash.new(0)) { |s,h| h[s] += 1 }.to_a
Hash::new is given a default value of zero. That means that if the hash being constructed, h, does not have a key s, h[s] returns zero. In that case:
h[s] += 1
#=> h[s] = h[s] + 1
# = 0 + 1 = 1
and when the same value of s in arr is passed to the block:
h[s] += 1
#=> h[s] = h[s] + 1
# = 1 + 1 = 2
You may consider whether it would be better to make the value of each word in the index the hash h.
