How to batch enumerables in ruby - arrays

In my quest to understand Ruby's Enumerable, I have something similar to the following:
FileReader.read(very_big_file)
.lazy
.flat_map {|line| get_array_of_similar_words } # array.size is ~10
.each_slice(100) # wait for 100 items
.map{|array| process_100_items}
Since each flat_map call emits an array of ~10 items, I expected the each_slice call to batch the items in hundreds, i.e. to wait until 100 items had accumulated before passing them to the final .map call, but that does not appear to be the case.
How do I achieve functionality similar to the buffer function in reactive programming?

To see how lazy affects the calculations, let's look at an example. First construct a file:
str =<<~_
Now is the
time for all
good Ruby coders
to come to
the aid of
their bowling
team
_
fname = 't'
File.write(fname, str)
#=> 82
and specify the slice size:
slice_size = 4
Now I will read lines, one-by-one, split the lines into words, remove duplicate words and then append those words to an array. As soon as the array contains at least 4 words I will take the first four and map them into the longest word of the 4. The code to do that follows. To show how the calculations progress I will salt the code with puts statements. Note that IO::foreach without a block returns an enumerator.
IO.foreach(fname).
lazy.
tap { |o| puts "o1 = #{o}" }.
flat_map { |line|
puts "line = #{line}"
puts "line.split.uniq = #{line.split.uniq} "
line.split.uniq }.
tap { |o| puts "o2 = #{o}" }.
each_slice(slice_size).
tap { |o| puts "o3 = #{o}" }.
map { |arr|
puts "arr = #{arr}, arr.max = #{arr.max_by(&:size)}"
arr.max_by(&:size) }.
tap { |o| puts "o3 = #{o}" }.
to_a
#=> ["time", "good", "coders", "bowling", "team"]
The following is displayed:
o1 = #<Enumerator::Lazy:0x00005992b1ab6970>
o2 = #<Enumerator::Lazy:0x00005992b1ab6880>
o3 = #<Enumerator::Lazy:0x00005992b1ab6678>
o3 = #<Enumerator::Lazy:0x00005992b1ab6420>
line = Now is the
line.split.uniq = ["Now", "is", "the"]
line = time for all
line.split.uniq = ["time", "for", "all"]
arr = ["Now", "is", "the", "time"], arr.max = time
line = good Ruby coders
line.split.uniq = ["good", "Ruby", "coders"]
arr = ["for", "all", "good", "Ruby"], arr.max = good
line = to come to
line.split.uniq = ["to", "come"]
line = the aid of
line.split.uniq = ["the", "aid", "of"]
arr = ["coders", "to", "come", "the"], arr.max = coders
line = their bowling
line.split.uniq = ["their", "bowling"]
arr = ["aid", "of", "their", "bowling"], arr.max = bowling
line = team
line.split.uniq = ["team"]
arr = ["team"], arr.max = team
If the line lazy. is removed the return value is the same but the following is displayed (.to_a at the end now being superfluous):
o1 = #<Enumerator:0x00005992b1a438f8>
line = Now is the
line.split.uniq = ["Now", "is", "the"]
line = time for all
line.split.uniq = ["time", "for", "all"]
line = good Ruby coders
line.split.uniq = ["good", "Ruby", "coders"]
line = to come to
line.split.uniq = ["to", "come"]
line = the aid of
line.split.uniq = ["the", "aid", "of"]
line = their bowling
line.split.uniq = ["their", "bowling"]
line = team
line.split.uniq = ["team"]
o2 = ["Now", "is", "the", "time", "for", "all", "good", "Ruby",
"coders", "to", "come", "the", "aid", "of", "their",
"bowling", "team"]
o3 = #<Enumerator:0x00005992b1a41a08>
arr = ["Now", "is", "the", "time"], arr.max = time
arr = ["for", "all", "good", "Ruby"], arr.max = good
arr = ["coders", "to", "come", "the"], arr.max = coders
arr = ["aid", "of", "their", "bowling"], arr.max = bowling
arr = ["team"], arr.max = team
o3 = ["time", "good", "coders", "bowling", "team"]
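To check the batching behaviour without a file, here is a minimal sketch that uses a small in-memory array as a hypothetical stand-in for the lines (the names lines and batches are made up for illustration). It suggests that each_slice on a lazy enumerator does collect items across flat_map boundaries, which is the buffer-like behaviour the question asks about:

```ruby
# Hypothetical stand-in for the file: each "line" expands into 3 items.
lines = %w[a b c d]

batches = lines.lazy
  .flat_map { |line| ["#{line}1", "#{line}2", "#{line}3"] }
  .each_slice(5)                     # a slice may span several flat_map results
  .map { |batch| batch.join(",") }
  .to_a                              # forces the lazy chain to run

p batches
```

The last batch holds whatever is left over once the input is exhausted, so it can be shorter than the slice size.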

Related

Count the distance between two strings in an array in Ruby

I have an array:
line_one = ["flinders street", "richmond", "east richmond", "burnley", "hawthorn", "glenferrie"]
user_input1 = "flinders street"
user_input2 = "glenferrie"
How could I count the distance between the two strings?
Expected output: 5.
The first thing that comes to mind:
line_one = ["flinders street", "richmond", "east richmond", "burnley", "hawthorn", "glenferrie"]
user_input1 = "flinders street"
user_input2 = "glenferrie"
(line_one.find_index(user_input1) - line_one.find_index(user_input2)).abs
#=> 5
line_one = ["flinders street", "richmond", "east richmond", "burnley", "hawthorn", "glenferrie"]
Code
p (line_one.index("flinders street")...line_one.index("glenferrie")).count
output
5
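Note that the range-based version only gives 5 when the first station comes before the second in the array. A minimal sketch of a helper that works in either direction and returns nil for an unknown name follows; stops_between is a hypothetical name, not something from the question:

```ruby
line_one = ["flinders street", "richmond", "east richmond", "burnley", "hawthorn", "glenferrie"]

# Hypothetical helper: number of steps between two entries, in either order.
def stops_between(line, from, to)
  i = line.index(from)
  j = line.index(to)
  return nil if i.nil? || j.nil?   # one of the names is not in the array
  (i - j).abs
end

p stops_between(line_one, "flinders street", "glenferrie")
p stops_between(line_one, "glenferrie", "flinders street")
```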

Musical Script Solution

Hi guys,
I am a beginner in Ruby, and while practicing I wrote a musical script. One point is costing me sleep: when I type Scale.major_by_note('C') in irb everything is fine, but Scale.major_by_note('C#') doesn't work; for it to work I must pass "C#/Db". Please help me make it work with "C", "C#" and "C#/Db" alike. Thank you! The script is below:
class Scale
NATURAL = %w[C D E F G A B].freeze
ACCIDENT = %w[C# Db D# Eb F# Gb G# Ab A# Bb].freeze
CHROMATIC = %w[C C#/Db D D#/Eb E F F#/Gb G G# A A#/Bb B].freeze
SCALE_MAJOR_PATTERN = [0, 2, 4, 5, 7, 9, 11, 12].freeze # T T st T T T st
SCALE_MINOR_PATTERN = [0, 2, 3, 5, 7, 8, 10, 12].freeze # T st T T st T T
def self.show_all_scales(note)
major = Scale.major_by_note(note)
minor = Scale.minor_by_note(note)
all = { major: major, minor: minor}
end
def self.major_by_note(note)
major_note_index = CHROMATIC.index(note)
SCALE_MAJOR_PATTERN.map do |major_interval| # Iteration
major_scale_note_index = major_note_index + major_interval
if major_scale_note_index <= (CHROMATIC.length - 1)
CHROMATIC[major_scale_note_index]
else
reseted_major_scale_note_index = major_scale_note_index - CHROMATIC.length
CHROMATIC[reseted_major_scale_note_index]
end
end
end
def self.minor_by_note(note)
minor_note_index = CHROMATIC.index(note)
SCALE_MINOR_PATTERN.map do |minor_interval|
minor_scale_note_index = minor_note_index + minor_interval
if minor_scale_note_index <= (CHROMATIC.length - 1)
CHROMATIC[minor_scale_note_index]
else
reseted_minor_scale_note_index = minor_scale_note_index - CHROMATIC.length
CHROMATIC[reseted_minor_scale_note_index]
end
end
end
end
When you type
%w[C C#/Db D D#/Eb E F F#/Gb G G# A A#/Bb B]
Ruby is turning this into an Array of Strings:
["C", "C#/Db", "D", "D#/Eb", "E", "F", "F#/Gb", "G", "G#", "A", "A#/Bb", "B"]
Now while you know C# and Db are the same note, Ruby doesn't. It thinks the note in this case is called C#/Db. When it tries to find CHROMATIC.index("C#") it is returning nil because there is no C# in the Array.
A solution could be to write it like this:
CHROMATIC = %w[C C# D D# E F F# G G# A A# B].freeze
CHROMATIC_PAIR_MAP = {
"Db" => "C#",
"Eb" => "D#",
"Gb" => "F#",
"Ab" => "G#",
"Bb" => "A#",
}.freeze
...
def self.index_of_note(note)
CHROMATIC.index(note) ||
CHROMATIC.index(CHROMATIC_PAIR_MAP[note])
end
def self.major_by_note(note)
major_note_index = index_of_note(note)
Here I am adding a helper method that gets the index of the note either straight from the CHROMATIC array or, failing that, by looking the note up as a key in the CHROMATIC_PAIR_MAP Hash. The Hash lookup is only performed if CHROMATIC.index(note) returns nil.
This is what I get in the console (irb):
irb(main):191:0> Scale.major_by_note("C#")
=> ["C#", "D#", "F", "F#", "G#", "A#", "C", "C#"]
irb(main):192:0> Scale.major_by_note("Db")
=> ["C#", "D#", "F", "F#", "G#", "A#", "C", "C#"]
irb(main):193:0> Scale.major_by_note("D#")
=> ["D#", "F", "G", "G#", "A#", "C", "D", "D#"]
irb(main):194:0> Scale.major_by_note("Eb")
=> ["D#", "F", "G", "G#", "A#", "C", "D", "D#"]
The full new class:
class Scale
NATURAL = %w[C D E F G A B].freeze
ACCIDENT = %w[C# Db D# Eb F# Gb G# Ab A# Bb].freeze
CHROMATIC = %w[C C# D D# E F F# G G# A A# B].freeze
SCALE_MAJOR_PATTERN = [0, 2, 4, 5, 7, 9, 11, 12].freeze # T T st T T T st
SCALE_MINOR_PATTERN = [0, 2, 3, 5, 7, 8, 10, 12].freeze # T st T T st T T
CHROMATIC_PAIR_MAP = {
"Db" => "C#",
"Eb" => "D#",
"Gb" => "F#",
"Ab" => "G#",
"Bb" => "A#",
}.freeze
def self.show_all_scales(note)
major = Scale.major_by_note(note)
minor = Scale.minor_by_note(note)
all = { major: major, minor: minor}
end
def self.major_by_note(note)
major_note_index = index_of_note(note)
SCALE_MAJOR_PATTERN.map do |major_interval| # Iteration
major_scale_note_index = major_note_index + major_interval
if major_scale_note_index <= (CHROMATIC.length - 1)
CHROMATIC[major_scale_note_index]
else
reseted_major_scale_note_index = major_scale_note_index - CHROMATIC.length
CHROMATIC[reseted_major_scale_note_index]
end
end
end
def self.minor_by_note(note)
minor_note_index = index_of_note(note)
SCALE_MINOR_PATTERN.map do |minor_interval|
minor_scale_note_index = minor_note_index + minor_interval
if minor_scale_note_index <= (CHROMATIC.length - 1)
CHROMATIC[minor_scale_note_index]
else
reseted_minor_scale_note_index = minor_scale_note_index - CHROMATIC.length
CHROMATIC[reseted_minor_scale_note_index]
end
end
end
def self.index_of_note(note)
CHROMATIC.index(note) ||
CHROMATIC.index(CHROMATIC_PAIR_MAP[note])
end
end
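The if/else wrap-around in both methods can also be written with the modulo operator, which lets major and minor scales share one method. The following is only a sketch under that assumption; ScaleSketch, FLAT_TO_SHARP, MAJOR, MINOR and scale are hypothetical names, not part of the class above:

```ruby
module ScaleSketch
  CHROMATIC = %w[C C# D D# E F F# G G# A A# B].freeze
  # Flat spellings mapped to the sharp spellings used in CHROMATIC.
  FLAT_TO_SHARP = {
    "Db" => "C#", "Eb" => "D#", "Gb" => "F#", "Ab" => "G#", "Bb" => "A#"
  }.freeze
  MAJOR = [0, 2, 4, 5, 7, 9, 11, 12].freeze
  MINOR = [0, 2, 3, 5, 7, 8, 10, 12].freeze

  # Builds a scale from a root note and an interval pattern.
  def self.scale(note, pattern)
    root = CHROMATIC.index(FLAT_TO_SHARP.fetch(note, note))
    raise ArgumentError, "unknown note: #{note.inspect}" if root.nil?
    # Modulo wraps indexes past the end of CHROMATIC back to the start.
    pattern.map { |interval| CHROMATIC[(root + interval) % CHROMATIC.length] }
  end
end

p ScaleSketch.scale("Db", ScaleSketch::MAJOR)
p ScaleSketch.scale("Eb", ScaleSketch::MINOR)
```

Raising on an unknown note avoids the NoMethodError that would otherwise come from adding an interval to nil.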

How to merge 2 arrays where value in one matches a value in another with different key in Ruby

I have an array that contains other arrays of items with prices, but when an item goes on sale a new item is created. How do I merge the two, or pull a value from one into the other, to make one array in which the sale price replaces the non-sale price but still carries the original price?
Example:
items=[{"id": 123, "price": 100, "sale": false},{"id":456,"price":25,"sale":false},{"id":678, "price":75, "sale":true, "parent_price_id":123}]
Transform into:
items=[{"id":456,"price":25,"sale":false},{"id":678, "price":75, "sale":true, "parent_price_id":123, "original_price": 100}]
It's not the prettiest solution, but here's one way you can do it. I added a minitest spec to check it against the values you provided and it gives the answer you're hoping for.
require "minitest/autorun"
def merge_prices(prices)
  # Map each ID to its price hash for quick parent lookups
  price_map = prices.to_h { |price| [price[:id], price] }
  # IDs of the original prices that are superseded by a sale price
  parent_ids = prices.filter_map { |price| price[:parent_price] }
  prices
    .reject { |price| parent_ids.include?(price[:id]) } # drop the originals
    .map do |price|
      next price unless price.key?(:parent_price)
      # merge returns a new hash, so the caller's data is left untouched
      price.merge(original_price: price_map[price[:parent_price]][:price])
    end
end
describe "Merge prices" do
  it "should work" do
    input = [
      {"id":123, "price": 100, "sale": false},
      {"id":456,"price":25,"sale": false},
      {"id":678, "price":75, "sale": true, "parent_price":123}
    ].freeze
    expected_output = [
      {"id":456,"price":25,"sale": false},
      {"id":678, "price":75, "sale": true, "parent_price":123, "original_price": 100}
    ].freeze
    # assert_equal takes the expected value first
    assert_equal(expected_output, merge_prices(input))
  end
end
Let's begin by defining items in an equivalent, but more familiar, way:
items = [
[{:id=>123, :price=>100, :sale=>false}],
[{:id=>456, :price=>25, :sale=>false}],
[{:id=>678, :price=>75, :sale=>true, :parent_price=>123}]
]
with the desired return value being:
[
{:id=>456, :price=>25, :sale=>false},
{:id=>678, :price=>75, :sale=>true, :parent_price=>123,
:original_price=>100}
]
I assume that h[:sale] #=> false for every hash h that is referenced by some other hash g, that is, for which g[:parent_price] == h[:id].
A convenient first step is to create the following hash.
h = items.map { |(h)| [h[:id], h] }.to_h
#=> {123=>{:id=>123, :price=>100, :sale=>false},
# 456=>{:id=>456, :price=>25, :sale=>false},
# 678=>{:id=>678, :price=>75, :sale=>true, :parent_price=>123}}
Then:
h.keys.each { |k| h[k][:original_price] =
h.delete(h[k][:parent_price])[:price] if h[k][:sale] }
#=> [123, 456, 678] (not used)
h #=> {456=>{:id=>456, :price=>25, :sale=>false},
# 678=>{:id=>678, :price=>75, :sale=>true, :parent_price=>123,
# :original_price=>100}}
Notice that Hash#delete returns the value of the deleted key.
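A small standalone illustration of that behaviour, using made-up data (prices and removed are hypothetical names):

```ruby
prices = { 123 => { price: 100 }, 456 => { price: 25 } }

removed = prices.delete(123)   # removes the key and returns its value
p removed                      # the hash that was stored under 123
p prices.keys                  # only 456 is left
p prices.delete(999)           # a missing key returns nil
```

Because a missing key returns nil, chaining [:price] onto delete raises NoMethodError when the parent ID is not present in the hash.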
The last two steps are to extract the values from this hash and replace items with the resulting array of hashes:
items.replace(h.values)
#=> [{:id=>456, :price=>25, :sale=>false},
# {:id=>678, :price=>75, :sale=>true, :parent_price=>123,
# :original_price=>100}]
See Array#replace.
If desired we could combine these steps as follows.
items.replace(
items.map { |(h)| [h[:id], h] }.to_h.tap do |h|
h.keys.each { |k| h[k][:original_price] =
h.delete(h[k][:parent_price])[:price] if h[k][:sale] }
end.values)
#=> [{:id=>456, :price=>25, :sale=>false},
# {:id=>678, :price=>75, :sale=>true, :parent_price=>123,
# :original_price=>100}]
See Object#tap.
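As a reminder of what tap does: it yields the receiver to the block and then returns the receiver itself, ignoring the block's return value. A minimal sketch with made-up values:

```ruby
# tap lets us peek at an intermediate value without breaking the chain.
result = [3, 1, 2].sort
  .tap { |sorted| puts "sorted = #{sorted}" }
  .map { |n| n * 10 }

# The block's return value is discarded; the receiver comes back.
same = [1, 2].tap { |arr| arr.size }

p result
p same
```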

Performing union of two arrays with custom rules

I have two arrays
b = ["John Roberts", "William Koleva", "Lili Joe", "Victoria Jane", "Allen Thomas"]
a = ["Jon Roberts", "Wil Koleva", "Lilian Joe", "Vic Jane", "Al Thomas"]
Currently I am using the union operator on these two arrays, like this: a | b. When combined, even though the names in each array are the "same" name (they're just using the shortened version of the name), it will duplicate my names.
My proposed solution to this is simply choose the first occurrence of first initial + last name as the name to perform the union on, however, I don't recall there being any methods in Ruby that can perform such an operation.
So the result of some_method(a | b) will return c which is just:
["John Roberts", "William Koleva", "Lili Joe", "Victoria Jane", "Allen Thomas"]
I am wondering how I could go about achieving this?
b = ["John Roberts", "William Koleva", "Lili Joe", "Victoria Jane", "Allen Thomas"]
a = ["Jon Roberts", "Wil Koleva", "Lilian Joe", "Vic Jane", "Al Thomas"]
r = /
\s # match a space
[[:alpha:]]+ # match > 0 alphabetic characters
\z # match end of string
/x # free-spacing regex definition mode
(b+a).uniq { |str| [str[0], str[r]] }
#=> ["John Roberts", "William Koleva", "Lili Joe", "Victoria Jane", "Allen Thomas"]
This uses the form of the method Array#uniq that employs a block.
You may alternatively write (b|a).uniq { |str| [str[0], str[r]] }
The steps are as follows.
c = b+a
# => ["John Roberts", "William Koleva", "Lili Joe", "Victoria Jane", "Allen Thomas",
# "Jon Roberts", "Wil Koleva", "Lilian Joe", "Vic Jane", "Al Thomas"]
The first element of c passed to the block is
str = c.first
#=> "John Roberts"
so the block calculation is
[str[0], str[r]]
#=> ["J", " Roberts"]
The calculations are similar for all the other elements of c. The upshot is that
c.uniq { |str| [str[0], str[r]] }
is equivalent to keeping, for each distinct key of the form [<first name initial>, <last name>], the first element of c that produces that key. The distinct keys are the elements of the array d, where
d = [["J", "Roberts"], ["W", "Koleva"], ["L", "Joe"], ["V", "Jane"], ["A", "Thomas"],
["J", "Roberts"], ["W", "Koleva"], ["L", "Joe"], ["V", "Jane"], ["A", "Thomas"]].uniq
#=> [["J", "Roberts"], ["W", "Koleva"], ["L", "Joe"], ["V", "Jane"], ["A", "Thomas"]]
Pascal suggested that it would be better for uniq's block to return a string:
{ |str| "#{str[0]} #{str[r]}" }
(e.g., "J Roberts") which might instead be written
{ |str| str.sub(/(?<=.)\S+/,"") }
The inclusion of the space after the first initial is optional (e.g., "JRoberts" would also work).
Sure, just use Enumerable#uniq with a block:
c = (b | a).uniq do |full_name| # b first, so its full names win
first_name, last_name = full_name.split(nil, 2)
[first_name[0], last_name]
end
Note: the first iteration of the code used the initials instead of abbreviated name.
Perhaps you can introduce the concept of a Name? It's a bit more code than just providing a block to uniq but it nicely encapsulates everything related.
class Name
def initialize(first, last)
@first, @last = first, last
end
def abbreviated
"#{@first[0]} #{@last}"
end
def eql?(other)
return false if !other.respond_to?(:abbreviated)
abbreviated == other.abbreviated
end
def hash
abbreviated.hash
end
def full
"#{@first} #{@last}"
end
end
a = Name.new('John', 'Roberts')
b = Name.new('Jon', 'Roberts')
c = Name.new('William', 'Koleva')
d = Name.new('Wil', 'Koleva')
x = [a, c]
y = [b, d]
p (y | x).map(&:full)
It's worth noting that abbreviated firstname does not really suffice to check equality of names.
Consider:
Jim Jackson
James Jackson
Janine Jackson
...
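To make that caveat concrete, here is a minimal sketch that uses the same initial-plus-last-name key as the answers above; name_key is a hypothetical helper name. All three Jacksons end up with the same key, so uniq keeps only the first of them:

```ruby
# Hypothetical helper: the [initial, last name] key used for de-duplication.
def name_key(full_name)
  first_name, last_name = full_name.split(nil, 2)
  [first_name[0], last_name]
end

names = ["Jim Jackson", "James Jackson", "Janine Jackson"]

kept    = names.uniq { |name| name_key(name) }
grouped = names.group_by { |name| name_key(name) }

p kept      # only the first Jackson survives
p grouped   # shows which names collapsed onto the same key
```

Inspecting the group_by result before de-duplicating is one way to spot names that would be merged by mistake.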

Splitting subarrays in ruby nokogiri web scraper

Hello, I just finished the following tutorials: https://github.com/ryandhaase/Web-Scraper/blob/master/airbnb_scraper.rb and https://medium.com/@tabor_francesca/web-scraper-airbnb-24d67939b08a#.mg7ny2tke. I am now practicing and am having trouble splitting subarrays. Everything works, except that I cannot split the city, state and zip code into separate Excel columns.
The following line is incorrect, how can I fix it?
city << [subarray[0], "this is not working", subarray[1]]
My guess is there is another line that needs to be fixed.
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'csv'
url = "https://www.tesla.com/findus/list/stores/United+States"
page = Nokogiri::HTML(URI.open(url)) # URI.open is required on Ruby 3.0+
puts page.class
name = []
street_address = []
extended_address = []
city = []
state = []
zip = []
page.css('a.fn.org.url').each do |line|
name << line.text.strip
end
page.css('span.street-address').each do |line|
street_address << line.text
end
page.css('span.extended-address').each do |line|
extended_address << line.text
end
page.css('span.locality').each do |line|
subarray = line.text.strip.split(/ · /)
if subarray.length == 3
city << subarray
else
city << [subarray[0], "this is not working", subarray[1]]
end
end
CSV.open("teslaStores.csv", "w") do |file|
file << ["Name", "Street Address", "Street Address Continued", "City", "State", "Zip"]
name.length.times do |i|
file << [name[i], street_address[i], extended_address[i], city[i], city[i][0], city[i][1]]
end
end
Just as a FYI, this is untested, but more idiomatic code in Ruby:
require 'csv'
require 'nokogiri'
require 'open-uri'
page = Nokogiri::HTML(URI.open('https://www.tesla.com/findus/list/stores/United+States')) # URI.open is required on Ruby 3.0+
name = page.css('a.fn.org.url').map{ |n| n.text.strip }
street_address = page.css('span.street-address').map { |n| n.text }
extended_address = page.css('span.extended-address').map{ |n| n.text }
city = page.css('span.locality').map { |n|
subarray = n.text.strip.split(/ · /)
if subarray.length == 3
subarray
else
[subarray[0], 'this is not working', subarray[1]]
end
}
CSV.open('teslaStores.csv', 'w') do |file|
file << ['Name', 'Street Address', 'Street Address Continued', 'City', 'State', 'Zip']
name.length.times do |i|
file << [name[i], street_address[i], extended_address[i], city[i], city[i][0], city[i][1]]
end
end
And that can be reduced a bit further:
street_address, extended_address = [
'span.street-address',
'span.extended-address'
].map{ |selector|
page.css(selector).map { |n| n.text }
}
So, I went to a meetup.com event on Python and asked one of the instructors for assistance, even though the class was not on this topic :). The teacher explained that I needed to split by commas and spaces, whereas before I was splitting on the ' · ' separator.
I had to change this:
page.css('span.locality').each do |line|
subarray = line.text.strip.split(/ · /)
if subarray.length == 3
city << subarray
else
city << [subarray[0], "this is not working", subarray[1]]
end
end
To this:
page.css('span.locality').each do |line|
subarray = line.text.strip.split(',')
subarray2 = subarray[1].split(' ')
city << subarray[0]
state << subarray2[0]
zip << subarray2[1]
end
Here's the full answer:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'csv'
url = "https://www.tesla.com/findus/list/stores/United+States"
page = Nokogiri::HTML(URI.open(url)) # URI.open is required on Ruby 3.0+
puts page.class
name = []
street_address = []
extended_address = []
city = []
state = []
zip = []
page.css('a.fn.org.url').each do |line|
name << line.text.strip
end
page.css('span.street-address').each do |line|
street_address << line.text
end
page.css('span.extended-address').each do |line|
extended_address << line.text
end
page.css('span.locality').each do |line|
subarray = line.text.strip.split(',')
subarray2 = subarray[1].split(' ')
city << subarray[0]
state << subarray2[0]
zip << subarray2[1]
end
CSV.open("teslaStores.csv", "w") do |file|
file << ["Name", "Street Address", "Street Address Continued", "City", "State", "Zip"]
name.length.times do |i|
file << [name[i], street_address[i], extended_address[i], city[i], state[i], zip[i]]
end
end
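One thing to watch for: subarray[1].split raises NoMethodError if a locality has no comma. A minimal sketch of a more forgiving split follows. It assumes the text looks like "City, ST 12345"; split_locality and the sample strings are hypothetical, not taken from the page:

```ruby
# Hypothetical helper; assumes input shaped like "City, ST 12345".
# Missing parts come back as nil instead of raising.
def split_locality(text)
  city, rest = text.strip.split(",", 2)
  state, zip = rest.to_s.split        # nil.to_s is "", which splits to []
  [city, state, zip]
end

p split_locality("Scottsdale, AZ 85251")
p split_locality("Scottsdale")
```

Each row could then be written as [name[i], street_address[i], extended_address[i], *split_locality(text)].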
