Rules on Parentheses for Block Variables - arrays

I ran across the following piece of code while reading The Ruby Way:
class Array
def invert
each_with_object({}).with_index { |(elem, hash), index| hash[elem] = index }
end
end
I want to make sure that I understand what the parentheses are doing in (elem, hash).
The first method (each_with_object({})) will yield two objects to the block. The first object will be the element in the array; the second object will be the hash. The parentheses make sure that those two objects are assigned to different block variables. If I had instead used { |elem, index| #code }, then elem would be an array consisting of the element and the hash. I think that is clear.
My confusion lies with the fact that if I didn't chain these two methods, I would not have to use the parentheses, and instead could use: each_with_object({}) { |elem, obj| #code }.
What are the rules about when parentheses are necessary in block variables? Why do they differ between the two examples here? My simplistic explanation is that, when the methods are not chained, then the yield code looks like yield (elem, obj), but when the methods are chained, the code looks like yield([elem, obj], index). (We can surmise that a second array would be passed in if we chained a third method). Is this correct? Is the object(s) passed in from the last chained method not an array?
I guess instead of all this conjecture, the question boils down to: "What does the yield statement look like when chaining methods that accept blocks?"

Your question is only tangentially concerned with blocks and block variables. Rather, it concerns the rules for "disambiguating" arrays (more commonly called array decomposition or destructuring).
Let's consider your example:
[1,2,3].each_with_object({}).with_index {|(elem, hash), index| hash[elem] = index}
We have:
enum0 = [1,2,3].each_with_object({})
#=> #<Enumerator: [1, 2, 3]:each_with_object({})>
We can see this enumerator's elements by converting it to an array:
enum0.to_a
#=> [[1, {}], [2, {}], [3, {}]]
We next have:
enum1 = enum0.with_index
#=> #<Enumerator: #<Enumerator: [1, 2, 3]:each_with_object({})>:with_index>
enum1.to_a
#=> [[[1, {}], 0], [[2, {}], 1], [[3, {}], 2]]
You might want to think of enum1 as a "compound enumerator", but it's just an enumerator.
You see that enum1 has three elements. These elements are passed to the block by Enumerator#each. The first is:
enum1.first
#=> [[1, {}], 0]
If we had a single block variable, say a, then
a #=> [[1, {}], 0]
We could instead break this down in different ways using "disambiguation". For example, we could write:
a,b = [[1, {}], 0]
a #=> [1, {}]
b #=> 0
Now let's try to pull out all the elements:
a,b,c = [[1, {}], 0]
a #=> [1, {}]
b #=> 0
c #=> nil
Whoops! That's not what we wanted. We've just experienced the "ambiguous" in "disambiguate". We need to write this so that our intentions are unambiguous. We do that by adding parentheses. By doing so, you are telling Ruby, "decompose the array in this position to its constituent elements". We have:
(a,b),c = [[1, {}], 0]
a #=> 1
b #=> {}
c #=> 0
Disambiguation can be extremely useful. Suppose, for example, a method returned the array:
[[1,[2,3],[[4,5],{a: 6}]],7]
and we wish to pull out all the individual values. We could do that as follows:
(a,(b,c),((d,e),f)),g = [[1,[2,3],[[4,5],{a: 6}]],7]
a #=> 1
b #=> 2
c #=> 3
d #=> 4
e #=> 5
f #=> {:a=>6}
g #=> 7
Again, you just have to remember that the parentheses simply mean "decompose the array in this position to its constituent elements".
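To tie this back to the original question about what the yield looks like, here is a minimal sketch. The method yield_like_chain is hypothetical; it only mimics the shape of what the chained enumerator passes to the block (two arguments, the first of which is itself an array) and is not how Ruby implements with_index internally.

```ruby
# Hypothetical method that mimics what the chained enumerator hands to the
# block: two arguments, the first of which is itself a two-element array.
def yield_like_chain
  yield [:a, {}], 0
end

# Without parentheses, the first block variable receives the whole inner array.
flat = yield_like_chain { |pair, index| [pair, index] }

# With parentheses, the inner array is decomposed into its own variables.
nested = yield_like_chain { |(elem, hash), index| [elem, hash, index] }

# The real chain accepts the same block signature and returns the memo hash.
inverted = %w[x y z].each_with_object({}).with_index do |(elem, hash), index|
  hash[elem] = index
end

p flat     # [[:a, {}], 0]
p nested   # [:a, {}, 0]
p inverted
```

The parentheses in the block signature follow the same nesting as the arrays being yielded, exactly as in the multiple-assignment examples.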

The rule is basic: every enumerator has a “signature.” If it yields two values, for example, the block passed to it should accept two parameters:
[1,2,3].each_with_index { |o, i| ...}
When a yielded object can itself be expanded, such as a hash item, wrap the corresponding block parameters in parentheses to expand it. Whenever the iterator yields an array, this [*arr]-like expansion is permitted.
The following example might shed a light on this:
[1,2,3].each_with_object('first') # yielding |e, obj|
.with_index # yielding |elem, idx|
# but wait! elem might be expanded here ⇑⇑⇑⇑
# |(e, obj), idx|
.each_with_object('second') do |((elem, fst_obj), snd_idx), trd_obj|
puts "e: #{elem}, 1o: #{fst_obj}, 2i: #{snd_idx}, 3o: #{trd_obj}"
end
#⇒ e: 1, 1o: first, 2i: 0, 3o: second
#⇒ e: 2, 1o: first, 2i: 1, 3o: second
#⇒ e: 3, 1o: first, 2i: 2, 3o: second

Related

How to remove a single value from an array in Ruby?

I want to remove the first instance of the lowest value in the array.
arr = [1,2,3,1,2,3]
arr.reject {|i| i == arr.min}
#=> [2,3,2,3]
But my code removes all instances of the lowest value in the array. I want a result such that:
[...]
#=> [2,3,1,2,3]
What's the most elegant solution to this problem?
At first blush, here are a couple of options:
arr.delete_at(arr.index(arr.min))
# or less readable but still valid
arr.delete_at arr.index arr.min
arr.delete_at(arr.each_with_index.min[1])
# or
arr.delete_at(arr.each_with_index.min.pop)
# or
arr.delete_at(arr.each_with_index.min.last)
The first is less code and more readable but makes two passes through the list instead of one. I have doubts as to whether any other construct will surpass option #1 in elegance, as ugly as it may (or may not?) be.
Note that both choices crash on an empty array. Here's a safer version:
arr.delete_at(arr.index(arr.min) || 0)
Just out of curiosity:
[1,2,3,1,2,3].tap { |a| a.delete_at a.each_with_index.min.last }
#⇒ [2, 3, 1, 2, 3]
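You can use Enumerable#drop_while for this purpose, but only when the lowest value sits at the front of the array, as it does here. drop_while stops at the first element that fails the test, so it would leave [2, 1, 3] unchanged, and it would remove every leading copy of the minimum from [1, 1, 2]: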
arr = [1,2,3,1,2,3]
arr.drop_while { |i| i == arr.min }
#=> [2, 3, 1, 2, 3]
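As a quick check of the approaches above, here is a small sketch that runs them on an array whose minimum is not at the front. The helper name remove_first_min is made up for illustration; it works on a copy so the input is left untouched, and it returns an empty array for empty input instead of raising.

```ruby
# Hypothetical helper: returns a copy of arr without the first occurrence
# of its smallest element. An empty input yields an empty array.
def remove_first_min(arr)
  copy = arr.dup
  idx = copy.index(copy.min) # nil when the array is empty
  copy.delete_at(idx) if idx
  copy
end

sample = [3, 1, 2, 1]

via_helper     = remove_first_min(sample)
via_drop_while = sample.drop_while { |i| i == sample.min }

p via_helper     # [3, 2, 1]
p via_drop_while # [3, 1, 2, 1], nothing dropped because 3 is not the minimum
p remove_first_min([])
```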

Sort an array of arrays by the number of occurrences in Ruby

This question is different from this one.
I have an array of arrays of AR items looking something like:
[[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]
I would like to sort it by the number of occurrences of each inner array:
[[7,8,9], [1,2,3], [4,5,6]]
My real data is more complex, looking something like:
raw_data = {}
raw_data[:grapers] = []
suggested_data = {}
suggested_data[:grapers] = []
varietals = []
similar_vintage.varietals.each do |varietal|
# sub_array
varietals << Graper.new(:name => varietal.grape.name, :grape_id => varietal.grape_id, :percent => varietal.percent)
end
raw_data[:grapers] << varietals
So, I want to sort raw_data[:grapers] by the number of occurrences of each varietals array, comparing the grape_id values inside them.
When I need to sort a plain array of data by number of occurrences, I do this:
grapers_with_frequency = raw_data[:grapers].inject(Hash.new(0)) { |h,v| h[v] += 1; h }
suggested_data[:grapers] << raw_data[:grapers].max_by { |v| grapers_with_frequency[v] }
This code doesn't work because the elements are sub-arrays containing AR models that I need to analyze.
Possible solution:
array.group_by(&:itself) # grouping
.sort_by {|k, v| -v.size } # sorting
.map(&:first) # optional step, depends on your real data
#=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
I recommend you take a look at the Ruby documentation for the sort_by method. It allows you to sort an array using anything associated with the elements, rather than the values of the elements.
my_array.sort_by { |elem| -my_array.count(elem) }.uniq
=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
This example sorts by the count of each element in the original array. This is preceded with a minus so that the elements with the highest count are first. The uniq is to only have one instance of each element in the final result.
You can include anything you like in the sort_by block.
As Ilya has pointed out, having my_array.count(elem) in each iteration will be costlier than using group_by beforehand. This may or may not be an issue for you.
arr = [[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]
arr.each_with_object(Hash.new(0)) { |a,h| h[a] += 1 }.
sort_by(&:last).
reverse.
map(&:first)
#=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
This uses the form of Hash::new that takes an argument (here 0) that is the hash's default value.
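For the asker's real data, where each inner array holds model objects rather than plain integers, one possible adaptation is to group on the grape_id values instead of on the objects themselves. The sketch below uses a Struct named Graper as a hypothetical stand-in for the ActiveRecord model, and the sample blends are invented for illustration.

```ruby
# Hypothetical stand-in for the AR model; only the fields used here.
Graper = Struct.new(:grape_id, :percent)

blends = [
  [Graper.new(1, 60), Graper.new(2, 40)],
  [Graper.new(3, 100)],
  [Graper.new(1, 70), Graper.new(2, 30)]
]

# Group blends by their list of grape ids, then order the id lists by how
# many blends share them (most frequent first).
ranked_ids = blends.group_by { |blend| blend.map(&:grape_id) }
                   .sort_by { |_ids, group| -group.size }
                   .map { |ids, _group| ids }

p ranked_ids # [[1, 2], [3]]
```

Grouping on a derived key avoids relying on how the model objects compare or hash; whether percentages should also be part of the key depends on the real requirements.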

How do I check to see if an array of arrays has a value within the inner arrays?

Say I have an array of arrays that looks like this:
[[1830, 1], [1859, 1]]
What I want to do is quickly scan the internal arrays to see if any of them contain the number 1830. If it does, I want it to return the entire array that includes the number 1830, aka [1830, 1] from the above example.
I know for a normal array of values, I would just do array.include? 1830, but that doesn't work here, as can be seen here:
@add_lines_num_start
#=> [[1830, 1], [1859, 1]]
@add_lines_num_start.include? 1830
#=> false
@add_lines_num_start.first.include? 1830
#=> true
How do I do that?
a = [[1830, 1], [1859, 1]]
a.find { |ar| ar.grep(1830).any? }
#=> [1830, 1]
References:
Enumerable#find
Enumerable#grep
edit 1
As @Ilya mentioned in a comment, instead of traversing the whole inner array with grep, you could use include?, which returns a boolean as soon as a matching element is found:
a.find { |ar| ar.include?(1830) }
References:
Enumerable#include?
edit 2 (shamelessly stolen from @Cary's comment under OP)
In case you have more than one matching array in your array, you can use Enumerable#find_all:
a = [[1830, 1], [1859, 1], [1893, 1830]]
a.find_all { |ar| ar.include?(1830) }
#=> [[1830, 1], [1893, 1830]]
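One pitfall worth checking when choosing between grep and include? inside a find block: grep returns an array, and an empty array is truthy in Ruby, so a bare grep makes find return the first inner array no matter what it contains. The sketch below puts the matching row second to make the difference visible; the variable names are illustrative only.

```ruby
rows = [[1859, 1], [1830, 1]]

# grep returns [] for the first row, and [] is truthy, so find stops there.
with_bare_grep = rows.find { |row| row.grep(1830) }

# include? returns a real boolean, so find skips rows without a match.
with_include = rows.find { |row| row.include?(1830) }

# find returns nil when no inner array contains the value.
no_match = rows.find { |row| row.include?(9999) }

p with_bare_grep # [1859, 1]
p with_include   # [1830, 1]
p no_match       # nil
```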

Using self.dup, but failing rspec test to not modify original array

I'm creating a method to transpose square 2-d arrays. My method passes every test, except the "does not modify original array" one. I'm only working on the duped array, so I'm confused on why the test is failing.
Code:
class Array
def my_transpose
orig_arr = self.dup; array = []
orig_arr[0].length.times do
temp_arr = []
orig_arr.each { |arr| temp_arr << arr.shift }
array << temp_arr
end
array
end
end
RSpec:
describe Array do
describe "#my_transpose" do
let(:arr) { [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
] }
let(:small_arr) { [
[1, 2],
[3, 4]
] }
it "transposes a small matrix" do
expect(small_arr.my_transpose).to eq([
[1, 3],
[2, 4]
])
end
it "transposes a larger matrix" do
expect(arr.my_transpose).to eq([
[1, 4, 7],
[2, 5, 8],
[3, 6, 9]
])
end
it "should not modify the original array" do
small_arr.my_transpose
expect(small_arr).to eq([
[1, 2],
[3, 4]
])
end
it "should not call the built-in #transpose method" do
expect(arr).not_to receive(:transpose)
arr.my_transpose
end
end
end
Output:
7) Array#my_transpose should not modify the original array
Failure/Error: expect(small_arr).to eq([
expected: [[1, 2], [3, 4]]
got: [[], []]
(compared using ==)
# ./spec/00_array_extensions_spec.rb:123:in `block (3 levels) in <top (required)>'
When you call dup on an array, it only duplicates the array itself; the array's contents are not also duplicated. So, for example:
a = [[1,2],[3,4]]
b = a.dup
a.object_id == b.object_id # => false
a[0].object_id == b[0].object_id # => true
Thus, modifications to a itself are not reflected in b (and vice versa), but modifications in the elements of a are reflected in b, because those elements are the same objects.
That being the case, the problem crops up here:
orig_arr.each { |arr| temp_arr << arr.shift }
arr is an element of orig_arr, but it is also an element of self. If you did something like remove it from orig_arr, you would not also remove it from self, but if you change it, it's changed, no matter how you are accessing it, and as it turns out, Array#shift is a destructive operation.
Probably the smallest change you could make to your code to make it work as you expect would be to take the column index that times yields (orig_arr[0].length.times do |i|) and use it to index into each arr, rather than calling arr.shift, so:
orig_arr.each { |arr| temp_arr << arr[i] }
In fact, though, once you're doing that, you're not doing any destructive operations at all and you don't need orig_arr, you can just use self.
The original array isn’t being modified, but the arrays within it are, as dup is a shallow clone.
xs = [[1,2],[3,4]]
ids = xs.map(&:object_id)
xs.my_transpose
ids == xs.map(&:object_id) #=> true
Since shift is a mutating operation (being performed on the nested array elements), you need to dup the elements within the array as well, e.g.
orig_arr = dup.map(&:dup)
With this modification, your test should pass.
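Putting the two answers together, here is a minimal sketch of a transpose that never mutates its input, written as a standalone method rather than a patch on Array. The name transpose_without_mutation is hypothetical, and the sketch assumes a rectangular array of arrays.

```ruby
# Hypothetical non-mutating transpose: reads rows by index instead of
# shifting elements off them, so neither the outer nor the inner arrays change.
def transpose_without_mutation(matrix)
  return [] if matrix.empty?

  Array.new(matrix.first.length) do |col|
    matrix.map { |row| row[col] }
  end
end

small = [[1, 2], [3, 4]]
result = transpose_without_mutation(small)

p result # [[1, 3], [2, 4]]
p small  # [[1, 2], [3, 4]]
```

Because only index reads are used, no dup is needed at all, which is the point made at the end of the first answer.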

Why does each_cons yield arrays instead of multiple values?

I just wanted to apply a binary operation to consecutive elements in an array, e.g.:
[1, 2, 3, 4].each_cons(2).map { |a, b| a.quo(b) }
#=> [(1/2), (2/3), (3/4)]
This is a contrived example, the operation doesn't really matter.
I was surprised that I couldn't just write:
[1, 2, 3, 4].each_cons(2).map(&:quo)
#=> NoMethodError: undefined method `quo' for [1, 2]:Array
This is because each_cons doesn't yield multiple values, but an array containing the values.
It works like this:
def each_cons_arrays
return enum_for(__method__) unless block_given?
yield [1, 2]
yield [2, 3]
yield [3, 4]
end
each_cons_arrays.map(&:quo)
#=> NoMethodError: undefined method `quo' for [1, 2]:Array
And I was hoping for:
def each_cons_values
return enum_for(__method__) unless block_given?
yield 1, 2
yield 2, 3
yield 3, 4
end
each_cons_values.map(&:quo)
#=> [(1/2), (2/3), (3/4)]
What's the rationale behind this? Why could it be preferable to always have an array?
And by the way, with_index on the other hand does yield multiple values:
[1, 1, 1].each.with_index(2).map(&:quo)
#=> [(1/2), (1/3), (1/4)]
From my experience, it helps to think of multiple values in Ruby as an array.
I think of it as
[1,2,3].each_cons(2) do |iter|
a,b = iter
do_stuff(a,b)
end
If you want to do it like that, I'd add the quo method to a custom class, and do
class Foobar
def initialize(a,b)
@a = a
@b = b
end
def quo
do_stuff
end
end
[1,2,3]
.each_cons(2)
.map { |a,b| Foobar.new(a,b) }
.map(&:quo)
Would that work for your usecase?
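If a wrapper class feels heavy, a lighter option is to keep each_cons as it is and let the block do the unpacking. The sketch below shows that each_cons yields one array per step, and two ways to apply a binary method to it; using reduce with a symbol is one possible substitute for the &:quo shorthand, not something the question or answer proposed.

```ruby
numbers = [1, 2, 3, 4]

# each_cons yields a single array per step.
pairs = numbers.each_cons(2).to_a

# Block parameters decompose the yielded array automatically.
by_destructuring = numbers.each_cons(2).map { |a, b| a.quo(b) }

# reduce applies the binary method across the pair: pair[0].quo(pair[1]).
by_reduce = numbers.each_cons(2).map { |pair| pair.reduce(:quo) }

p pairs            # [[1, 2], [2, 3], [3, 4]]
p by_destructuring # [(1/2), (2/3), (3/4)]
p by_reduce        # [(1/2), (2/3), (3/4)]
```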

Resources