Ruby group array of hashes by numeric key

I have an array of hashes like this one:
[{1=>6}, {1=>5}, {4=>1}]
I am trying to group by the keys.
With named keys, the solution was: group_by { |h| h['keyName'] }.
How can I get the following array with a short lambda expression or with group_by?
[{1=>[5, 6], 4=>[1]}]
EDIT - To explain what I am trying to achieve:
I got a database to allocate pupils to courses.
Each pupil is able to vote each year for a course.
The votes look like this:
Vote(id: integer, first: integer, second: integer, active: boolean,
student_id: integer, course_id: integer, year_id: integer,
created_at: datetime, updated_at: datetime)
Now I would like to allocate the pupils automatically to a course, if the course is not overstaffed. To find out how many pupils voted for each course I first tried this:
Year.get_active_year.votes.order(:first).map(&:first).group_by(&:itself)
the result looks like this:
{1=>[1, 1], 4=>[4]}
Now I am able to use the .each function:
Year.get_active_year.votes.order(:first).map(&:first).group_by(&:itself).each do |_key, value|
if Year.get_active_year.courses.where(number: _key).first.max_visitor >= value.count
end
end
Each course has an explicit number, and the pupils just use the course number to vote.
But if I do all this, I lose the information which pupil voted for that course, so I tried to keep the information like this:
Year.get_active_year.votes.order(:first).map{|c| {c.first=> c.student_id}}

Injecting into a default hash:
arr = [{1=>6}, {1=>5}, {4=>1}]
arr.inject(Hash.new{|h,k| h[k]=[]}){|h, e| h[e.first[0]] << e.first[1]; h}
# => {1=>[6, 5], 4=>[1]}
Or, as suggested in the comments:
arr.each.with_object(Hash.new{|h, k| h[k] = []}){|e, h| h[e.first[0]] << e.first[1]}
# => {1=>[6, 5], 4=>[1]}

def group_values(arr)
arr.reduce(Hash.new {|h,k| h[k]=[]}) do |memo, h|
h.each { |k, v| memo[k] << v }
memo
end
end
xs = [{1=>6}, {1=>5}, {4=>1}]
group_values(xs) # => {1=>[6, 5], 4=>[1]}
Note that this solution also works when the hashes contain multiple entries:
ys = [{1=>6, 4=>2}, {1=>5}, {4=>1}]
group_values(ys) # => {1=>[6, 5], 4=>[2, 1]}
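Applied to the voting scenario from the question, the same grouping can keep the student ids per course. This is only a sketch: the Vote struct below is a hypothetical stand-in for the ActiveRecord model, and course_number stands for the question's first column.

```ruby
# Hypothetical stand-in for the ActiveRecord Vote model; only the two
# fields needed here are modelled (course_number plays the role of `first`).
Vote = Struct.new(:course_number, :student_id)

votes = [Vote.new(1, 6), Vote.new(1, 5), Vote.new(4, 1)]

# Group the votes by course number, then keep only the student ids.
students_by_course = votes
  .group_by(&:course_number)
  .transform_values { |vs| vs.map(&:student_id) }

students_by_course # => {1=>[6, 5], 4=>[1]}
```

With the real model, value.count from the question would become the size of each array of student ids.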

arr = [{1=>6}, {1=>5}, {4=>1}]
arr.flat_map(&:to_a).
group_by(&:first).
transform_values { |arr| arr.transpose.last }
#=> {1=>[6, 5], 4=>[1]}
The steps are as follows.
a = arr.flat_map(&:to_a)
#=> [[1, 6], [1, 5], [4, 1]]
b = a.group_by(&:first)
#=> {1=>[[1, 6], [1, 5]], 4=>[[4, 1]]}
b.transform_values { |arr| arr.transpose.last }
#=> {1=>[6, 5], 4=>[1]}
Note that
b.transform_values { |arr| arr.transpose }
#=> {1=>[[1, 1], [6, 5]], 4=>[[4], [1]]}
and arr.flat_map(&:to_a) can be replaced with arr.map(&:flatten), provided each hash holds exactly one key-value pair.
Another way:
arr.each_with_object({}) do |g,h|
k,v = g.flatten
h.update(k=>[v]) { |_,o,n| o+n }
end
#=> {1=>[6, 5], 4=>[1]}
This uses the form of Hash#update (aka merge!) that employs the block { |_,o,n| o+n } to determine the values of keys that are present in both hashes being merged. The block variable _ is the common key (represented by an underscore to signal that it is not used in the block calculations). The variables o and n are respectively the values of the common key in the two hashes being merged.
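A minimal sketch of that block form on two small hashes (the variable names are made up for illustration):

```ruby
totals   = { 1 => [6] }
incoming = { 1 => [5], 4 => [1] }

# For a key present in both hashes the block receives the key, the old value
# and the new value, and its return value is stored. Keys that exist only in
# `incoming` are copied over without calling the block.
totals.update(incoming) { |_key, old, new| old + new }

totals # => {1=>[6, 5], 4=>[1]}
```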

One way to achieve this using #group_by could be to group by the first key of each hash, then #map over the result to return the corresponding values:
arr = [{1=>6}, {1=>5}, {4=>1}]
arr.group_by {|h| h.keys.first}.map {|k, v| {k => v.map {|h| h.values.first}}}
# => [{1=>[6, 5]}, {4=>[1]}]
Hope this helps!
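Hash#map returns one array element per key. If a single hash is wanted instead, one option is to swap map for transform_values. This is a sketch that assumes Ruby 2.4 or later and exactly one pair per input hash:

```ruby
arr = [{1=>6}, {1=>5}, {4=>1}]

grouped = arr
  .group_by { |h| h.keys.first }                                   # key of each one-pair hash
  .transform_values { |hashes| hashes.map { |h| h.values.first } } # keep only the values

grouped # => {1=>[6, 5], 4=>[1]}
```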

Related

Turning an array into a hash

I have a ruby array of hashes:
my_array = [{"apples" => 5}, {"oranges" => 12}]
I would like to turn it into a hash, where the hash keys are equal to the array index values +1, so like this:
my_hash = {"1"=>{"apples"=> 5}, "2"=>{"oranges" => 12}}
Any ideas?
You can also use Enumerable#zip with a range, then convert the result with Array#to_h:
(1..my_array.size).zip(my_array).to_h
#=> {1=>{"apples"=>5}, 2=>{"oranges"=>12}}
How it works
my_array.size #=> 2 returns the size of the Array as an Integer.
(1..my_array.size) is an inclusive Range which enumerates the integers from 1 to the array size, 2 in this case.
A Range responds to Enumerable#zip, so you can, for example, obtain an Array of pairs like this:
(1..3).zip(:a..:c) #=> [[1, :a], [2, :b], [3, :c]]
Finally, an Array of pairs can be converted into a Hash; see Array#to_h:
[[1, :a], [2, :b], [3, :c]].to_h #=> {1=>:a, 2=>:b, 3=>:c}
Since the Range is made of integers, the keys of the Hash are integers. But you can tweak the line of code to obtain strings as keys.
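One possible tweak, as a sketch, is to convert each index to a String before zipping:

```ruby
my_array = [{"apples" => 5}, {"oranges" => 12}]

# map(&:to_s) turns 1, 2 into "1", "2" so the resulting keys are strings.
my_hash = (1..my_array.size).map(&:to_s).zip(my_array).to_h

my_hash # => {"1"=>{"apples"=>5}, "2"=>{"oranges"=>12}}
```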
my_array = [{"apples" => 5}, {"oranges" => 12}]
my_hash = my_array.each_with_index.map{|h, i| [(i+1).to_s, h]}.to_h
You can try Enumerator#with_index to get the index and Enumerator#with_object to build the new hash:
my_array = [{"apples"=> 5}, {"oranges" => 12}]
my_hash = my_array.each.with_index.with_object({}){|(hsh, i), e| e[(i+1).to_s] = hsh}
# => {"1"=>{"apples"=> 5}, "2"=>{"oranges" => 12}}

How do I group and add values from nested hashes and arrays with same key?

I am trying to get the sum of points and average grade for each student inside this combination of hashes and arrays but all my attempts only return the general sum for all entries. Any ideas?
student_data =
{"ST4"=>[{:student_id=>"ST4", :points=> 5, :grade=>5},
{:student_id=>"ST4", :points=>10, :grade=>4},
{:student_id=>"ST4", :points=>20, :grade=>5}],
"ST1"=>[{:student_id=>"ST1", :points=>10, :grade=>3},
{:student_id=>"ST1", :points=>30, :grade=>4},
{:student_id=>"ST1", :points=>45, :grade=>2}],
"ST2"=>[{:student_id=>"ST2", :points=>25, :grade=>5},
{:student_id=>"ST2", :points=>15, :grade=>1},
{:student_id=>"ST2", :points=>35, :grade=>3}],
"ST3"=>[{:student_id=>"ST3", :points=> 5, :grade=>5},
{:student_id=>"ST3", :points=>50, :grade=>2}]}
The desired hash can be obtained thusly.
student_data.transform_values do |arr|
points, grades = arr.map { |h| h.values_at(:points, :grade) }.transpose
{ :points=>points.sum, :grades=>grades.sum.fdiv(grades.size) }
end
#=> {"ST4"=>{:points=>35, :grades=>4.666666666666667},
# "ST1"=>{:points=>85, :grades=>3.0},
# "ST2"=>{:points=>75, :grades=>3.0},
# "ST3"=>{:points=>55, :grades=>3.5}}
The first value passed to the block is the value of the first key, 'ST4' and the block variable arr is assigned that value:
a = student_data.first
#=> ["ST4",
# [{:student_id=>"ST4", :points=> 5, :grade=>5},
# {:student_id=>"ST4", :points=>10, :grade=>4},
# {:student_id=>"ST4", :points=>20, :grade=>5}]
# ]
arr = a.last
#=> [{:student_id=>"ST4", :points=> 5, :grade=>5},
# {:student_id=>"ST4", :points=>10, :grade=>4},
# {:student_id=>"ST4", :points=>20, :grade=>5}]
The block calculations are as follows. The first value of arr passed by map to the inner block is
h = arr.first
#=> {:student_id=>"ST4", :points=>5, :grade=>5}
h.values_at(:points, :grade)
#=> [5, 5]
After the remaining two elements of arr are passed to the block we have
b = arr.map { |h| h.values_at(:points, :grade) }
#=> [[5, 5], [10, 4], [20, 5]]
Then
points, grades = b.transpose
#=> [[5, 10, 20], [5, 4, 5]]
points
#=> [5, 10, 20]
grades
#=> [5, 4, 5]
We now simply form the hash that is the value of 'ST4'.
c = points.sum
#=> 35
d = grades.sum
#=> 14
e = grades.size
#=> 3
f = d.fdiv(e)
#=> 4.666666666666667
The value of 'ST4' in student_data therefore maps to the hash
{ :points=>c, :grades=>f }
#=> {:points=>35, :grades=>4.666666666666667}
The mappings of the remaining keys of student_data are computed similarly.
See Hash#transform_values, Enumerable#map, Hash#values_at, Array#transpose, Array#sum and Integer#fdiv.
What you expect can be achieved as below:
student_data.values.map do |z|
z.group_by { |x| x[:student_id] }.transform_values do |v|
{
points: v.map { |x| x[:points] }.sum, # sum of points
grade: (v.map { |x| x[:grade] }.sum/v.count.to_f).round(2) # average of grades
}
end
end
Since the exact expected output format is not specified, this returns the following:
=> [
{"ST4"=>{:points=>35, :grade=>4.67}},
{"ST1"=>{:points=>85, :grade=>3.0}},
{"ST2"=>{:points=>75, :grade=>3.0}},
{"ST3"=>{:points=>55, :grade=>3.5}}
]
For Ruby 2.6, using Object#then (or Object#yield_self for Ruby 2.5):
student_data.transform_values { |st| st
.each_with_object(Hash.new(0)) { |h, hh| hh[:sum_points] += h[:points]; hh[:sum_grade] += h[:grade]; hh[:count] += 1.0 }
.then{ |hh| {tot_points: hh[:sum_points], avg_grade: hh[:sum_grade]/hh[:count] } }
}
How does it work?
Given the array for each student:
st = [{:student_id=>"ST4", :points=> 5, :grade=>5}, {:student_id=>"ST4", :points=>10, :grade=>4}, {:student_id=>"ST4", :points=>20, :grade=>5}]
First build a hash that adds and counts using Enumerable#each_with_object, with the Hash default set to zero (Hash.new(0)):
step1 = st.each_with_object(Hash.new(0)) { |h, hh| hh[:sum_points] += h[:points]; hh[:sum_grade] += h[:grade]; hh[:count] += 1.0 }
#=> {:sum_points=>35, :sum_grade=>14, :count=>3.0}
Then use then (yield_self for Ruby 2.5):
step2 = step1.then{ |hh| {tot_points: hh[:sum_points], avg_grade: hh[:sum_grade]/hh[:count] }}
#=> {:tot_points=>35, :avg_grade=>4.666666666666667}
Put it all together using Hash#transform_values, as in the first snippet of code.
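A shorter variant, offered only as a sketch, uses Array#sum with a block (Ruby 2.4 or later). The data is a trimmed copy of the question's hash, and the key name avg_grade is an arbitrary choice:

```ruby
student_data = {
  "ST4" => [{ student_id: "ST4", points: 5,  grade: 5 },
            { student_id: "ST4", points: 10, grade: 4 },
            { student_id: "ST4", points: 20, grade: 5 }],
  "ST3" => [{ student_id: "ST3", points: 5,  grade: 5 },
            { student_id: "ST3", points: 50, grade: 2 }]
}

summary = student_data.transform_values do |rows|
  {
    points:    rows.sum { |h| h[:points] },                  # total points
    avg_grade: rows.sum { |h| h[:grade] }.fdiv(rows.size)    # float average
  }
end

summary["ST3"] # => {:points=>55, :avg_grade=>3.5}
```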

Sort an array of arrays by number of occurrences in Ruby

This question is different from this one.
I have an array of arrays of AR items looking something like:
[[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]
I would like to sort it by the number of occurrences of each sub-array:
[[7,8,9], [1,2,3], [4,5,6]]
My real data are more complex, looking something like:
raw_data = {}
raw_data[:grapers] = []
suggested_data = {}
suggested_data[:grapers] = []
varietals = []
similar_vintage.varietals.each do |varietal|
# sub_array
varietals << Graper.new(:name => varietal.grape.name, :grape_id => varietal.grape_id, :percent => varietal.percent)
end
raw_data[:grapers] << varietals
So, I want to sort raw_data[:grapers] by the number of occurrences of each varietals array, comparing the grape_id values inside them.
When I need to sort a plain array of data by maximum occurrences, I do this:
grapers_with_frequency = raw_data[:grapers].inject(Hash.new(0)) { |h,v| h[v] += 1; h }
suggested_data[:grapers] << raw_data[:grapers].max_by { |v| grapers_with_frequency[v] }
This code doesn't work because there are sub-arrays here, including AR models that I need to analyze.
Possible solution:
array.group_by(&:itself) # grouping
.sort_by {|k, v| -v.size } # sorting
.map(&:first) # optional step, depends on your real data
#=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
I recommend you take a look at the Ruby documentation for the sort_by method. It allows you to sort an array using anything associated with the elements, rather than the values of the elements.
my_array.sort_by { |elem| -my_array.count(elem) }.uniq
=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
This example sorts by the count of each element in the original array. This is preceded with a minus so that the elements with the highest count are first. The uniq is to only have one instance of each element in the final result.
You can include anything you like in the sort_by block.
As Ilya has pointed out, having my_array.count(elem) in each iteration will be costlier than using group_by beforehand. This may or may not be an issue for you.
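If Ruby 2.7 or later is available (an assumption, since the thread does not state a version), Enumerable#tally counts every element in a single pass and avoids the repeated count calls. A sketch:

```ruby
my_array = [[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]

# tally builds {element => count}; sorting by the negated count puts the
# most frequent element first, and map(&:first) drops the counts again.
by_frequency = my_array.tally.sort_by { |_elem, count| -count }.map(&:first)

by_frequency # => [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
```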
arr = [[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]
arr.each_with_object(Hash.new(0)) { |a,h| h[a] += 1 }.
sort_by(&:last).
reverse.
map(&:first)
#=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
This uses the form of Hash::new that takes an argument (here 0) that is the hash's default value.
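A small sketch of how that default value behaves:

```ruby
counts = Hash.new(0)

counts[:a] += 1        # reads the default 0, then stores 1 under :a
counts[:a]             # => 1
counts[:missing]       # => 0 (the default is returned, nothing is stored)
counts.key?(:missing)  # => false
```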

When converting an array of arrays to a hash in Ruby, the hash does not include all the pairs but keeps only the last one per key

This is an array which I want to convert into a hash:
a = [[1, 3], [3, 2], [1, 2]]
but the hash I am getting is
2.2.0 :004 > a.to_h
=> {1=>2, 3=>2}
Why is that?
Hashes have unique keys. Array#to_h is effectively doing the following:
h = {}.merge(1=>3).merge(3=>2).merge(1=>2)
#=> { 1=>3 }.merge(3=>2).merge(1=>2)
#=> { 1=>3, 3=>2 }.merge(1=>2)
#=> { 1=>2, 3=>2 }
In the last merge the value of the key 1 (3) is replaced with 2.
Note that
h.merge(k=>v)
is (permitted) shorthand for
h.merge({ k=>v })
The keys of a Hash are basically a Set, so no duplicate keys are allowed.
If two pairs are present in your Array with the same first element, only the last pair will be kept in the Hash.
If you want to keep the whole information, you could define arrays as values :
a = [[1, 3], [3, 2], [1, 2]]
hash = Hash.new{|h,k| h[k] = []}
p a.each_with_object(hash) { |(k, v), h| h[k] << v }
#=> {1=>[3, 2], 3=>[2]}
Here's a shorter but less common way to define it :
hash = a.each_with_object(Hash.new{[]}) { |(k, v), h| h[k] <<= v }
Calling hash[1] returns [3,2], which are all the second elements from the pairs of your array having 1 as first element.
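The assignment form matters here. A sketch comparing plain << with <<= on a Hash.new { [] } shows why:

```ruby
a = [[1, 3], [3, 2], [1, 2]]

# The default block returns a fresh array but never stores it, so a plain <<
# appends to a throwaway array and the hash stays empty.
lost = a.each_with_object(Hash.new { [] }) { |(k, v), h| h[k] << v }

# h[k] <<= v expands to h[k] = h[k] << v, which stores the array under k.
kept = a.each_with_object(Hash.new { [] }) { |(k, v), h| h[k] <<= v }

lost # => {}
kept # => {1=>[3, 2], 3=>[2]}
```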

Convert a hash into an array

I am trying to create a method that will take a hash:
{"H"=> 1, "e"=> 1, "l"=> 3, "o"=> 2, "W"=> 1, "r"=> 1, "d"=> 1}
as a parameter and return an array of its key-value pairs like such:
arr = [["H", 1], ["e", 1], ..., ["d", 1]]
I have the following, but it is flawed:
def toCountsArray(counts)
arr = []
i = 0
counts.each do |key, value|
arr[i].push [key, value]
i += 1
end
return arr
end
I am not supposed to use the to_a method or any kind of helper like that. Any help or guidance is appreciated.
You're basically there. The arbitrary restriction on to_a is odd, since there are many ways to get effectively the same thing. Still, to fix your original example:
array = [ ]
counts.each do |pair|
array << pair
end
That's a messy way of doing to_a, but it should work. Your mistake was trying to append to a specific element of array, not append to the array itself.
A pattern to use when doing this sort of operation is this:
counts = Hash.new(0)
That creates a Hash with a default value of 0 for each element. This avoids the dance you have to do to assign to an undefined key.
There are a few other things you can do to reduce this and make it more Ruby-like:
def count_chars(string)
string.chars.each_with_object(Hash.new(0)) do |char, counts|
case (char)
when ' '
# Ignored
else
counts[char] += 1
end
end
end
The each_with_object method is handy in that it iterates over an array while passing through an object that each iteration can make use of. Combining the trick of having a Hash with a default value makes this pretty tidy.
If you have a longer list of "to ignore" characters, express that as an array. string.chars - exclusions can then delete the unwanted ones. I've used a case statement here to make adding special behaviour easier.
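A sketch of that exclusion-list variant follows. The method name count_chars_excluding and the default list are hypothetical, chosen only for illustration:

```ruby
# Hypothetical helper: counts characters, skipping any listed in `exclusions`.
def count_chars_excluding(string, exclusions = [' ', ',', '!'])
  # Array#- removes every occurrence of each excluded character.
  (string.chars - exclusions).each_with_object(Hash.new(0)) do |char, counts|
    counts[char] += 1
  end
end

count_chars_excluding("Hello, World!")
# => {"H"=>1, "e"=>1, "l"=>3, "o"=>2, "W"=>1, "r"=>1, "d"=>1}
```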
hash = { "H"=> 1, "e"=> 1, "l"=> 3, "o"=> 2, "W"=> 1, "r"=> 1, "d"=> 1 }
p [*hash]
# => [["H", 1], ["e", 1], ["l", 3], ["o", 2], ["W", 1], ["r", 1], ["d", 1]]
instead of
arr[i].push [key, value]
use
arr.push [key, value]
because arr[i] refers to the i-th element
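On an empty array arr[i] is nil, so arr[i].push raises NoMethodError. A sketch of the original method with only that line changed (and renamed to snake_case, which is an editorial choice):

```ruby
def to_counts_array(counts)
  arr = []
  counts.each do |key, value|
    # Push the pair onto the array itself rather than onto arr[i].
    arr.push [key, value]
  end
  arr
end

to_counts_array({ "H" => 1, "e" => 1 }) # => [["H", 1], ["e", 1]]
```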
I would do something like this:
hash = { "H"=> 1, "e"=> 1, "l"=> 3, "o"=> 2, "W"=> 1, "r"=> 1, "d"=> 1 }
hash.each_with_object([]) { |kv, a| a << kv }
#=> [["H",1],["e",1],["l",3],["o",2],["W",1],["r",1],["d",1]]
You can do this:
def to_counts_array(counts)
counts.map { |k, v| [k, v] }
end
h = { "H"=> 1, "e"=> 1, "l"=> 3, "o"=> 2, "W"=> 1, "r"=> 1, "d"=> 1 }
to_counts_array(h)
Although I like @steenslag's answer as well.
Another way, just map to self:
x.map &:itself #=> [["H", 1], ["e", 1], ["l", 3], ["o", 2], ["W", 1], ["r", 1], ["d", 1]]
