Manipulate array of hashes into grouped hashes with arrays

I have an array of hashes:
[
{
"June" => { "A" => { 3 => 48.4 } }
},
{
"January" => { "C" => { 2 => 88.0} }
},
{
"January"=> { "B" => { 2 => 44.0} }
},
{
"January"=> { "C" => { 4 => 48.8} }
}
]
I need to group each similar hash key into an array of the subsequent values like the following:
{
"June" => [{ "A" => [{ 3 => 48.4 }] }],
"January" => [
{ "B" => [{ 2 => 44.0 }] },
{ "C" => [{ 2 => 88.0 }, { 4 => 48.8 }] }
]
}
I am looking for an efficient method of grouping these elements. Can anyone help me master this hash of hashes?
I am trying to avoid looping through the base array and grouping manually. I was hoping that map (or some other enumerable method) might give what I want. When I used reduce(Hash.new, :merge), it came close but it used the last hash for each month key instead of adding it to an array.

Note: I added the following after gaining a clearer understanding of the question. My original answer is below.
Here is the OP's array of hashes, modified slightly.
arr = [{ "June" =>{ "A"=>{ 3=>48.4 } } },
{ "January"=>{ "C"=>{ 2=>88.0 } } },
{ "January"=>{ "B"=>{ "D"=>{ 2=>44.0 } } } },
{ "January"=>{ "C"=>{ 2=>10.0 } } },
{ "January"=>{ "C"=>{ 4=>48.8 } } }]
The hash to be constructed appears to be the following.
{ "June" =>[{ "A"=>[{ 3=>48.4 }] }],
"January"=>[{ "B"=>[{ "D"=>[{ 2=>44.0 }] }],
"C"=>[{ 2=>98.0, 4=>48.8 }] }] }
Note that 88.0 + 10.0 #=> 98.0 in 2=>98.0.
Observe that all the arrays within that hash contain a single element, a hash. That being the case, those arrays serve no useful purpose. I therefore suggest that the following hash be constructed instead:
{ "June" =>{ "A"=>{ 3=>48.4 } },
"January"=>{ "B"=>{ "D"=>{ 2=>44.0 } },
"C"=>{ 2=>98.0, 4=>48.8 } } }
This can be produced with the following recursive method.
def recurse(arr)
arr.map(&:flatten).
group_by(&:first).
each_with_object({}) do |(k,v),h|
o = v.map(&:last)
h.update(k=>o.first.is_a?(Hash) ? recurse(o) : o.sum )
end
end
recurse(arr)
#=> {"June"=>{"A"=>{3=>48.4}},
# "January"=>{"C"=>{2=>98.0, 4=>48.8}, "B"=>{"D"=>{2=>44.0}}}}
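Since the question mentions reduce(Hash.new, :merge), here is a minimal sketch that reaches the same result with Hash#merge and a block, which Ruby calls only for keys present in both hashes. The method name deep_sum_merge is hypothetical, and the sketch assumes that two values under the same key path are either both hashes or both numbers.

```ruby
# Hypothetical helper (not from the answer above): merges two nested hashes,
# recursing into hashes that share a key and adding numbers that share a key.
def deep_sum_merge(left, right)
  left.merge(right) do |_key, l, r|
    l.is_a?(Hash) && r.is_a?(Hash) ? deep_sum_merge(l, r) : l + r
  end
end

arr = [{ "June"    => { "A" => { 3 => 48.4 } } },
       { "January" => { "C" => { 2 => 88.0 } } },
       { "January" => { "B" => { "D" => { 2 => 44.0 } } } },
       { "January" => { "C" => { 2 => 10.0 } } },
       { "January" => { "C" => { 4 => 48.8 } } }]

# Fold the array into one hash, summing leaves that collide.
merged = arr.reduce({}) { |acc, h| deep_sum_merge(acc, h) }
```

Unlike a plain merge, the block decides what happens on a key collision, so the later hash no longer replaces the earlier one.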
(Original answer follows)
Here are two ways to obtain the desired hash. I assume that arr is your array of hashes.
#1 Use the form of Hash::new that takes a block
arr.each_with_object(Hash.new { |h,k| h[k] = [] }) do |g,h|
k, v = g.to_a.first
h[k] << v
end
# => {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{4=>48.8}}]}
#2 Use Enumerable#group_by
arr.map(&:first).
group_by(&:first).
tap { |h| h.keys.each { |k| h[k] = h[k].map(&:last) } }
# => {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{4=>48.8}}]}
The steps are as follows.
a = arr.map(&:first)
#=> [["June", {"A"=>{3=>48.4}}], ["January", {"C"=>{2=>88.0}}],
# ["January", {"B"=>{2=>44.0}}], ["January", {"C"=>{4=>48.8}}]]
b = a.group_by(&:first)
#=> {"June"=>[["June", {"A"=>{3=>48.4}}]],
# "January"=>[["January", {"C"=>{2=>88.0}}], ["January", {"B"=>{2=>44.0}}],
# ["January", {"C"=>{4=>48.8}}]]}
c = b.tap { |h| h.keys.each { |k| h[k] = h[k].map(&:last) } }
#=> {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{4=>48.8}}]}
Let me elaborate the last step. Inside tap's block, we compute the following.
h = b
d = h.keys
#=> ["June", "January"]
The first element of d is passed to each's block and the block variable is assigned to that element.
k = d.first
#=> "June"
The block calculation is as follows.
e = h[k]
#=> [["June", {"A"=>{3=>48.4}}]]
f = e.map(&:last)
#=> [{"A"=>{3=>48.4}}]
h[k] = f
#=> [{"A"=>{3=>48.4}}]
b #=> {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[["January", {"C"=>{2=>88.0}}],
# ["January", {"B"=>{2=>44.0}}],
# ["January", {"C"=>{4=>48.8}}]]}
Next, d[1] ("January") is passed to each's block and similar calculations are performed.
Rather than using Object#tap I could have written
h = arr.map(&:first).
group_by(&:first)
h.keys.each { |k| h[k] = h[k].map(&:last) }
h
tap merely avoids the creation of local variable h and the need to have a final line equal to h.
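On Ruby 2.4 or later, Hash#transform_values can stand in for the tap step. The following is a sketch of that variant rather than part of the original answer, and the method name group_values_by_key is made up for illustration.

```ruby
# Hypothetical helper: groups single-key hashes by their key and keeps only
# the values, using Hash#transform_values (Ruby 2.4+) instead of Object#tap.
def group_values_by_key(arr)
  arr.map(&:first)                                    # [[key, value], ...]
     .group_by(&:first)                               # key => [[key, value], ...]
     .transform_values { |pairs| pairs.map(&:last) }  # key => [value, ...]
end

arr = [{ "June"    => { "A" => { 3 => 48.4 } } },
       { "January" => { "C" => { 2 => 88.0 } } },
       { "January" => { "B" => { 2 => 44.0 } } },
       { "January" => { "C" => { 4 => 48.8 } } }]

grouped = group_values_by_key(arr)
```

transform_values returns a new hash, so nothing is modified in place and no trailing h line is needed.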

Related

In ruby how do you rearrange an array of objects with id keys, give a new order of the array as an array?

I have an object with an array that looks like this:
some_object = {
some_array: [
{ id: "foo0" },
{ id: "foo1" },
{ id: "foo2" },
{ id: "foo3" },
]
}
And I have another array as input that gives the order I want to rearrange it into:
target_order = [
{ id: "foo0", new_position: 3 },
{ id: "foo3", new_position: 0 },
{ id: "foo1", new_position: 2 },
{ id: "foo2", new_position: 1 }
]
How do I go about using the second target_order array to modify the order of the first some_object[:some_array]?
I recommend you use sort_by with a custom block that finds the position of the item in the new array.
new_array = some_object[:some_array].sort_by do |item|
order = target_order.detect { |order| order[:id] == item[:id] }
next unless order
order[:new_position]
end
This returns the following value.
=> [{:id=>"foo3"}, {:id=>"foo2"}, {:id=>"foo1"}, {:id=>"foo0"}]
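With larger arrays, the detect call inside the block scans target_order once per item. A possible variation, sketched here under the assumption that every id appears in target_order, builds a lookup hash first. The name sort_by_target is hypothetical.

```ruby
# Hypothetical helper: sorts items by the position recorded for their id.
# Assumes every item id is present in target_order (fetch raises KeyError otherwise).
def sort_by_target(items, target_order)
  positions = target_order.map { |o| [o[:id], o[:new_position]] }.to_h
  items.sort_by { |item| positions.fetch(item[:id]) }
end

some_object = {
  some_array: [{ id: "foo0" }, { id: "foo1" }, { id: "foo2" }, { id: "foo3" }]
}
target_order = [
  { id: "foo0", new_position: 3 },
  { id: "foo3", new_position: 0 },
  { id: "foo1", new_position: 2 },
  { id: "foo2", new_position: 1 }
]

new_array = sort_by_target(some_object[:some_array], target_order)
# ids come out in the order foo3, foo2, foo1, foo0
```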
Further considerations
Perhaps you wanted to give each item a position in a list instead of just sorting them. For instance
target_order = [
{ id: "foo0", new_position: 0 },
{ id: "foo1", new_position: 2 }
]
would give
=> [{ id: "foo0" }, nil, { id: "foo1" }]
To do this, you should use each_with_object instead of sort_by.
new_array = target_order.each_with_object([]) do |order, memo|
item = some_object[:some_array].detect { |item| item[:id] == order[:id] }
next unless item
memo[order[:new_position]] = item
end
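A self-contained run of this approach with the shorter target_order shows the gap being filled with nil, because Ruby pads an array with nil when an index beyond its end is assigned. The method name place_by_position is hypothetical.

```ruby
# Hypothetical wrapper around the each_with_object approach above.
def place_by_position(items, target_order)
  target_order.each_with_object([]) do |order, memo|
    item = items.detect { |candidate| candidate[:id] == order[:id] }
    next unless item                     # skip ids that have no matching item
    memo[order[:new_position]] = item    # assigning past the end pads with nil
  end
end

items = [{ id: "foo0" }, { id: "foo1" }, { id: "foo2" }, { id: "foo3" }]
target_order = [
  { id: "foo0", new_position: 0 },
  { id: "foo1", new_position: 2 }
]

placed = place_by_position(items, target_order)
# index 0 holds foo0, index 1 is nil, index 2 holds foo1
```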
Just to be simplistic, this is what I would do ...
temp_arr = []
target_order.each do |o|
x = some_object[:some_array].find { |i| o[:id] == i[:id] }
temp_arr[o[:new_position]] = x # new_position is already zero-based
end
some_object = {
some_array: temp_arr
}
If there is one to one correspondence between some_array and target_order elements, maybe you can do a direct assignment, something like:
some_object[:some_array] = target_order.sort_by{ |h| h[:new_position] }.map { |h| h.delete_if { |k, _| k == :new_position } }
So, you'll end up with
some_object #=> {:some_array=>[{:id=>"foo3"}, {:id=>"foo2"}, {:id=>"foo1"}, {:id=>"foo0"}]}
There is no need to sort, which has a time complexity of O(n*log(n)). Here is an O(n) solution.
{ some_array: target_order.each_with_object([]) { |h,a|
a[h[:new_position]] = h.slice(:id) } }
#=> {:some_array=>[{:id=>"foo3"}, {:id=>"foo2"}, {:id=>"foo1"}, {:id=>"foo0"}]}
Note that there is no reference to some_object.
If some_object is to be modified in place:
some_object[:some_array] = target_order.each_with_object([]) { |h,a|
a[h[:new_position]] = h.slice(:id) }
some_object
#=> {:some_array=>[{:id=>"foo3"}, {:id=>"foo2"}, {:id=>"foo1"}, {:id=>"foo0"}]}
Using Enumerable#sort_by, though less efficient, one could write:
{ some_array: target_order.sort_by { |h| h[:new_position] }.map { |h| h.slice(:id) } }
#=> {:some_array=>[{:id=>"foo3"}, {:id=>"foo2"}, {:id=>"foo1"}, {:id=>"foo0"}]}

Ruby: How can I convert this array into this hash?

I have an array of arrays. Each item in the array contains three strings: a leg count, an animal and a sound.
a = [ ['4', 'dog', 'woof'] , ['4', 'cow', 'moo'], ['2', 'human', 'yo'] , ['2', 'yeti', 'wrarghh'] ]
I want to turn the array into this hash:
{
'2' => [ { 'human' => 'yo' }, { 'yeti' => 'wrarghh'} ],
'4' => [ { 'dog' => 'woof' }, { 'cow' => 'moo'} ]
}
I thought reduce would be the way to go but I'm not having much luck. My current stab looks like:
a.reduce({}) do |acc, item|
acc[item.first] = [] unless acc.key? item.first
acc[item.first] << { item[1] => item[2] }
end
But it gets an error:
NoMethodError: undefined method `key?' for [{"dog"=>"woof"}]:Array
What is the best way to achieve this?
a.each_with_object({}) { |(kout, kin, val), h| (h[kout] ||= []) << { kin => val } }
#=> {"4"=>[{"dog"=>"woof"}, {"cow"=>"moo"}], "2"=>[{"human"=>"yo"}, {"yeti"=>"wrarghh"}]}
We have
enum = a.each_with_object({})
#=> #<Enumerator: [["4", "dog", "woof"], ["4", "cow", "moo"], ["2", "human", "yo"],
# ["2", "yeti", "wrarghh"]]:each_with_object({})>
The first value is generated by this enumerator and passed to the block, and the block variables are assigned values:
(kout, kin, val), h = enum.next
#=> [["4", "dog", "woof"], {}]
which is decomposed as follows.
kout
#=> "4"
kin
#=> "dog"
val
#=> "woof"
h #=> {}
The block calculation is therefore
(h[kout] ||= []) << { kin => val }
#=> (h[kout] = h[kout] || []) << { "dog" => "woof" }
#=> (h["4"] = h["4"] || []) << { "dog" => "woof" }
#=> (h["4"] = nil || []) << { "dog" => "woof" }
#=> (h["4"] = []) << { "dog" => "woof" }
#=> [] << { "dog" => "woof" }
#=> [{ "dog" => "woof" }]
h["4"] || [] #=> [] since h has no key "4" and therefore h["4"] #=> nil.
The next value of enum is passed to the block and the calculations are repeated.
(kout, kin, val), h = enum.next
#=> [["4", "cow", "moo"], {"4"=>[{"dog"=>"woof"}]}]
kout
#=> "4"
kin
#=> "cow"
val
#=> "moo"
h #=> {"4"=>[{"dog"=>"woof"}]}
(h[kout] ||= []) << { kin => val }
#=> (h[kout] = h[kout] || []) << { "cow" => "moo" }
#=> (h["4"] = h["4"] || []) << { "cow" => "moo" }
#=> (h["4"] = [{"dog"=>"woof"}] || []) << { "cow" => "moo" }
#=> (h["4"] = [{"dog"=>"woof"}]) << { "cow" => "moo" }
#=> [{"dog"=>"woof"}] << { "cow" => "moo" }
#=> [{ "dog" => "woof" }, { "cow" => "moo" }]
This time h["4"] || [] #=> [{ "dog" => "woof" }] because h now has a key "4" with a truthy value ([{ "dog" => "woof" }]).
The remaining calculations are similar.
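The key point in the walkthrough is that h[kout] ||= [] evaluates to the array stored in the hash, so << appends to that same object every time. A small sketch to check this; the method name add_sound is hypothetical.

```ruby
# Hypothetical helper wrapping the block body from the answer above.
# Returns the array stored under `legs` after appending the new pair.
def add_sound(h, legs, animal, sound)
  (h[legs] ||= []) << { animal => sound }
end

h = {}
first  = add_sound(h, "4", "dog", "woof")  # key missing: a new array is stored, then appended to
second = add_sound(h, "4", "cow", "moo")   # key present: the stored array is reused

same_array = first.equal?(second)  # true: both calls returned the array held in h["4"]
```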
Your approach nearly works, but with reduce the return value of the block (i.e., its last expression) becomes the next value of (in this case) acc, so all you need to change is:
a.reduce({}) do |acc, item|
acc[item.first] = [] unless acc.key? item.first
acc[item.first] << { item[1] => item[2] }
acc # just add this line
end
Since the return value for Array#<< is the array itself, the second iteration gave acc as the array for the first element. There are, of course, lots of ways to do this, some arguably cleaner, but I find it's useful to know where I went wrong when something I think should work doesn't.
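As one of those other ways, here is a sketch using Enumerable#group_by followed by Hash#transform_values (Ruby 2.4+). The method name sounds_by_legs is invented for this example.

```ruby
# Hypothetical helper: groups rows by leg count, then turns each remaining
# [legs, animal, sound] row into a one-pair hash { animal => sound }.
def sounds_by_legs(rows)
  rows.group_by(&:first)
      .transform_values { |group| group.map { |_legs, animal, sound| { animal => sound } } }
end

a = [['4', 'dog', 'woof'], ['4', 'cow', 'moo'], ['2', 'human', 'yo'], ['2', 'yeti', 'wrarghh']]
by_legs = sounds_by_legs(a)
```

Because neither step relies on the block's return value being an accumulator, the pitfall described above does not arise.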

How to merge two arrays of hashes

I have two arrays of hashes:
a = [
{
key: 1,
value: "foo"
},
{
key: 2,
value: "baz"
}
]
b = [
{
key: 1,
value: "bar"
},
{
key: 1000,
value: "something"
}
]
I want to merge them into one array of hashes, so essentially a + b except I want any duplicated key in b to overwrite those in a. In this case, both a and b contain a key 1 and I want the final result to have b's key value pair.
Here's the expected result:
expected = [
{
key: 1,
value: "bar"
},
{
key: 2,
value: "baz"
},
{
key: 1000,
value: "something"
}
]
I got it to work but I was wondering if there's a less wordy way of doing this:
hash_result = {}
a.each do |item|
hash_result[item[:key]] = item[:value]
end
b.each do |item|
hash_result[item[:key]] = item[:value]
end
result = []
hash_result.each do |k,v|
result << {:key => k, :value => v}
end
puts result
puts expected == result # prints true
uniq would work if you concatenate the arrays in reverse order:
(b + a).uniq { |h| h[:key] }
#=> [
# {:key=>1, :value=>"bar"},
# {:key=>1000, :value=>"something"},
# {:key=>2, :value=>"baz"}
# ]
It doesn't however preserve the order.
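If the original order matters, one option (a sketch, not taken from the answer above) is to index the hashes by :key in a Hash. A Ruby Hash keeps insertion order and leaves a key in its original position when the value is reassigned. merge_by_key is a hypothetical name.

```ruby
# Hypothetical helper: later entries replace earlier ones with the same :key,
# while each key keeps the position of its first appearance.
def merge_by_key(first, second)
  (first + second).each_with_object({}) { |h, index| index[h[:key]] = h }.values
end

a = [{ key: 1, value: "foo" }, { key: 2, value: "baz" }]
b = [{ key: 1, value: "bar" }, { key: 1000, value: "something" }]

merged = merge_by_key(a, b)
# keys come out as 1, 2, 1000, with key 1 carrying b's value "bar"
```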
[a, b].map { |arr| arr.group_by { |e| e[:key] } }
.reduce(&:merge)
.flat_map(&:last)
Here each array is grouped by h[:key], the two resulting hashes are merged so that entries from b override those from a, and the grouped values are flattened back into a single array.
I would rebuild your data a bit, since there are redundant keys in hashes:
thin_b = b.map { |h| [h[:key], h[:value]] }.to_h
#=> {1=>"bar", 1000=>"something"}
thin_a = a.map { |h| [h[:key], h[:value]] }.to_h
#=> {1=>"foo", 2=>"baz"}
Then you can use just Hash#merge:
thin_a.merge(thin_b)
#=> {1=>"bar", 2=>"baz", 1000=>"something"}
But, if you want, you can get exactly the result mentioned in the question:
thin_a.merge(thin_b).map { |k, v| { key: k, value: v } }
#=> [{:key=>1, :value=>"bar"},
# {:key=>2, :value=>"baz"},
# {:key=>1000, :value=>"something"}]
using Enumerable#group_by and Enumerable#map
(b+a).group_by { |e| e[:key] }.values.map {|arr| arr.first}
If hashes that share a key should themselves be merged, and they have more than two keys, then the following snippet should help:
[a, b].flatten
.compact
.group_by { |v| v[:key] }
.values
.map { |e| e.reduce(&:merge) }
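To illustrate what this does when the hashes carry extra keys, here is a small run. The :note key and the method name merge_records are made up for the example.

```ruby
# Hypothetical wrapper around the snippet above.
def merge_records(*arrays)
  arrays.flatten                         # one flat list of hashes
        .compact                         # drop nil entries, if any
        .group_by { |v| v[:key] }        # key => [hash from a, hash from b, ...]
        .values
        .map { |group| group.reduce(&:merge) }  # later hashes win per attribute
end

a = [{ key: 1, value: "foo", note: "kept from a" }, { key: 2, value: "baz" }]
b = [{ key: 1, value: "bar" }, { key: 1000, value: "something" }]

merged = merge_records(a, b)
# the entry for key 1 has value "bar" from b and still has :note from a
```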

How to array_wrap an hash of hashes

inv = {"C"=>{"CPS"=>{"CP"=>{"name"=>"a"}}}} is my object
I want
inv = {"C"=>{"CPS"=>{"CP"=>[{"name"=>"a"}]}}}
I tried
inv["C"]["CPS"].inject({}) do |result, (k, v)|
k = Array.wrap(v)
end
=> [{"name"=>"a"}]
but still inv={"C"=>{"CPS"=>{"CP"=>{"name"=>"a"}}}}
I tried map also.
Another option is to use tap
inv["C"]["CPS"].tap do |h|
h["CP"] = [h["CP"]] #or Array.wrap(h["CP"]) in rails
end
inv
#=> {"C"=>{"CPS"=>{"CP"=>[{"name"=>"a"}]}}}
tap will yield the current object so you can modify it in place.
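If the key path is known, the same in-place change can be sketched as a small helper built on Hash#dig (Ruby 2.3+). The name wrap_at is hypothetical, and the sketch assumes that the whole path exists in the hash.

```ruby
# Hypothetical helper: wraps the value found at the given key path in an array,
# modifying the hash in place. Values that are already arrays are left alone.
def wrap_at(hash, *path)
  *parents, last = path
  parent = parents.empty? ? hash : hash.dig(*parents)
  parent[last] = [parent[last]] unless parent[last].is_a?(Array)
  hash
end

inv = { "C" => { "CPS" => { "CP" => { "name" => "a" } } } }
wrap_at(inv, "C", "CPS", "CP")
# inv["C"]["CPS"]["CP"] is now an array holding the original hash
```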
Update
Inspired by @CarySwoveland's broader application you could use something like this as well.
class HashWrapper
attr_reader :original_hash
attr_accessor :target_keys
def initialize(h,*target_keys)
@original_hash = h
@target_keys = target_keys
end
def wrapped_hash
@wrapped_hash ||= {}
end
def wrap_me
original_hash.each do |k,v|
value = v.is_a?(Hash) ? HashWrapper.new(v,*target_keys).wrap_me : v
wrapped_hash[k] = wrap(k,value)
end
wrapped_hash
end
private
def wrap(k,v)
target_keys.include?(k) ? [v] : v
end
end
Then implementation is as follows
wrapper = HashWrapper.new(inv,"CP")
wrapper.wrap_me
#=> {"C"=>
{"CPS"=>
{"CP"=>
[
{"name"=>"a"}
]
}
}
}
new_wrapper = HashWrapper.new(inv,"CP","CPS")
new_wrapper.wrap_me
#=> {"C"=>
{"CPS"=>
[
{"CP"=>
[
{"name"=>"a"}
]
}
]
}
}
This assumes unique keys all the way through the hierarchy; otherwise nested keys of the same name will be wrapped in the same fashion from the bottom up.
e.g.
inv = {"C"=>{"CPS"=>{"CP"=>{"name"=>"a"}},"CP" => "higher level"}}
HashWrapper.new(inv,"CP").wrap_me
#=> {"C"=>
{"CPS"=>
{"CP"=>
[
{"name"=>"a"}
]
},
"CP"=>
[
"higher level"
]
}
}
This should do it:
hash = {"C"=>{"CPS"=>{"CP"=>{"name"=>"a"}}}}
val = hash["C"]["CPS"]["CP"]
val_as_arr = [val] # can optionally call flatten here
hash["C"]["CPS"]["CP"] = val_as_arr
puts hash
# => {"C"=>{"CPS"=>{"CP"=> [{"name" => "a"}] }}}
basically
get the value
convert to array
set the value
There is no iteration required here i.e. map or reduce
I suggest you use recursion, in the form of a compact and easily readable method that has broader application than solutions that only work with your specific hash.
def wrap_it(h)
h.each { |k,v| h[k] = v.is_a?(Hash) ? wrap_it(v) : [v] }
h
end
h = { "C"=>{ "CPS"=>{ "CP"=>{ "name"=>"a" } } } }
wrap_it(h)
#=> {"C"=>{"CPS"=>{"CP"=>{"name"=>["a"]}}}}
h = { "C"=>{ "CPS"=>{ "CP"=>{ "CPPS"=> { "name"=>"cat" } } } } }
wrap_it(h)
#=> {"C"=>{"CPS"=>{"CP"=>{"CPPS"=>{"name"=>["cat"]}}}}}
h = { "C"=>{ "CPS"=>{ "CP"=>{ "CPPS"=> { "name"=>"cat" } },
"DP"=>{ "CPPPS"=>"dog" } } } }
wrap_it(h)
#=> {"C"=>{"CPS"=>{"CP"=>{"CPPS"=>{"name"=>["cat"]}}, "DP"=>{"CPPPS"=>["dog"]}}}}
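Note that wrap_it modifies its argument. If the original hash should stay untouched, a non-mutating variant can be sketched with Hash#transform_values (Ruby 2.4+). The name wrap_leaves is hypothetical.

```ruby
# Hypothetical non-mutating variant of wrap_it: returns a new hash in which
# every non-hash leaf value is wrapped in a one-element array.
def wrap_leaves(h)
  h.transform_values { |v| v.is_a?(Hash) ? wrap_leaves(v) : [v] }
end

h = { "C" => { "CPS" => { "CP" => { "name" => "a" } } } }
wrapped = wrap_leaves(h)
# wrapped has "name" => ["a"] at the innermost level; h itself is unchanged
```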

How to recursion this array of hashes

I'm wondering how to sum the "analytic" value from this array of hashes with recursion.
Input :
[{"id"=>"1234",
"id_data"=>
[{"segment"=>{"segment_name"=>"Android"},
"metrics"=>
{
"logins"=>[1000, 2000],
"sign_ups_conversion"=>{
"count"=>[500, 200],
"cost"=>[2, 4]
}
},
},
{"segment"=>{"segment_name"=>"iOS"},
"metrics"=>
{
"logins"=>[5000, 10000],
"sign_ups_conversion"=>{
"count"=>[100, 50],
"cost"=>[6, 8]
}
},
}
]
},
{"id"=>"5678",
"id_data"=>
[{"segment"=>{"segment_name"=>"Android"},
"metrics"=>
{
"logins"=>[3000, 2000],
"sign_ups_conversion"=>{
"count"=>[300, 400],
"cost"=>[2, 4]
}
},
},
{"segment"=>{"segment_name"=>"iOS"},
"metrics"=>
{
"logins"=>[5000, 10000],
"sign_ups_conversion"=>{
"count"=>[100, 50],
"cost"=>[6, 8]
}
},
}
]
}]
Output :
{
"Android"=>{
"ids" => ['1234','5678'],
"segment" => {"segment_name"=>"Android"},
"id_data" => [{
"logins" => [4000, 4000], # sum by index from 'Android' logins ("logins"=>[1000, 2000] & "logins"=>[3000, 2000]),
"sign_ups_conversion" => {
"count" => [800, 600], # sum by index from 'Android' sign ups count ("count"=>[500, 200] & "count"=>[300, 400])
"cost" => [4, 8] # sum by index from 'Android' sign ups cost ("cost"=>[2, 4] & "cost"=>[2, 4])
}
}]
},
"iOS"=>{
"ids" => ['1234','5678'],
"segment" => {"segment_name"=>"iOS"},
"id_data" => [{
"logins" => [10000, 20000], # sum by index from 'iOS' logins ("logins"=>[5000, 10000] & "logins"=>[5000, 10000]),
"sign_ups_conversion" => {
"count" => [200, 100], # sum by index from 'iOS' sign ups count ("count"=>[100, 50] & "count"=>[100, 50])
"cost" => [12, 16] # sum by index from 'iOS' sign ups cost ("cost"=>[6, 8] & "cost"=>[6, 8])
}
}]
}
}
I am trying to solve it with the methods below, but they do not yet sum the metrics given in hash format (sign_ups_conversion), and I am still figuring out how to make the result equal the output above.
def aggregate_by_segments(stats_array)
results = {}
stats_array.each do |stats|
stats['id_data'].each do |data|
segment_name = data['segment']['segment_name']
results[segment_name] ||= {}
(results[segment_name]['ids'] ||= []) << stats['id']
results[segment_name]['segment'] ||= data['segment']
results[segment_name]['id_data'] ||= [{}]
data['metrics'].each do |metric, values|
next if skip_metric?(values)
(results[segment_name]['id_data'][0][metric] ||= []) << values
end
end
end
sum_segments(results)
end
def sum_segments(segments)
segments.each do |segment, segment_details|
segment_details['id_data'][0].each do |metric, values|
segment_details['id_data'][0][metric] = sum_segment_metric(values)
end
end
segments
end
def sum_segment_metric(metric_value)
metric_value.transpose.map { |x| x.reduce(:+) }
end
# I skipped hash format for now
def skip_metric?(metric_values)
!metric_values.is_a? Array
end
############################################
# calls it with aggregate_by_segments(input)
############################################
I believe we should use recursion, but I'm still figuring it out. Can anyone help me?
Thanks in advance!
The problem here is how to access these data structures. One Ruby strategy is to iterate over the arrays using each and to chain key lookups on the nested hashes, like this:
Supposing that your structure is maintained:
Array[Hash[Array[Hash]]]
array_hash.each do |stats|
stats["id_data"].each do |h|
puts h["metrics"]["sign_ups_conversion"]
end
end
# => {"count"=>[500, 200], "cost"=>[2, 4]}
# => {"count"=>[100, 50], "cost"=>[6, 8]}
# => {"count"=>[300, 400], "cost"=>[2, 4]}
# => {"count"=>[100, 50], "cost"=>[6, 8]}
I solved it.
def aggregate_by_segments(stats_array)
results = {}
stats_array.each do |stats|
stats['id_data'].each do |data|
segment_name = data['segment']['segment_name']
results[segment_name] ||= {}
(results[segment_name]['ids'] ||= []) << stats['id']
results[segment_name]['segment'] ||= data['segment']
results[segment_name]['id_data'] ||= [{}]
data['metrics'].each do |metric, values|
hash_values(results[segment_name]['id_data'][0], metric, values) if values.is_a? Hash
next if skip_metric?(values)
(results[segment_name]['id_data'][0][metric] ||= []) << values
end
end
end
sum_segments(results)
end
def hash_values(metrics, metric, hash_values)
hash_values.each do |k, v|
next if skip_metric?(v)
metrics[metric] ||= {}
(metrics[metric][k] ||= []) << v
end
end
def sum_segments(segments)
segments.each do |segment, segment_details|
segment_details['id_data'][0].each do |metric, values|
segment_details['id_data'][0][metric] = sum_segment_metric(values)
end
end
segments
end
def sum_segment_metric(metric_value)
result = metric_value.transpose.map { |x| x.reduce(:+) } if metric_value.is_a? Array
result = metric_value.each do |k, v|
metric_value[k] = sum_segment_metric(v)
end if metric_value.is_a? Hash
result
end
def skip_metric?(metric_values)
!metric_values.is_a? Array
end
I know the code is pretty ugly. I will refactor it later :)
Thank you guys for visiting and commenting with constructive feedback.
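For comparison, here is a hedged sketch of how the recursion the question asks about could look. It is not the poster's refactoring: the names deep_sum and aggregate_sketch are hypothetical, and the sketch assumes that every segment's metrics have the same shape, with arrays of equal length at the same positions.

```ruby
# Hypothetical helper: adds two metric structures of the same shape.
# Hashes are merged key by key, arrays are added index by index,
# and anything else is assumed to be a number.
def deep_sum(left, right)
  case left
  when Hash  then left.merge(right) { |_key, l, r| deep_sum(l, r) }
  when Array then left.zip(right).map { |l, r| l + r }
  else left + right
  end
end

# Hypothetical counterpart of aggregate_by_segments built on deep_sum.
def aggregate_sketch(stats_array)
  stats_array.each_with_object({}) do |stats, results|
    stats['id_data'].each do |data|
      name  = data['segment']['segment_name']
      entry = (results[name] ||= { 'ids' => [], 'segment' => data['segment'], 'id_data' => [] })
      entry['ids'] << stats['id']
      # The first occurrence stores the metrics as-is; later ones are summed in.
      entry['id_data'][0] =
        entry['id_data'].empty? ? data['metrics'] : deep_sum(entry['id_data'][0], data['metrics'])
    end
  end
end
```

Calling aggregate_sketch(input) with the input from the question should produce one entry per segment name. deep_sum builds new hashes and arrays instead of modifying the input, although a segment that occurs only once keeps a reference to its original metrics hash.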
