I'm wondering how to sum the "analytic" values in this array of hashes using recursion.
Input:
[{"id"=>"1234",
"id_data"=>
[{"segment"=>{"segment_name"=>"Android"},
"metrics"=>
{
"logins"=>[1000, 2000],
"sign_ups_conversion"=>{
"count"=>[500, 200],
"cost"=>[2, 4]
}
},
},
{"segment"=>{"segment_name"=>"iOS"},
"metrics"=>
{
"logins"=>[5000, 10000],
"sign_ups_conversion"=>{
"count"=>[100, 50],
"cost"=>[6, 8]
}
},
}
]
},
{"id"=>"5678",
"id_data"=>
[{"segment"=>{"segment_name"=>"Android"},
"metrics"=>
{
"logins"=>[3000, 2000],
"sign_ups_conversion"=>{
"count"=>[300, 400],
"cost"=>[2, 4]
}
},
},
{"segment"=>{"segment_name"=>"iOS"},
"metrics"=>
{
"logins"=>[5000, 10000],
"sign_ups_conversion"=>{
"count"=>[100, 50],
"cost"=>[6, 8]
}
},
}
]
}]
Output:
{
  "Android" => {
    "ids"     => ['1234', '5678'],
    "segment" => { "segment_name" => "Android" },
    "id_data" => [{
      "logins" => [4000, 4000],   # sum by index from 'Android' logins ("logins"=>[1000, 2000] & "logins"=>[3000, 2000])
      "sign_ups_conversion" => {
        "count" => [800, 600],    # sum by index from 'Android' sign-up counts ("count"=>[500, 200] & "count"=>[300, 400])
        "cost"  => [4, 8]         # sum by index from 'Android' sign-up costs ("cost"=>[2, 4] & "cost"=>[2, 4])
      }
    }]
  },
  "iOS" => {
    "ids"     => ['1234', '5678'],
    "segment" => { "segment_name" => "iOS" },
    "id_data" => [{
      "logins" => [10000, 20000], # sum by index from 'iOS' logins ("logins"=>[5000, 10000] & "logins"=>[5000, 10000])
      "sign_ups_conversion" => {
        "count" => [200, 100],    # sum by index from 'iOS' sign-up counts ("count"=>[100, 50] & "count"=>[100, 50])
        "cost"  => [12, 16]       # sum by index from 'iOS' sign-up costs ("cost"=>[6, 8] & "cost"=>[6, 8])
      }
    }]
  }
}
I tried to solve it with the methods below, but they don't handle metrics in hash format (sign_ups_conversion), and I'm still figuring out how to make the result match the expected output.
def aggregate_by_segments(stats_array)
  results = {}
  stats_array.each do |stats|
    stats['id_data'].each do |data|
      segment_name = data['segment']['segment_name']
      results[segment_name] ||= {}
      (results[segment_name]['ids'] ||= []) << stats['id']
      results[segment_name]['segment'] ||= data['segment']
      results[segment_name]['id_data'] ||= [{}]
      data['metrics'].each do |metric, values|
        next if skip_metric?(values)
        (results[segment_name]['id_data'][0][metric] ||= []) << values
      end
    end
  end
  sum_segments(results)
end

def sum_segments(segments)
  segments.each do |segment, segment_details|
    segment_details['id_data'][0].each do |metric, values|
      segment_details['id_data'][0][metric] = sum_segment_metric(values)
    end
  end
  segments
end

def sum_segment_metric(metric_value)
  metric_value.transpose.map { |x| x.reduce(:+) }
end

# I skipped hash format for now
def skip_metric?(metric_values)
  !metric_values.is_a? Array
end
############################################
# calls it with aggregate_by_segments(input)
############################################
I believe recursion is the way to go, but I'm still figuring it out. Can anyone help me?
Thanks in advance!
The problem here is how to access these data structures. One Ruby strategy is to iterate over the arrays with each and chain the hash keys, like this:
Supposing that your structure is maintained:
Array[Hash[Array[Hash]]]
array_hash.each do |stats|
  stats["id_data"].each do |h|
    puts h["metrics"]["sign_ups_conversion"]
  end
end
# => {"count"=>[500, 200], "cost"=>[2, 4]}
# => {"count"=>[100, 50], "cost"=>[6, 8]}
# => {"count"=>[300, 400], "cost"=>[2, 4]}
# => {"count"=>[100, 50], "cost"=>[6, 8]}
I solved it.
def aggregate_by_segments(stats_array)
  results = {}
  stats_array.each do |stats|
    stats['id_data'].each do |data|
      segment_name = data['segment']['segment_name']
      results[segment_name] ||= {}
      (results[segment_name]['ids'] ||= []) << stats['id']
      results[segment_name]['segment'] ||= data['segment']
      results[segment_name]['id_data'] ||= [{}]
      data['metrics'].each do |metric, values|
        hash_values(results[segment_name]['id_data'][0], metric, values) if values.is_a? Hash
        next if skip_metric?(values)
        (results[segment_name]['id_data'][0][metric] ||= []) << values
      end
    end
  end
  sum_segments(results)
end

def hash_values(metrics, metric, hash_values)
  hash_values.each do |k, v|
    next if skip_metric?(v)
    metrics[metric] ||= {}
    (metrics[metric][k] ||= []) << v
  end
end

def sum_segments(segments)
  segments.each do |segment, segment_details|
    segment_details['id_data'][0].each do |metric, values|
      segment_details['id_data'][0][metric] = sum_segment_metric(values)
    end
  end
  segments
end

def sum_segment_metric(metric_value)
  result = metric_value.transpose.map { |x| x.reduce(:+) } if metric_value.is_a? Array
  result = metric_value.each do |k, v|
    metric_value[k] = sum_segment_metric(v)
  end if metric_value.is_a? Hash
  result
end

def skip_metric?(metric_values)
  !metric_values.is_a? Array
end
I know the code is pretty ugly. I will refactor it later :)
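In the meantime, here is a sketch of one possible refactor built around a recursive deep-merge. The helper names deep_sum and aggregate are mine, and it assumes metrics are always either arrays of numbers or hashes of such arrays, exactly as in the example above:
# Sketch only: helper names are mine, input shape assumed from the question.
def deep_sum(a, b)
  if a.is_a?(Hash)
    a.merge(b) { |_key, left, right| deep_sum(left, right) }  # recurse into nested hashes
  else
    a.zip(b).map { |left, right| left + right }               # sum arrays index by index
  end
end

def aggregate(stats_array)
  stats_array.each_with_object({}) do |stats, results|
    stats['id_data'].each do |data|
      name = data['segment']['segment_name']
      seg  = results[name] ||= { 'ids' => [], 'segment' => data['segment'], 'id_data' => [] }
      seg['ids'] << stats['id']
      if seg['id_data'].empty?
        seg['id_data'] << data['metrics']
      else
        seg['id_data'][0] = deep_sum(seg['id_data'][0], data['metrics'])
      end
    end
  end
end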
Thank you guys for visiting and commenting with constructive feedback.
Related
I have an array of arrays. Each item in the array contains three strings: a leg count, an animal and a sound.
a = [ ['4', 'dog', 'woof'] , ['4', 'cow', 'moo'], ['2', 'human', 'yo'] , ['2', 'yeti', 'wrarghh'] ]
I want to turn the array into this hash:
{
'2' => [ { 'human' => 'yo' }, { 'yeti' => 'wrarghh'} ],
'4' => [ { 'dog' => 'woof' }, { 'cow' => 'moo'} ]
}
I thought reduce would be the way to go but I'm not having much luck. My current stab looks like:
a.reduce({}) do |acc, item|
acc[item.first] = [] unless acc.key? item.first
acc[item.first] << { item[1] => item[2] }
end
But it gets an error:
NoMethodError: undefined method `key?' for [{"dog"=>"woof"}]:Array
What is the best way to achieve this?
a.each_with_object({}) { |(kout, kin, val), h| (h[kout] ||= []) << { kin => val } }
#=> {"4"=>[{"dog"=>"woof"}, {"cow"=>"moo"}], "2"=>[{"man"=>"yo"}, {"yeti"=>"wrarghh"}]}
We have
enum = a.each_with_object({})
#=> #<Enumerator: [["4", "dog", "woof"], ["4", "cow", "moo"], ["2", "man", "yo"],
# ["2", "yeti", "wrarghh"]]:each_with_object({})>
The first value is generated by this enumerator and passed to the block, and the block variables are assigned values:
(kout, kin, val), h = enum.next
#=> [["4", "dog", "woof"], {}]
which is decomposed as follows.
kout
#=> "4"
kin
#=> "dog"
val
#=> "woof"
h #=> {}
The block calculation is therefore
(h[kout] ||= []) << { kin => val }
#=> (h[kout] = h[kout] || []) << { "dog" => "woof" }
#=> (h["4"] = h["4"] || []) << { "dog" => "woof" }
#=> (h["4"] = nil || []) << { "dog" => "woof" }
#=> (h["4"] = []) << { "dog" => "woof" }
#=> [] << { "dog" => "woof" }
#=> [{ "dog" => "woof" }]
h["4"] || [] #=> [] since h has no key "4" and therefore h["4"] #=> nil.
The next value of enum is passed to the block and the calculations are repeated.
(kout, kin, val), h = enum.next
#=> [["4", "cow", "moo"], {"4"=>[{"dog"=>"woof"}]}]
kout
#=> "4"
kin
#=> "cow"
val
#=> "moo"
h #=> {"4"=>[{"dog"=>"woof"}]}
(h[kout] ||= []) << { kin => val }
#=> (h[kout] = h[kout] || []) << { "cow" => "moo" }
#=> (h["4"] = h["4"] || []) << { "cow" => "moo" }
#=> (h["4"] = [{"dog"=>"woof"}] ||= []) << { "cow" => "moo" }
#=> (h["4"] = [{"dog"=>"woof"}]) << { "cow" => "moo" }
#=> [{"dog"=>"woof"}] << { "cow" => "moo" }
#=> [{ "dog" => "wolf" }, { "cow" => "moo" }]
This time h["4"] || [] #=> [{ "dog" => "wolf" }] because h now has a key "4" with a truthy value ([{ "dog" => "wolf" }]).
The remaining calculations are similar.
Your way almost works, but, for reduce, the return value (i.e., the last line) of the block becomes the next value of (in this case) acc, so all you need to change is:
a.reduce({}) do |acc, item|
acc[item.first] = [] unless acc.key? item.first
acc[item.first] << { item[1] => item[2] }
acc # just add this line
end
Since the return value of Array#<< is the array itself, the second iteration received the array built for the first element as acc rather than the hash. There are, of course, lots of ways to do this, some arguably cleaner, but I find it useful to know where I went wrong when something I think should work doesn't.
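For reference, the corrected reduce can also be written a little more compactly by destructuring each row in the block parameters (same logic as the fix above; output assumes the human/yeti input from the question):
a.reduce({}) do |acc, (legs, animal, sound)|
  (acc[legs] ||= []) << { animal => sound }
  acc
end
#=> {"4"=>[{"dog"=>"woof"}, {"cow"=>"moo"}], "2"=>[{"human"=>"yo"}, {"yeti"=>"wrarghh"}]}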
With this code I can find the most frequently occurring item in an array:
letters.max_by { |i| letters.count(i) }
But this will return 2 for
a = [1, 2, 2, 3, 3]
although 3 occurs just as often. How can I find out whether there really is a single item with the most occurrences? I would like to get false if there is no single champion.
This is pretty ugly and in need of refinement, but:
def champion(array)
  grouped = array.group_by(&:itself).values.group_by(&:length)
  best = grouped[grouped.keys.max]
  if best.length == 1
    best[0][0]
  else
    false
  end
end
I'm not sure there's an easy single-shot solution for this, at least not one that's not O(n^2) or worse, which is unusual.
I guess you could do this if you don't care about performance:
def max_occurrences(arr)
  arr.sort.max_by { |v| arr.count(v) } != arr.sort.reverse.max_by { |v| arr.count(v) } ? false : arr.max_by { |v| arr.count(v) }
end
I would do something like this:
def max_occurrences(arr)
  counts = Hash.new { |h, k| h[k] = 0 }
  grouped_by_count = Hash.new { |h, k| h[k] = [] }
  arr.each { |el| counts[el] += 1 }                             # O(n)
  counts.each { |el, count| grouped_by_count[count] << el }     # O(n)
  max = grouped_by_count.sort { |x, y| y[0] <=> x[0] }.first[1] # O(n log n)
  max.length == 1 ? max[0] : false
end
It's no snazzy one-liner, but it's readable and runs in O(n log n) at worst (the sort is over the distinct counts, so it is usually much cheaper than that).
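For example, with the array from the question and a variant that does have a single winner:
max_occurrences([1, 2, 2, 3, 3]) #=> false (2 and 3 tie)
max_occurrences([1, 2, 2, 2, 3]) #=> 2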
a = [1, 2, 2, 3, 3]
occurrences = a.inject(Hash.new(0)){ |h, el| h[el] += 1; h }  # => {1=>1, 2=>2, 3=>2}
max_occurrence = occurrences.max_by { |_, v| v }              # => [2, 2]
occurrences.values.count(max_occurrence.last) > 1 ? false : max_occurrence.first
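If you are on Ruby 2.7 or newer, Enumerable#tally does the counting for you; here is a sketch along the same lines (single_champion is just my name for it):
def single_champion(arr)
  counts  = arr.tally                                    # e.g. {1=>1, 2=>2, 3=>2}
  top     = counts.values.max
  leaders = counts.select { |_, count| count == top }.keys
  leaders.size == 1 ? leaders.first : false
end

single_champion([1, 2, 2, 3, 3]) #=> false
single_champion([1, 2, 2, 2, 3]) #=> 2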
I have an array of hashes:
[
{
"June" => { "A" => { 3 => 48.4 } }
},
{
"January" => { "C" => { 2 => 88.0} }
},
{
"January"=> { "B" => { 2 => 44.0} }
},
{
"January"=> { "C" => { 4 => 48.8} }
}
]
I need to group each similar hash key into an array of the subsequent values like the following:
{
  "June"    => [{ "A" => [{ 3 => 48.4 }] }],
  "January" => [
    { "B" => [{ 2 => 44.0 }] },
    { "C" => [{ 2 => 88.0 }, { 4 => 48.8 }] }
  ]
}
I am looking for an efficient method of grouping these elements. Can anyone help me master this hash of hashes?
I am trying to avoid looping through the base array and grouping manually. I was hoping that map (or some other enumerable method) might give me what I want. When I used reduce(Hash.new, :merge), it came close, but it used the last hash for each month key instead of adding it to an array.
Note: I added the following after gaining a clearer understanding of the question. My original answer is below.
Here is the OP's array of hashes, modified slightly.
arr = [{ "June" =>{ "A"=>{ 3=>48.4 } } },
{ "January"=>{ "C"=>{ 2=>88.0 } } },
{ "January"=>{ "B"=>{ "D"=>{ 2=>44.0 } } } },
{ "January"=>{ "C"=>{ 2=>10.0 } } },
{ "January"=>{ "C"=>{ 4=>48.8 } } }]
The hash to be constructed appears to be the following.
{ "June" =>[{ "A"=>[{ 3=>48.4 }] }],
"January"=>[{ "B"=>[{ "D"=>[{ 2=>44.0 }] }] }],
"C"=>[{ 2=>98.0, 4=>48.8 }] }] }
Note that 88.0 + 10.0 #=> 98.0 in 2=>98.0.
Observe that all the arrays within arr contain a single element, a hash. That being the case, those arrays serve no useful purpose. I therefore suggest the following hash be constructed instead:
{ "June" =>{ "A"=>{ 3=>48.4 } },
"January"=>{ "B"=>{ "D"=>{ 2=>44.0 } } },
"C"=>{ 2=>98.0, 4=>48.8 } } }
This can be produced with the following recursive method.
def recurse(arr)
  arr.map(&:flatten).
      group_by(&:first).
      each_with_object({}) do |(k,v),h|
        o = v.map(&:last)
        h.update(k => o.first.is_a?(Hash) ? recurse(o) : o.sum)
      end
end
recurse(arr)
#=> {"June"=>{"A"=>{3=>48.4}},
# "January"=>{"C"=>{2=>98.0, 4=>48.8}, "B"=>{"D"=>{2=>44.0}}}}
(Original answer follows)
Here are two ways to obtain the desired hash. I assume that arr is your array of hashes.
#1 Use the form of Hash::new that takes a block
arr.each_with_object(Hash.new { |h,k| h[k] = [] }) do |g,h|
  k, v = g.to_a.first
  h[k] << v
end
# => {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{4=>48.8}}]}
#2 Use Enumerable#group_by
arr.map(&:first).
group_by(&:first).
tap { |h| h.keys.each { |k| h[k] = h[k].map(&:last) } }
# => {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{4=>48.8}}]}
The steps are as follows.
a = arr.map(&:first)
#=> [["June", {"A"=>{3=>48.4}}], ["January", {"C"=>{2=>88.0}}],
# ["January", {"B"=>{2=>44.0}}], ["January", {"C"=>{4=>48.8}}]]
b = a.group_by(&:first)
#=> {"June"=>[["June", {"A"=>{3=>48.4}}]],
# "January"=>[["January", {"C"=>{2=>88.0}}], ["January", {"B"=>{2=>44.0}}],
# ["January", {"C"=>{4=>48.8}}]]}
c = b.tap { |h| h.keys.each { |k| h[k] = h[k].map(&:last) } }
#=> {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{=>48.8}}]}
Let me elaborate on the last step. Inside tap's block, we compute the following.
h = b
d = h.keys
#=> ["June", "January"]
The first element of d is passed to each's block and the block variable is assigned that element.
k = d.first
#=> "June"
The block calculation is as follows.
e = h[k]
#=> [["June", {"A"=>{3=>48.4}}]]
f = e.map(&:last)
#=> [{"A"=>{3=>48.4}}]
h[k] = f
#=> [{"A"=>{3=>48.4}}]
b #=> {"June"=>[{"A"=>{3=>48.4}}],
# "January"=>[["January", {"C"=>{2=>88.0}}],
# ["January", {"B"=>{2=>44.0}}],
# ["January", {"C"=>{4=>48.8}}]]}
Next, d[1] ("January") is passed to each's block and similar calculations are performed.
Rather than using Object#tap I could have written
h = arr.map(&:first).
group_by(&:first)
h.keys.each { |k| h[k] = h[k].map(&:last) }
h
tap merely avoids the creation of the local variable h and the need for a final line consisting of just h.
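On Ruby 2.4 and later, Hash#transform_values expresses the same idea and builds a new hash instead of modifying one in place (a small variant, not part of the original answer):
arr.map(&:first).
    group_by(&:first).
    transform_values { |pairs| pairs.map(&:last) }
#=> {"June"=>[{"A"=>{3=>48.4}}],
#    "January"=>[{"C"=>{2=>88.0}}, {"B"=>{2=>44.0}}, {"C"=>{4=>48.8}}]}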
I have two arrays of hashes:
a = [
{
key: 1,
value: "foo"
},
{
key: 2,
value: "baz"
}
]
b = [
{
key: 1,
value: "bar"
},
{
key: 1000,
value: "something"
}
]
I want to merge them into one array of hashes, so essentially a + b except I want any duplicated key in b to overwrite those in a. In this case, both a and b contain a key 1 and I want the final result to have b's key value pair.
Here's the expected result:
expected = [
{
key: 1,
value: "bar"
},
{
key: 2,
value: "baz"
},
{
key: 1000,
value: "something"
}
]
I got it to work but I was wondering if there's a less wordy way of doing this:
hash_result = {}
a.each do |item|
  hash_result[item[:key]] = item[:value]
end
b.each do |item|
  hash_result[item[:key]] = item[:value]
end
result = []
hash_result.each do |k, v|
  result << { :key => k, :value => v }
end
puts result
puts expected == result # prints true
puts expected == result # prints true
uniq would work if you concatenate the arrays in reverse order:
(b + a).uniq { |h| h[:key] }
#=> [
# {:key=>1, :value=>"bar"},
# {:key=>1000, :value=>"something"},
# {:key=>2, :value=>"baz"}
# ]
It doesn't however preserve the order.
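If the order of a should be preserved while b still wins on duplicate keys, one possible sketch is to group by key and keep the last hash in each group:
(a + b).group_by { |h| h[:key] }.map { |_, hashes| hashes.last }
#=> [
#     {:key=>1, :value=>"bar"},
#     {:key=>2, :value=>"baz"},
#     {:key=>1000, :value=>"something"}
#   ]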
[a, b].map { |arr| arr.group_by { |e| e[:key] } }
      .reduce(&:merge)
      .flat_map(&:last)
Here we group each array's hashes by their :key, merge the two grouped hashes so the later (b) groups override the earlier (a) ones for duplicate keys, and then flat_map the grouped hashes back out into a single array.
I would rebuild your data a bit, since there are redundant keys in the hashes:
thin_b = b.map { |h| [h[:key], h[:value]] }.to_h
#=> {1=>"bar", 1000=>"something"}
thin_a = a.map { |h| [h[:key], h[:value]] }.to_h
#=> {1=>"foo", 2=>"baz"}
Then you can use just Hash#merge:
thin_a.merge(thin_b)
#=> {1=>"bar", 2=>"baz", 1000=>"something"}
But, if you want, you can get exactly the result mentioned in the question:
thin_a.merge(thin_b).map { |k, v| { key: k, value: v } }
#=> [{:key=>1, :value=>"bar"},
#    {:key=>2, :value=>"baz"},
#    {:key=>1000, :value=>"something"}]
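As an aside, on Ruby 2.6+ to_h accepts a block, which shortens the rebuild step above (just a variant of the same idea):
thin_a = a.to_h { |h| h.values_at(:key, :value) }
#=> {1=>"foo", 2=>"baz"}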
Using Enumerable#group_by and Enumerable#map:
(b + a).group_by { |e| e[:key] }.values.map { |arr| arr.first }
If the hashes themselves should also be merged and there are more than two keys, the next snippet should help:
[a, b].flatten
      .compact
      .group_by { |v| v[:key] }
      .values
      .map { |e| e.reduce(&:merge) }
I have an array looking like this:
data = [[01, 777], [02, 888]]
Now I want to create an array of hashes from it like below:
n_clip = [{"name"=>"01", "rep"=>"777"}, {"name"=>"02", "rep"=>"888"}]
I tried to do it this way:
n_clip = []
data.each do |a|
  n_clip << Array[Hash[a.map {|| ["name", a.first]}], Hash[a.map {|| ["rep", a.last]}]]
end
but it doesn't work, because I get:
n_clip = [[{"name"=>"01"}, {"rep"=>"777"}], [{"name"=>"02"}, {"rep"=>"888"}]]
and that definitely isn't what I expected.
data.map { |arr| { 'name' => arr[0], 'rep' => arr[1] } }
I would rather use symbols as hash keys:
data.map { |arr| { name: arr[0], rep: arr[1] } }
If you wish to create an array of two hashes, each having the same two keys, the other answers are fine. The following handles the case where there are an arbitrary number of keys and data may contain an arbitrary number of elements.
def hashify(keys, arr_of_vals)
[keys].product(arr_of_vals).map { |ak,av| Hash[ak.zip(av)] }
end
keys = %w| name rep |
#=> ["name", "rep"]
arr_of_vals = [["01", "777"], ["02", "888"]]
hashify(keys, arr_of_vals)
#=> [{"name"=>"01", "rep"=>"777"}, {"name"=>"02", "rep"=>"888"}]
In your problem arr_of_vals must first be derived from [[01, 777], [02, 888]], but that is a secondary (rather mundane) problem that I will not address.
Another example:
keys = %w| name rep group |
#=> ["name", "rep", "group"]
arr_of_vals = [[1, 777, 51], [2, 888, 52], [1, 2, 53], [3, 4, 54]]
hashify(keys, arr_of_vals)
#=> [{"name"=>1, "rep"=>777, "group"=>51}, {"name"=>2, "rep"=>888, "group"=>52},
# {"name"=>1, "rep"=>2, "group"=>53}, {"name"=>3, "rep"=>4, "group"=>54}]
data.map { |name, rep| { 'name' => name.to_s, 'rep' => rep.to_s } }
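One caveat for both map-based answers: 01 in the source array is just the integer 1, so to_s produces "1", not "01". If the leading zero matters, pad explicitly; a possible sketch:
data = [[01, 777], [02, 888]]
data.map { |name, rep| { 'name' => format('%02d', name), 'rep' => rep.to_s } }
#=> [{"name"=>"01", "rep"=>"777"}, {"name"=>"02", "rep"=>"888"}]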