Ruby Array of Hash parsing - arrays

I have a yaml file in the format:
parameters:
  - param_name: age
    requires:
      - name
  - param_name: height
    requires:
      - name
Based on this format I would like to accept a hash of keys and values and determine if the combination of keys and values is valid. For example based on the above example if someone submitted a hash with the values:
{'age' => 15, 'height' => '6ft'}
it would be considered invalid since the parameter name is required. So a valid submission would look like
{'age' => 15, 'height' => '6ft', 'name' => 'Abe Lincoln'}.
Essentially what I want is this:
For each parameter object, if it has a requires array underneath it, check all parameter param_names for the elements in that array; if any are missing, exit.
I have a very ugly double loop that checks for this but I want to tighten the code up. I think I can use blocks in order to validate the data I need. Here is what I have come up with so far:
require 'yaml'
requirements = YAML.load_file('./require.yaml')
require_fields = Array.new
requirements['parameters'].each do |param|
  # the YAML key is 'requires', not 'require'
  require_fields.concat(param['requires']) if param.has_key? 'requires'
end
require_fields.each do |requirement|
  found = false
  requirements['parameters'].each do |param|
    if param['param_name'] == requirement
      found = true
    end
  end
  abort "#{requirement} is a required field" unless found
end

You can clean this up a lot if you make it more idiomatic Ruby:
require 'yaml'
requirements = YAML.load_file('./require.yaml')
# flat_map merges the per-parameter arrays into one flat list of names
require_fields = requirements['parameters'].select do |param|
  param.has_key?('requires')
end.flat_map do |param|
  param['requires']
end
require_fields.each do |requirement|
  found = requirements['parameters'].any? do |param|
    param['param_name'] == requirement
  end
  abort "#{requirement} is a required field" unless found
end
You could also do this:
require_fields = requirements['parameters'].flat_map do |param|
  param['requires']
end.compact
That's probably good enough so long as each requires entry is either an array or nil.
You could also transform that input YAML into a simple hash structure of dependencies:
dependencies = requirements['parameters'].map do |param|
  [ param['param_name'], param['requires'] || [] ]
end.to_h
Then you can test really easily:
dependencies.each do |name, required_names|
  # find the first required name that is not itself a known parameter
  missing = Array(required_names).find do |required_name|
    !dependencies.key?(required_name)
  end
  abort "#{missing} is a required field" if missing
end
This is a really rough adaptation of your code, but I hope it gives you some ideas.

I would go with subsequent checks, collecting errors and reporting all at once:
req = YAML.load 'parameters:
  - param_name: age
    requires:
      - name
  - param_name: height
    requires:
      - name'
input = {'age' => 15, 'height' => '6ft'}
req['parameters'].each_with_object([]) do |req, err|
next unless input[req['param_name']] # nothing to check
missed = req['requires'].reject { |param| input[param] }
errors = missed.map do |param|
[req['param_name'], param].join(' requires ')
end
err.concat(errors)
end
#⇒ ["age requires name", "height requires name"]
Or, chaining:
req['parameters'].each_with_object(Hash.new { |h, k| h[k] = [] }) do |req, err|
next unless input[req['param_name']] # nothing to check
req['requires'].each do |param|
err[param] << req['param_name'] unless input[param]
end
end.map do |missing, required|
"Missing #{missing} parameter, required for: [#{required.join(', ')}]"
end.join(',')
#⇒ "Missing name parameter, required for: [age, height]"
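The checks above could also be wrapped in a small reusable method. This is only a sketch: the method name missing_requirements and the SPEC constant are made up for illustration, and the YAML is inlined (same structure as in the question) so the snippet runs on its own.

```ruby
require 'yaml'

# Same structure as the YAML in the question, inlined for a self-contained sketch.
SPEC = YAML.load(<<~YAML)
  parameters:
    - param_name: age
      requires:
        - name
    - param_name: height
      requires:
        - name
YAML

# Hypothetical helper: returns { missing_parameter => [parameters that need it] }.
# An empty hash means the submitted input is valid.
def missing_requirements(spec, input)
  spec['parameters'].each_with_object(Hash.new { |h, k| h[k] = [] }) do |param, missing|
    next unless input.key?(param['param_name']) # parameter not submitted, nothing to check

    Array(param['requires']).each do |required|
      missing[required] << param['param_name'] unless input.key?(required)
    end
  end
end

p missing_requirements(SPEC, { 'age' => 15, 'height' => '6ft' })
p missing_requirements(SPEC, { 'age' => 15, 'height' => '6ft', 'name' => 'Abe Lincoln' })
```

Returning the collected data instead of calling abort leaves it to the caller to decide whether to exit, log, or build an error message.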

Related

How could I do the sum of all values of a nested hash?

I have a nested hash like this
Aranea={
"Aranéomorphes"=>{
"Agelenidae"=>[80,1327],
"Amaurobiidae"=>[49,270],
"Ammoxenidae"=>[4,18],
"Anapidae"=>[58,233],
"Anyphaenidae"=>[56,572],
"Araneidae"=>[175,3074],
"Archaeidae"=>[5,90],
"Arkydiae"=>[2,38],
"Austrochilidae"=>[3,10],
"Caponiidae"=>[18,119],
"Cheiracanthiidae"=>[12,353],
"Cithaeronidae"=>[2,8],
"Clubionidae"=>[16,639],
"Corinnidae"=>[68,489],
"Ctenidae"=>[48,519],......
For each key (spider family), the array represents [number of genera, number of species].
I would like to get the sum of all the first elements, i.e. the total number of genera.
I tried different things without success like :
genre = []
#total = genre.transpose.map {|x| x.reduce(:+)}
Or....
def sum_deeply(h)
h.values.inject(0) { |m, v|
m + (Hash === v[0] ? sum_deeply(v[0]) : v[0].to_i)
}
end
puts sum_deeply(Aranea)
But none of them works: with transpose I get a "no implicit conversion" error.
Could anyone enlighten me on this? Thanks
Update (08.07.2020): solution found with
families = Aranea
num_genders = families.flat_map do |_family_name, species_hash|
  num_genders, _num_species = species_hash.values.transpose
  num_genders
end
num_genders.inject(:+)
Thanks to Kache for his help on this.
This should do what you want:
families = Aranea
num_genders = families.flat_map do |_family_name, species_hash|
num_genders, _num_species = species_hash.values.transpose
num_genders
end
num_genders.inject(:+)
Just a tip: splitting out the "data extraction" and "data processing" (i.e. accessing the num_genders value vs summing them) will make your code easier to follow.
I don't think there'll be any part of the above that you won't understand, but if there is, just let me know what parts you'd like to have explained.
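For comparison, here is a sketch of the same idea using Enumerable#sum (available since Ruby 2.4) instead of transpose. The families hash is a three-entry stand-in for the Aranea constant, and the variable names total_genera and total_species are my own.

```ruby
# Small stand-in for the Aranea constant from the question.
families = {
  'Aranéomorphes' => {
    'Agelenidae'   => [80, 1327],
    'Amaurobiidae' => [49, 270],
    'Ammoxenidae'  => [4, 18]
  }
}

# Each innermost value is [number of genera, number of species];
# the block parameters destructure that pair.
total_genera = families.sum do |_group, fams|
  fams.values.sum { |genera, _species| genera }
end

total_species = families.sum do |_group, fams|
  fams.values.sum { |_genera, species| species }
end

puts total_genera
puts total_species
```

This avoids building the intermediate transposed arrays, at the cost of walking the data once per total.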

Ruby: Extract elements from deeply nested JSON structure based on criteria

I want to extract every marketId from every market that has marketName == 'Moneyline'. I tried a few combinations of .map, .reject, and/or .select but can't narrow it down, as the complicated structure is confusing me.
There are many markets in events, and there are many events as well. A sample of the structure (tried to edit it for brevity):
{"currencyCode"=>"GBP",
 "eventTypes"=>[
  {"eventTypeId"=>6423,
   "eventNodes"=>[
    {"eventId"=>28017227,
     "event"=>
      {"eventName"=>"Philadelphia @ Seattle"
      },
     "marketNodes"=>[
      {"marketId"=>"1.128274650",
       "description"=>
        {"marketName"=>"Moneyline"}
      },
      {"marketId"=>"1.128274625",
       "description"=>
        {"marketName"=>"Winning Margin"}
      }]},
    {"eventId"=>28018251,
     "event"=>
      {"eventName"=>"Arkansas @ Mississippi State"
      },
     "marketNodes"=>[
      {"marketId"=>"1.128299882",
       "description"=>
        {"marketName"=>"Under/Over 60.5pts"}
      },
      {"marketId"=>"1.128299881",
       "description"=>
        {"marketName"=>"Moneyline"}
      }]},
    {"eventId"=> etc....
Tried all kinds of things, for example,
markets = json["eventTypes"].first["eventNodes"].map {|e| e["marketNodes"].map { |e| e["marketId"] } if (e["marketNodes"].map {|e| e["marketName"] == 'Moneyline'})}
markets.flatten
# => yields every marketId not every marketId with marketName of 'Moneyline'
Getting a simple array with every marketId from Moneyline markets with no other information is sufficient. Using Rails methods is fine too if preferred.
Sorry if my editing messed up the syntax. Here's the source. It looks like this only with => instead of : after parsing the JSON.
Thank you!
I love nested maps and selects :D
require 'json'
hash = JSON.parse(File.read('data.json'))
moneyline_market_ids = hash["eventTypes"].map{|type|
type["eventNodes"].map{|node|
node["marketNodes"].select{|market|
market["description"]["marketName"] == 'Moneyline'
}.map{|market| market["marketId"]}
}
}.flatten
puts moneyline_market_ids.join(', ')
#=> 1.128255531, 1.128272164, 1.128255516, 1.128272159, 1.128278718, 1.128272176, 1.128272174, 1.128272169, 1.128272148, 1.128272146, 1.128255464, 1.128255448, 1.128272157, 1.128272155, 1.128255499, 1.128272153, 1.128255484, 1.128272150, 1.128255748, 1.128272185, 1.128278720, 1.128272183, 1.128272178, 1.128255729, 1.128360712, 1.128255371, 1.128255433, 1.128255418, 1.128255403, 1.128255387
Just for fun, here's another possible answer, this time with regexen. It is shorter but might break depending on your input data, since it relies on every marketId being paired, in order, with exactly one marketName. It reads the JSON data directly as a String:
json = File.read('data.json')
market_ids = json.scan(/(?<="marketId":")[\d\.]+/)
market_names = json.scan(/(?<="marketName":")[^"]+/)
moneyline_market_ids = market_ids.zip(market_names).select{|id,name| name=="Moneyline"}.map{|id,_| id}
puts moneyline_market_ids.join(', ')
#=> 1.128255531, 1.128272164, 1.128255516, 1.128272159, 1.128278718, 1.128272176, 1.128272174, 1.128272169, 1.128272148, 1.128272146, 1.128255464, 1.128255448, 1.128272157, 1.128272155, 1.128255499, 1.128272153, 1.128255484, 1.128272150, 1.128255748, 1.128272185, 1.128278720, 1.128272183, 1.128272178, 1.128255729, 1.128360712, 1.128255371, 1.128255433, 1.128255418, 1.128255403, 1.128255387
It outputs the same result as the other answer.
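The nested-map answer can also be flattened level by level with flat_map, and Hash#dig (Ruby 2.3+) guards against a market that has no description. A minimal sketch, where data is a trimmed-down stand-in I made up from the sample in the question rather than the real feed:

```ruby
# Trimmed-down stand-in for the parsed JSON from the question.
data = {
  'eventTypes' => [
    { 'eventNodes' => [
      { 'marketNodes' => [
        { 'marketId' => '1.128274650', 'description' => { 'marketName' => 'Moneyline' } },
        { 'marketId' => '1.128274625', 'description' => { 'marketName' => 'Winning Margin' } }
      ] },
      { 'marketNodes' => [
        { 'marketId' => '1.128299882', 'description' => { 'marketName' => 'Under/Over 60.5pts' } },
        { 'marketId' => '1.128299881', 'description' => { 'marketName' => 'Moneyline' } }
      ] }
    ] }
  ]
}

moneyline_market_ids = data['eventTypes']
  .flat_map { |type| type['eventNodes'] }     # all events, one flat list
  .flat_map { |node| node['marketNodes'] }    # all markets, one flat list
  .select   { |market| market.dig('description', 'marketName') == 'Moneyline' }
  .map      { |market| market['marketId'] }

puts moneyline_market_ids.join(', ')
```

Each flat_map removes one level of nesting, so no final flatten is needed.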

How to build an array comprised of two others using only particular elements of each?

I'm writing a little program to generate some bogus top-ten sales numbers for book sales. I'm trying to do this in as compact a fashion as possible and do it without using MySQL or another DB.
I have written out what I want to happen. I've created a bogus catalog array and a bogus sales array corresponding sales to the index of the catalog entries. That part all works great.
I want to create a third array that includes all the titles from the catalog array with the sales numbers from the sales array, like a join in a DB, but without any DB. I can't figure out how to do that part of it though. I think once I have it in there I can sort it the way I want it, but making that third array is killing me. I cannot figure out what I'm doing wrong or how to do it right.
So given the following code:
require 'random_word'
class BestOnline
  def initialize
    @catalog = Array.new
    @sales = Array.new
    @topten = Array.new
    inventory = rand(50) + 10
    days = rand(1..50)
    now = Time.now
    yesterday = now - 86400
    saleshistory = now - (days * 86400)
    (1..inventory).each do
      @catalog << {
        :title => "#{RandomWord.adjs.next.capitalize} #{RandomWord.nouns.next.capitalize}",
        :price => rand(5.99..29.99).round(2)}
    end
    (0..days).each do
      @sales << {
        # exclusive range, so :id is always a valid index into @catalog
        :id => rand(0...@catalog.count),
        :salescount => rand(0..24),
        :date => rand(saleshistory..now) }
    end
  end
  def bestsellers
    @sales.each do
      # THIS DOESNT WORK AND I'M STUCK AS HOW TO FIX IT.
      # @topten << {
      # :title => @catalog[:id],
      # :salescount => @sales[:salescount]
      # }
    end
    puts @topten.group_by{ |tt| tt[:salescount]}.sort_by{ |k,v| -k}.first(10)
  end
end
BestOnline.new.bestsellers
How can I create a third array that contains the titles and number of sales and output the result of the top-ten books sold?
Try this out:
def bestsellers
  @sales.each do |sale|
    @topten << {
      title: @catalog[sale[:id]][:title],
      salescount: sale[:salescount] }
  end
  @topten.sort! { |x, y| y[:salescount] <=> x[:salescount] }
  puts @topten.first(10)
end
I suggest you write:
def bestsellers(sales)
  sales.max_by(10) { |h| h[:salescount] }
end
puts bestsellers(sales)
Enumerable#max_by was permitted to have an argument in Ruby v2.2.
There are several problems with the way you've structured your code. Now that you have running code (by incorporating @fbonds66's answer), I suggest you post it at SO's sister-site Code Review. The purpose of CR is to suggest improvements to working code. If you read through some of the questions and answers there I think you will be impressed.
I was doing the dereferencing wrong trying to build the 3rd array of the 1st two:
@sales.each do |sale|
  @topten << {
    :title => @catalog[sale[:id]][:title],
    :salescount => sale[:salescount]
  }
end
I needed to work on the hash returned from .each as |sale| and use correct syntax to get what I was after from the other arrays.
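If the same book can appear in several sales records, the answers above list it once per record. As a variation, here is a sketch that first totals the sales per title and then takes the top ten. The catalog and sales arrays are tiny hand-written stand-ins for the randomly generated ones, and totalling per title is my assumption about what a "top ten" should mean, not something stated in the question.

```ruby
# Hand-written stand-ins for the generated @catalog and @sales arrays.
catalog = [
  { title: 'Red Fox',   price: 9.99 },
  { title: 'Blue Moon', price: 14.50 },
  { title: 'Green Sea', price: 7.25 }
]
sales = [
  { id: 0, salescount: 5 },
  { id: 2, salescount: 12 },
  { id: 0, salescount: 3 },
  { id: 1, salescount: 7 }
]

# "Join" each sale to its catalog entry by index and add up the counts per title.
totals = sales.each_with_object(Hash.new(0)) do |sale, sums|
  sums[catalog[sale[:id]][:title]] += sale[:salescount]
end

# max_by(n) (Ruby 2.2+) returns the n largest entries, largest first.
top_ten = totals.max_by(10) { |_title, count| count }
                .map { |title, count| { title: title, salescount: count } }

puts top_ten
```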

how to fetch value from an array and check its range in rails

I have included the code below and it is working fine:
def check_ip
start = IPAddr.new("10.10.0.10").to_i
last = IPAddr.new("20.10.10.16").to_i
begin
ip_pool = IpPool.pluck(:start_ip, :end_ip)
# [["10.10.10.12", "10.10.10.15"], ["192.168.1.13", "192.168.1.13"]]
low = IPAddr.new("10.10.10.12").to_i
high = IPAddr.new("10.10.10.15").to_i
# it will check so on with ["192.168.1.13", "192.168.1.13"] values too
raise ArgumentError, I18n.t('errors.start') if ((low..high)===start or (low..high)===last)
end
rescue ArgumentError => msg
self.errors.add(:start, msg)
return false
end
return true
end
Please guide me on how to implement this without hard-coding values such as IPAddr.new("10.10.10.12").to_i. Right now low and high are static values copied from the array; I want to take them dynamically from the values I fetch into ip_pool.
Since you have an array of low/high, you probably want to check all items in it:
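begin
  IpPool.pluck(:start_ip, :end_ip).each do |(low, high)|
    # pluck returns strings, so convert them before comparing with the integers
    range = IPAddr.new(low).to_i..IPAddr.new(high).to_i
    raise ArgumentError, I18n.t('errors.start') \
      if range === start || range === last
  end
  true
rescue ArgumentError => msg
  self.errors.add(:start, msg)
  false
end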
Please note that I have cleaned the code up a bit:
removed superfluous returns;
corrected the begin-rescue clause (there was a superfluous end right before rescue, which actually attached the rescue to the whole function body).
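Outside of Rails, the range check itself can be tried with nothing but the standard library. In this sketch the helper name ip_in_pools? is hypothetical, and pools is a plain array in the same shape that IpPool.pluck(:start_ip, :end_ip) returns in the question.

```ruby
require 'ipaddr'

# Hypothetical helper: true if ip falls inside any [start_ip, end_ip] pair.
def ip_in_pools?(ip, pools)
  value = IPAddr.new(ip).to_i
  pools.any? do |low, high|
    (IPAddr.new(low).to_i..IPAddr.new(high).to_i).cover?(value)
  end
end

# Same shape as the pluck result shown in the question.
pools = [['10.10.10.12', '10.10.10.15'], ['192.168.1.13', '192.168.1.13']]

puts ip_in_pools?('10.10.10.13', pools)
puts ip_in_pools?('10.10.10.16', pools)
```

Using any? stops at the first matching pool, so no exception is needed for control flow; the validation can add its error when the helper returns true.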

Fastest and most effective way of comparing two array of hashes of different format

I have two arrays of hashes with the format:
hash1
[{:root => root_value, :child1 => child1_value, :subchild1 => subchild1_value, :bases => [hit1, hit2, hit3]}...]
hash2
[{:path => root_value/child1_value/subchild1_value, :hit1_exist => t, :hit2_exist => t, :hit3_exist => f}...]
If I do this:
def sample
results = nil
project = Project.find(params[:project_id])
testrun_query = "SELECT root_name, suite_name, case_name, ic_name, executed_platforms FROM testrun_caches WHERE start_date >= '#{params[:start_date]}' AND start_date < '#{params[:end_date]}' AND project_id = #{params[:project_id]} AND result <> 'SKIP' AND result <> 'N/A'"
if !params[:platform].nil? && params[:platform] != [""]
#yell_and_log "platform not nil"
platform_query = nil
params[:platform].each do |platform|
if platform_query.nil?
platform_query = " AND (executed_platforms LIKE '%#{platform.to_s},%'"
else
platform_query += " OR executed_platforms LIKE '%#{platform.to_s},%'"
end
end
testrun_query += platform_query + ")"
end
if !params[:location].nil? &&!params[:location].empty?
#yell_and_log "location not nil"
testrun_query += " AND location LIKE '#{params[:location].to_s}%'"
end
testrun_query += " GROUP BY root_name, suite_name, case_name, ic_name, executed_platforms ORDER BY root_name, suite_name, case_name, ic_name"
ic_query = "SELECT ics.path, memberships.pts8210, memberships.sv6, memberships.sv7, memberships.pts14k, memberships.pts22k, memberships.pts24k, memberships.spb32, memberships.spb64, memberships.sde, projects.name FROM ics INNER JOIN memberships on memberships.ic_id = ics.id INNER JOIN test_groups ON test_groups.id = memberships.test_group_id INNER JOIN projects ON test_groups.project_id = projects.id WHERE deleted = 'false' AND (memberships.pts8210 = true OR memberships.sv6 = true OR memberships.sv7 = true OR memberships.pts14k = true OR memberships.pts22k = true OR memberships.pts24k = true OR memberships.spb32 = true OR memberships.spb64 = true OR memberships.sde = true) AND projects.name = '#{project.name}' GROUP BY path, memberships.pts8210, memberships.sv6, memberships.sv7, memberships.pts14k, memberships.pts22k, memberships.pts24k, memberships.spb32, memberships.spb64, memberships.sde, projects.name ORDER BY ics.path"
if params[:ic_type] == "never_run"
runtest = TestrunCache.connection.select_all(testrun_query)
alltest = TrsIc.connection.select_all(ic_query)
(alltest.length).times do |i|
#exec_pltfrm = test['executed_platforms'].split(",")
unfinishedtest = comparison(runtest[i],alltest[i])
yell_and_log("test = #{unfinishedtest}")
yell_and_log("#{runtest[i]}")
yell_and_log("#{alltest[i]}")
end
end
end
I get in my log:
test = true
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"cli", "case_name"=>"functional", "ic_name"=>"cli_sanity_test", "executed_platforms"=>"pts22k,pts24k,sv7,"}
array of hash 2 = {"path"=>"BSDPLATFORM/cli/functional/cli_sanity_test", "pts8210"=>"f", "sv6"=>"f", "sv7"=>"t", "pts14k"=>nil, "pts22k"=>"t", "pts24k"=>"t", "spb32"=>nil, "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
test = false
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"infrastructure", "case_name"=>"bypass_pts14k_copper", "ic_name"=>"ic_packet_9", "executed_platforms"=>"sv6,"}
array of hash 2 = {"path"=>"BSDPLATFORM/infrastructure/build/copyrights", "pts8210"=>"f", "sv6"=>"t", "sv7"=>"t", "pts14k"=>"f", "pts22k"=>"t", "pts24k"=>"t", "spb32"=>"f", "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
test = false
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"infrastructure", "case_name"=>"bypass_pts14k_copper", "ic_name"=>"ic_status_1", "executed_platforms"=>"sv6,"}
array of hash 2 = {"path"=>"BSDPLATFORM/infrastructure/build/ic_1", "pts8210"=>"f", "sv6"=>"t", "sv7"=>"t", "pts14k"=>"f", "pts22k"=>"t", "pts24k"=>"t", "spb32"=>"f", "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
test = false
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"infrastructure", "case_name"=>"bypass_pts14k_copper", "ic_name"=>"ic_status_2", "executed_platforms"=>"sv6,"}
array of hash 2 = {"path"=>"BSDPLATFORM/infrastructure/build/ic_files", "pts8210"=>"f", "sv6"=>"t", "sv7"=>"f", "pts14k"=>"f", "pts22k"=>"t", "pts24k"=>"t", "spb32"=>"f", "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
So only the first pair matches; the rest are different records, and I get a result of one instead of 4230.
I would like some way to match by path and root/suite/case/ic and then compare the executed platforms passed in array of hashes 1 vs platforms set to true in array of hash2
Not sure if this is fastest, and I wrote this based on your original question that didn't provide sample code, but:
def compare(h1, h2)
(h2[:path] == "#{h1[:root]}/#{h1[:child1]}/#{h1[:subchild1]}") && \
(h2[:hit1_exist] == ((h1[:bases][0] == nil) ? 'f' : 't')) && \
(h2[:hit2_exist] == ((h1[:bases][1] == nil) ? 'f' : 't')) && \
(h2[:hit3_exist] == ((h1[:bases][2] == nil) ? 'f' : 't'))
end
def compare_arr(h1a, h2a)
(h1a.length).times do |i|
compare(h1a[i],h2a[i])
end
end
Test:
require "benchmark"
h1a = []
h2a = []
def rstr
# from http://stackoverflow.com/a/88341/178651
(0...2).map{65.+(rand(26)).chr}.join
end
def rnil
rand(2) > 0 ? '' : nil
end
10000.times do
h1a << {:root => rstr(), :child1 => rstr(), :subchild1 => rstr(), :bases => [rnil,rnil,rnil]}
h2a << {:path => "#{rstr()}/#{rstr()}/#{rstr()}", :hit1_exist => 't', :hit2_exist => 't', :hit3_exist => 'f'}
end
Benchmark.measure do
compare_arr(h1a,h2a)
end
Results:
=> 0.020000 0.000000 0.020000 ( 0.024039)
Now that I'm looking at your code, I think it could be optimized by removing array creations, and splits and joins which are creating arrays and strings that need to be garbage collected which also will slow things down, but not by as much as you mention.
Your database queries may be slow. Run explain/analyze or similar on them to see why each is slow, optimize/reduce your queries, add indexes where needed, etc. Also, check cpu and memory utilization, etc. It might not just be the code.
But, there are some definite things that need to be fixed. You also have several risks of SQL injection attack, e.g.:
... start_date >= '#{params[:start_date]}' AND start_date < '#{params[:end_date]}' AND project_id = #{params[:project_id]} ...
Anywhere that params and variables are put directly into the SQL may be a danger. You'll want to make sure to use prepared statements or at least SQL escape the values. Read this all the way through: http://guides.rubyonrails.org/active_record_querying.html
(elements_being_tested.flat_map do |el|
  hash_array_1.zip(hash_array_2).reject do |x, y|
    x[el] == y[el]
  end
end).each {|x, y| puts(x[:bases] | y[:bases])}
Enumerate the hash elements to test.
elements_being_tested.flat_map do |el|
Then pair up the two hash arrays with zip and iterate through the pairs, comparing the two hashes on the element chosen by the outer loop and rejecting the pairs that are equal. (The == may actually need to be != but you can figure that much out)
  hash_array_1.zip(hash_array_2).reject do |x, y|
    x[el] == y[el]
  end
Finally, for each remaining pair, print the set union of their bases.
.each {|x, y| puts(x[:bases] | y[:bases])}
You may need to test the code. It's not meant for production so much as demonstration because I wasn't sure I read your code right. Please post a larger sample of the source including the data structures in question if this answer is unsatisfactory.
Regarding speed: if you're iterating through a large data set and comparing multiple there's probably nothing you can do. Perhaps you can invert the loops I presented and make the hash arrays the outer loop. You're not going to get lightning speed here in Ruby (really any language) if the data structure is large.
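Since the two result sets are not in the same order, comparing them index by index cannot work. One way to do the matching by path that the question asks for is to index one side in a hash first. The following is only a sketch: the two rows are modelled on the log lines above but altered (sv7 removed from executed_platforms) so that there is something to report, and the names PLATFORMS, executed_by_path and never_run are my own.

```ruby
# Platform columns from the memberships table in the question's ic_query.
PLATFORMS = %w[pts8210 sv6 sv7 pts14k pts22k pts24k spb32 spb64 sde].freeze

# Rows shaped like the select_all results in the log (values altered for the demo).
run_tests = [
  { 'root_name' => 'BSDPLATFORM', 'suite_name' => 'cli', 'case_name' => 'functional',
    'ic_name' => 'cli_sanity_test', 'executed_platforms' => 'pts22k,pts24k,' }
]
all_tests = [
  { 'path' => 'BSDPLATFORM/cli/functional/cli_sanity_test',
    'sv6' => 'f', 'sv7' => 't', 'pts22k' => 't', 'pts24k' => 't' }
]

# Index executed platforms by path, so each lookup is a hash access instead of a scan.
executed_by_path = Hash.new { |h, k| h[k] = [] }
run_tests.each do |row|
  path = row.values_at('root_name', 'suite_name', 'case_name', 'ic_name').join('/')
  executed_by_path[path] |= row['executed_platforms'].split(',')
end

# For every known test, keep the platforms that are enabled ('t') but were never executed.
never_run = all_tests.each_with_object({}) do |row, result|
  enabled = PLATFORMS.select { |platform| row[platform] == 't' }
  missing = enabled - executed_by_path.fetch(row['path'], [])
  result[row['path']] = missing unless missing.empty?
end

p never_run
```

Building the index is one pass over the run results and the comparison is one pass over the full list, so the work grows linearly with the number of rows rather than with their product.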

Resources