How to read CSV data into a hash [duplicate] - arrays

This question already has answers here:
Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])
(4 answers)
Closed 2 years ago.
Given this CSV file:
date,name,st,code,num
2020-03-25,AB,53,2585,130
2020-03-26,AB,53,3208,151
2020-03-26,BA,35,136,1
2020-03-27,BA,35,191,1
I want to create the following hash with the given data:
{:AB=>[["2020-03-25", "2585"], ["2020-03-26", "3208"]], :BA=>[["2020-03-26", "136"], ["2020-03-27", "191"]]}
I tried this:
require 'csv'
h=Hash.new([])
CSV.foreach('file.csv', headers: true) do |row|
  h[row['st']] << [[row['date'], row['code']]]
end
but all I get is an empty hash h.

Let's first create the CSV file.
str =<<~_
date,name,st,code,num
2020-03-25,AB,53,2585,130
2020-03-26,AB,53,3208,151
2020-03-26,BA,35,136,1
2020-03-27,BA,35,191,1
_
FName = 't'
File.write(FName, str)
#=> 120
Now we can simply read the file line-by-line, using CSV::foreach, which, without a block, returns an enumerator, and build the hash as we go along.
require 'csv'
CSV.foreach(FName, headers: true).
  with_object(Hash.new { |h,k| h[k] = [] }) do |row,h|
    h[row['name'].to_sym] << [row['date'], row['code']]
  end
  #=> {:AB=>[["2020-03-25", "2585"], ["2020-03-26", "3208"]],
  #    :BA=>[["2020-03-26", "136"], ["2020-03-27", "191"]]}
I've used the method Hash::new with a block to create a hash h such that, if h does not have a key k, evaluating h[k] first executes h[k] = [] and returns that new empty array. That way, h[k] << 123, when h has no key k, results in h[k] #=> [123].
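To make the difference concrete, here is a small sketch contrasting the two ways of giving a hash a default. The variable names (shared, fresh and so on) are illustrative only and do not appear in the question.

```ruby
# Shared default: every missing key returns the SAME array, and nothing is stored.
shared = Hash.new([])
shared[:a] << 1
shared[:b] << 2
shared_keys    = shared.keys   # no key was ever assigned
shared_default = shared[:zzz]  # the single default array, mutated twice

# Block default: the block runs on a miss and stores a fresh array under the key.
fresh = Hash.new { |hash, key| hash[key] = [] }
fresh[:a] << 1
fresh[:b] << 2

p shared_keys     # []
p shared_default  # [1, 2]
p fresh           # two keys, each with its own array
```

This is why the hash in the question looks empty: the pushes all land in the one default array, which is never assigned to any key.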
Alternatively, one could write:
CSV.foreach(FName, headers: true).with_object({}) do |row,h|
  (h[row['name'].to_sym] ||= []) << [row['date'], row['code']]
end
One could also use a converter to convert the values of name to symbols, but some might see that as over-kill here:
CSV.foreach(FName, headers: true,
  converters: [->(v) { v.match?(/\p{Alpha}+/) ? v.to_sym : v }] ).
  with_object(Hash.new { |h,k| h[k] = [] }) do |row,h|
    h[row['name']] << [row['date'], row['code']]
  end
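To see what the converter does to individual fields, the sketch below parses a shortened copy of the data from a string instead of a file. The names csv_text, symbolize_alpha and rows are assumptions made for this illustration.

```ruby
require 'csv'

# Shortened, in-memory copy of the sample data (no file needed).
csv_text = <<~TEXT
  date,name,st,code,num
  2020-03-25,AB,53,2585,130
  2020-03-26,BA,35,136,1
TEXT

# Same idea as the converter above: fields containing a letter become symbols.
symbolize_alpha = ->(v) { v.match?(/\p{Alpha}+/) ? v.to_sym : v }

rows = CSV.parse(csv_text, headers: true, converters: [symbolize_alpha])
first_row = rows[0]

p first_row['name']  # a Symbol, because "AB" contains letters
p first_row['date']  # still a String, because it contains no letters
```

Note that the converter is applied to every field, not only to name, so it is only safe when no other column can contain letters.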

There is no need to read the CSV file as plain text. You can use CSV as you intended and fix the actual issues at hand.
There are three issues here:
1) This won't work:
h = Hash.new([])
Use this instead:
h = Hash.new {|h, k| h[k] = [] }
See "Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])", as @jack commented.
2) You need headers: true because the first row of your file is a header row.
3) Your code only pushes onto the default array and never assigns it to a key, which is why h stays empty. Assign the result back to the key, like this (with the block form above the block already performs the assignment, so this is redundant but harmless):
h[row['name']] = h[row['name']] << [row['date'], row['code']]
This will work for you:
require 'csv'
h = Hash.new { |h, k| h[k] = [] }
CSV.foreach('file.csv', headers: true) do |row|
  h[row['name']] = h[row['name']] << [row['date'], row['code']]
end
h.transform_keys(&:to_sym)
#=> {:AB=>[["2020-03-25", "2585"], ["2020-03-26", "3208"]], :BA=>[["2020-03-26", "136"], ["2020-03-27", "191"]]}

Related

Get related articles based on tags in Ruby

I’m trying to display a related section based on the article’s tags. Any articles that have similar tags should be displayed.
The idea is to iterate the article’s tags and see if any other articles have those tags.
If yes, then add that article to a related = [] array of articles I can retrieve later.
Article A: tags: [chris, mark, scott]
Article B: tags: [mark, scott]
Article C: tags: [alex, mike, john]
Article A has as related the Article B and vice-versa.
Here’s the code:
files = Dir[ROOT + 'articles/*']
# parse file
def parse(fn)
  res = meta(fn)
  res[:body] = PandocRuby.new(body(fn), from: 'markdown').to_html
  res[:pagedescription] = res[:description]
  res[:taglist] = []
  if res[:tags]
    res[:tags] = res[:tags].map do |x|
      res[:taglist] << '%s' % [x, x]
      '%s' % [x, x]
    end.join(', ')
  end
  res
end
# get related articles
def related_articles(articles)
  related = []
  articles[:tags].each do |tag|
    articles.each do |item|
      if item[:tags] != nil && item[:tags].include?(tag)
        related << item unless articles.include?(item)
      end
    end
  end
  related
end
articles = files.map {|fn| parse(fn)}.sort_by {|x| x[:date]}
articles = related_articles(articles)
Throws this error:
no implicit conversion of Symbol into Integer (TypeError)
Another thing I tried was this:
# To generate related articles
def related_articles(articles)
  related = []
  articles.each do |article|
    article[:tags].each do |tag|
      articles.each do |item|
        if item[:tags] != nil && item[:tags].include?(tag)
          related << item unless articles.include?(item)
        end
      end
    end
  end
  related
end
But now the error says:
undefined method `each' for "tagname":String (NoMethodError)
Help a Ruby noob? What am I doing wrong? Thanks!
As an aside to the main question, I tried rewriting the tag section of the code, but still no luck:
res[:taglist] = []
if res[:tags]
  res[:tags] = res[:tags].map do |x|
    res[:taglist] << '' + x + ''
    '' + x + ''
  end.join(', ')
end
In your first attempt, the problem is articles[:tags]. articles is an array, and an array can only be indexed with integers, not with a symbol key.
The second attempt fails because article[:tags] is a string: in parse you take the original tags, transform them to HTML and then join them. The :taglist key still contains an array, so you could use that instead.
Finally, the "related" array should be per-article, so neither implementation could solve your issue, as both return a single array for the whole set of articles.
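Both error messages can be reproduced in isolation. The following sketch uses a made-up one-element articles array whose :tags value has already been joined into a string, as parse does in the question.

```ruby
# Hypothetical parsed data: :tags is already a joined String, not an Array.
articles = [{ tags: 'mark, scott' }]

# First attempt: indexing an Array with a Symbol.
first_error =
  begin
    articles[:tags]
  rescue TypeError => e
    e.class
  end

# Second attempt: calling each on a String.
second_error =
  begin
    articles.first[:tags].each { |tag| tag }
  rescue NoMethodError => e
    e.class
  end

p first_error   # TypeError
p second_error  # NoMethodError
```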
You probably need two passes:
def parse(fn)
  res = meta(fn)
  res[:body] = PandocRuby.new(body(fn), from: 'markdown').to_html
  res[:pagedescription] = res[:description]
  res[:tags] ||= [] # and don't touch it
  res[:tags_as_links] = res[:tags].map { |x| "#{x}" }
  res[:tags_as_string] = res[:tags_as_links].join(', ')
  res
end
articles = files.map { |fn| parse(fn) }
# convert each article into a hash like
# {tag1 => [self], tag2 => [self]}
# and then reduce by merge
taggings = articles
  .map { |a| a[:tags].product([[a]]).to_h }
  .reduce { |a, b| a.merge(b) { |_, v1, v2| v1 | v2 } }
# now read them back into the articles
articles.each do |article|
  article[:related] = article[:tags]
    .flat_map { |tag| taggings[tag] }
    .uniq
  # remove the article itself
  article[:related] -= [article]
end
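The two-pass idea can be tried without any files or Pandoc. The sketch below uses the three example articles from the question as plain hashes. The :title key and the related_titles variable are assumptions added for illustration, and the tag index is built with a default-block hash rather than the map/reduce shown above.

```ruby
# Stand-ins for parsed articles; only :title and :tags matter here.
articles = [
  { title: 'A', tags: %w[chris mark scott] },
  { title: 'B', tags: %w[mark scott] },
  { title: 'C', tags: %w[alex mike john] }
]

# Pass 1: index every tag to the articles that carry it.
taggings = Hash.new { |h, k| h[k] = [] }
articles.each do |article|
  article[:tags].each { |tag| taggings[tag] << article }
end

# Pass 2: per article, collect all other articles sharing at least one tag.
related_titles = articles.to_h do |article|
  related = article[:tags].flat_map { |tag| taggings[tag] }.uniq - [article]
  [article[:title], related.map { |r| r[:title] }]
end

p related_titles  # A and B point at each other, C has no related articles
```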

Iterate over array of objects. Then access object method if correct one is found. Otherwise create a new object in the array

I start with an empty array, and a Hash of key, values.
I would like to iterate over the Hash and compare it against the empty array. If the value for each k,v pair doesn't already exist in the array, I would like to create an object with that value and then access an object method to append the key to an array inside the object.
This is my code
class Test
  def initialize(name)
    @name = name
    @values = []
  end
  attr_accessor :name
  def values=(value)
    @values << value
  end
  def add(value)
    @values.push(value)
  end
end
l = []
n = {'server_1': 'cluster_x', 'server_2': 'cluster_y', 'server_3': 'cluster_z', 'server_4': 'cluster_x', 'server_5': 'cluster_y'}
n.each do |key, value|
  l.any? do |a|
    if a.name == value
      a.add(key)
    else
      t = Test.new(value)
      t.add(key)
      l << t
    end
  end
end
p l
I would expect to see this:
[
#<Test:0x007ff8d10cd3a8 @name=:cluster_x, @values=["server_1, server_4"]>,
#<Test:0x007ff8d10cd2e0 @name=:cluster_y, @values=["server_2, server_5"]>,
#<Test:0x007ff8d10cd1f0 @name=:cluster_z, @values=["server_3"]>
]
Instead I just get an empty array.
I think that the condition if a.name == value is not being met and then the add method isn't being called.
@Cyzanfar gave me a clue as to what to look for, and I found the answer here
https://stackoverflow.com/a/34904864/5006720
n.each do |key, value|
  found = l.detect {|e| e.name == value}
  if found
    found.add(key)
  else
    t = Test.new(value)
    t.add(key)
    l << t
  end
end
@ARL you're almost there! The last thing you need to consider is the case where found is actually an object, since detect will find a matching one at some point.
n.each do |key, value|
  found = l.detect {|e| e.name == value}
  if found
    found.add(key)
  else
    t = Test.new(value)
    t.add(key)
    l << t
  end
end
You only want to add a new instance of Test when found is nil. This code should yield your desired output:
[
#<Test:0x007ff8d10cd3a8 @name=:cluster_x, @values=["server_1, server_4"]>,
#<Test:0x007ff8d10cd2e0 @name=:cluster_y, @values=["server_2, server_5"]>,
#<Test:0x007ff8d10cd1f0 @name=:cluster_z, @values=["server_3"]>
]
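Here is a self-contained sketch of the find-or-create loop that can be run as-is. The class name Cluster and the summary variable are stand-ins chosen for this example, not part of the question's code.

```ruby
# Hypothetical stand-in for the Test class, with readers for inspection.
class Cluster
  attr_reader :name, :values

  def initialize(name)
    @name = name
    @values = []
  end

  def add(value)
    @values << value
  end
end

servers = { server_1: 'cluster_x', server_2: 'cluster_y', server_3: 'cluster_z',
            server_4: 'cluster_x', server_5: 'cluster_y' }

clusters = []
servers.each do |server, cluster_name|
  # detect returns the first matching object, or nil when there is none.
  found = clusters.detect { |c| c.name == cluster_name }
  unless found
    found = Cluster.new(cluster_name)
    clusters << found
  end
  found.add(server)
end

summary = clusters.map { |c| [c.name, c.values] }
p summary
```

Because the hash keys are written as 'server_1': they are symbols, so the collected values are symbols such as :server_1 rather than strings.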
I observe two things in your code:
def values=(value)
  @values << value
def add(value)
  @values.push(value)
First, these two methods do the same thing, pushing a value, since << on an array appends just like push.
Second, you have changed the meaning of values=, a name usually reserved for a setter method, equivalent to attr_writer :values.
Just to illustrate that there are many ways to do things in Ruby, I propose the following :
class Test
  def initialize(name, value)
    @name = name
    @values = [value]
  end
  def add(value)
    @values << value
  end
end
h_cluster = {} # intermediate hash whose key is the cluster name
n = {'server_1': 'cluster_x', 'server_2': 'cluster_y', 'server_3': 'cluster_z',
     'server_4': 'cluster_x', 'server_5': 'cluster_y'}
n.each do | server, cluster |
  puts "server=#{server}, cluster=#{cluster}"
  cluster_found = h_cluster[cluster] # does the key exist ? => nil or Test
                                     # instance with servers list
  puts "cluster_found=#{cluster_found.inspect}"
  if cluster_found
  then # add server to existing cluster
    cluster_found.add(server)
  else # create a new cluster
    h_cluster[cluster] = Test.new(cluster, server)
  end
end
p h_cluster.collect { | cluster, servers | servers }
Execution :
$ ruby -w t.rb
server=server_1, cluster=cluster_x
cluster_found=nil
server=server_2, cluster=cluster_y
cluster_found=nil
server=server_3, cluster=cluster_z
cluster_found=nil
server=server_4, cluster=cluster_x
cluster_found=#<Test:0x007fa7a619ae10 @name="cluster_x", @values=[:server_1]>
server=server_5, cluster=cluster_y
cluster_found=#<Test:0x007fa7a619ac58 @name="cluster_y", @values=[:server_2]>
[#<Test:0x007fa7a619ae10 @name="cluster_x", @values=[:server_1, :server_4]>,
#<Test:0x007fa7a619ac58 @name="cluster_y", @values=[:server_2, :server_5]>,
#<Test:0x007fa7a619aac8 @name="cluster_z", @values=[:server_3]>]

Ruby - Filtering array of hashes based on another array

I am trying to filter an array of hashes based on another array. What's the best way to accomplish this? Here are the 2 brutes I've right now:
x=[1,2,3]
y = [{dis:4,as:"hi"},{dis:2,as:"li"}]
1)
aa = []
x.each do |a|
  qq = y.select{|k,v| k[:dis]==a}
  aa+=qq unless qq.empty?
end
2)
q = []
y.each do |k,v|
  x.each do |ele|
    if k[:dis]==ele
      q << {dis: ele,as: k[:as]}
    end
  end
end
Here's the output I'm intending:
[{dis:2,as:"li"}]
If you want to select only the elements where the value of :dis is included in x:
y.select{|h| x.include? h[:dis]}
You can delete the nonconforming elements of y in place with .keep_if:
> y.keep_if { |h| x.include? h[:dis] }
Or reverse the logic with .delete_if:
> y.delete_if { |h| !x.include? h[:dis] }
All produce:
> y
=> [{:dis=>2, :as=>"li"}]
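The practical difference between select and keep_if is whether y itself is modified. A short sketch using the data from the question (the names selected and size_before are illustrative):

```ruby
x = [1, 2, 3]
y = [{ dis: 4, as: 'hi' }, { dis: 2, as: 'li' }]

# select returns a new array and leaves y untouched.
selected = y.select { |h| x.include?(h[:dis]) }
size_before = y.size

# keep_if removes the non-matching elements from y itself.
y.keep_if { |h| x.include?(h[:dis]) }

p selected     # only the hash whose :dis is in x
p size_before  # 2, y was still complete after select
p y            # now trimmed in place
```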
Yes use select, nonetheless here's another way which works:
y.each_with_object([]) { |hash,obj| obj << hash if x.include? hash[:dis] }

Ruby 2d array to csv?

I have an array. How do I write it to a CSV file correctly?
arr1 = [["A","B"], ["C","D"], ["E","F"], ["G","H"]]
Expected result in csv:
A,B
C,D
E,F
G,H
This is what I do:
out_file = File.open('file.csv', 'w')
arr1.each_index do |inx|
  arr1[inx].each do |val|
    out_file.puts val
  end
end
But it prints everything in one column:
A
B
C
D
..
If I print each value to the console with p val, every value ends with \r:
"A\r"
"B\r"
"C\r"
"D\r"
What do I do wrong?
You are writing each value on its own line instead of writing each row as one CSV line. Use the CSV library:
require 'csv'
CSV.open('file.csv', 'w') do |csv|
  arr1.each { |ar| csv << ar }
end
If all you want is to simply print out the CSV string, then you can do it like this:
csv_string = CSV.generate { |csv| array2d.each { |row| csv << row } }
Here's a worked example:
> # array2d contains the raw data
> csv_string = CSV.generate { |csv| array2d.each { |row| csv << row } }
> puts csv_string
5014,"John O""Neill",4295,1,Finance Plus
314,"Thomas, Duncan",436,2,Finance Plus
1930,Fraser Smith,436,12,Finance Plus
5057,Fred McDonald,436,12,Finance Plus
Note that it handles double-quotes and commas inside strings.
See: https://ruby-doc.org/stdlib-2.6.1/libdoc/csv/rdoc/CSV.html
This works:
require 'csv'
CSV.open('file.csv', 'w') do |csv|
  arr1.each { |ar| csv << ar }
end
But first I had to strip the trailing \r line break from each value:
arr1.each_index do |inx|
  arr1[inx].each do |val|
    val.chop!
  end
end
This also works:
File.write('file.csv', [["A","B"], ["C","D"], ["E","F"], ["G","H"]].map { |e| e.join(",") }.join($/))
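The \r cleanup and the CSV generation can be combined without mutating the input. The sketch below assumes, as the question suggests, that each value carries a stray trailing carriage return. The names clean and csv_string are illustrative, and String#delete_suffix needs Ruby 2.5 or later.

```ruby
require 'csv'

# Hypothetical input with the stray "\r" described in the question.
arr1 = [["A\r", "B\r"], ["C\r", "D\r"]]

# Strip only a trailing "\r"; unlike chop!, this never removes a real last character.
clean = arr1.map { |row| row.map { |val| val.delete_suffix("\r") } }

# Build the CSV text in memory; File.write('file.csv', csv_string) would save it.
csv_string = CSV.generate { |csv| clean.each { |row| csv << row } }

puts csv_string
```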

Finding an element in an array inside a hash

@members = {
  approved: ["Jill"],
  unapproved: ["Daniel"],
  removed: ["John"],
  banned: ["Daniel", "Jane"]
}
Very simply: making a program to track membership. In the above hash you can see the four membership status keys each with an array containing names.
I'm trying to create a find_member method which allows the user to enter a name and then searches each array for the name and tells the user which key the name was found in.
I'm not very good with hashes and in attempting to do this I've created a mess of loops and I imagine there's a very easy solution, I just haven't found it so far. Is there a really simple way to do this?
I've tried a few things and don't have all my past efforts still, but this is the latest mess I've ended up with, which is probably worse than what I had previously:
def find_member
  puts "==Find Member=="
  puts "Name: "
  @name = gets.chomp
  @members.each do |key|
    key.values.each do |array|
      array.each do |element|
        if @name == element
          puts "#{@name} found in #{key}"
        else
          puts "#{@name} not found in #{key}"
        end
      end
    end
  end
end
Thanks.
The most efficient way to do this is to create a one-to-many mapping of names to keys, and update that mapping only when @members changes.
def find_member(name)
  update_names_to_keys_if_necessary
  @member_to_keys[name]
end
def update_names_to_keys_if_necessary
  new_hashcode = @members.hash
  return if @old_members_hashcode == new_hashcode
  @member_to_keys = @members.each_with_object(Hash.new { |h,k| h[k] = [] }) { |(k,v),h|
    v.each { |name| h[name] << k } }
  @old_members_hashcode = new_hashcode
end
Note that @old_members_hashcode evaluates to nil the first time update_names_to_keys_if_necessary is called, so @member_to_keys will be created at that time.
Initially we obtain
@member_to_keys
#=> {"Jill"=>[:approved], "Daniel"=>[:unapproved, :banned],
# "John"=>[:removed], "Jane"=>[:banned]}
Try it.
find_member("Jill")
#=> [:approved]
find_member("Daniel")
#=> [:unapproved, :banned]
find_member("John")
#=> [:removed]
find_member("Jane")
#=> [:banned]
find_member("Billy-Bob")
#=> []
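One way to package the cached mapping is a small class, sketched below. The class name MemberDirectory and its method names are assumptions for this example. It also uses fetch for the lookup, so that querying an unknown name does not add that name to the index.

```ruby
# Hypothetical wrapper around the members hash with a cached name-to-keys index.
class MemberDirectory
  def initialize(members)
    @members = members
    @cached_hash = nil
  end

  def find_member(name)
    rebuild_index_if_stale
    @index.fetch(name, [])  # fetch ignores the default block, so no key is added
  end

  private

  def rebuild_index_if_stale
    current = @members.hash
    return if @cached_hash == current

    @index = @members.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(status, names), h|
      names.each { |name| h[name] << status }
    end
    @cached_hash = current
  end
end

members = { approved: ['Jill'], unapproved: ['Daniel'], removed: ['John'], banned: %w[Daniel Jane] }
directory = MemberDirectory.new(members)

daniel_before = directory.find_member('Daniel')
members[:approved] << 'Daniel'  # changing the data changes members.hash
daniel_after = directory.find_member('Daniel')
unknown = directory.find_member('Billy-Bob')

p daniel_before
p daniel_after
p unknown
```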
You can use this iteration with the include? method.
@members = {
  approved: ["Jill"],
  unapproved: ["Daniel"],
  removed: ["John"],
  banned: ["Daniel", "Jane"]
}
def find_member_group(name)
  @members.each { |group, names| return group if names.include?(name) }
  nil
end
@name = 'Jane'
group_name = find_member_group(@name)
puts group_name ? "#{@name} found in #{group_name}." : "#{@name} not found."
# => Jane found in banned.
Hash#select is the method to use here:
def find_member(name)
  @members.select {|k,v| v.include? name }.keys
end
find_member("Jill") #=> [:approved]
find_member("Daniel") #=> [:unapproved, :banned]
find_member("John") #=> [:removed]
find_member("Jane") #=> [:banned]
Explanation:
select, as the name suggests, keeps only those key-value pairs for which the block returns a truthy value, and returns them as a new hash. The block removes the need for an explicit if statement. Within the block we check each key-value pair, and if its value includes the name argument, that pair is kept in the result. Finally, we're only interested in the memberships (namely the keys), so we call keys on the result to get them as an array.
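The two steps can be inspected separately. This sketch uses a local variable members instead of the instance variable; matches and statuses are illustrative names.

```ruby
members = {
  approved: ['Jill'],
  unapproved: ['Daniel'],
  removed: ['John'],
  banned: %w[Daniel Jane]
}

# Step 1: select keeps the pairs whose name list includes the name (still a Hash).
matches = members.select { |_status, names| names.include?('Daniel') }

# Step 2: keys extracts just the membership statuses.
statuses = matches.keys

p matches
p statuses
```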
