Related
Background:
Hey all, I am experimenting with external APIs and am trying to pull in all of the followers of a User from a site and apply some sorting.
I have refactored a lot of the code, HOWEVER, there is one part that is giving me a really tough time. I am convinced there is an easier way to implement this than what I have included and would be really grateful on any tips to do this in a much more eloquent way.
My goal is simple. I want to collapse an array of arrays of hashes (I hope that is the correct way to explain it) into one array of hashes.
Problem Description:
I have an array named f_collectionswhich has 5 elements. Each element is an array of size 200. Each sub-element of these arrays is a hash of about 10 key-value pairs. My best representation of this is as follows:
f_collections = [ collection1, collection2, ..., collection5 ]
collection1 = [ hash1, hash2, ..., hash200]
hash1 = { user_id: 1, user_name: "bob", ...}
I am trying to collapse this multi-dimensional array into one array of hashes. Since there are five collection arrays, this means the results array would have 1000 elements - all of which would be hashes.
followers = [hash1, hash2, ..., hash1000]
Code (i.e. my attempt which I do not want to keep):
I have gotten this to work with a very ugly piece of code (see below), with nested if statements, blocks, for loops, etc... This thing is a nightmare to read and I have tried my hardest to research ways to do this in a simpler way, I just cannot figure out how. I have tried flatten but it doesn't seem to work.
I am mostly just including this code to show I have tried very hard to solve this problem, and while yes I solved it, there must be a better way!
Note: I have simplified some variables to integers in the code below to make it more readable.
for n in 1..5 do
if n < 5
(0..199).each do |j|
if n == 1
nj = j
else
nj = (n - 1) * 200 + j
end
#followers[nj] = #f_collections[n-1].collection[j]
end
else
(0..199).each do |jj|
njj = (4) * 200 + jj
#followers[njj] = #f_collections[n-1].collection[jj]
end
end
end
Oh... so It is not an array objects that hold collections of hashes. Kind of. Lets give it another try:
flat = f_collection.map do |col|
col.collection
end.flatten
which can be shortened (and is more performant) to:
flat = f_collection.flat_map do |col|
col.collection
end
This works because the items in the f_collection array are objects that have a collection attribute, which in turn is an array.
So it is "array of things that have an array that contains hashes"
Old Answer follows below. I leave it here for documentation purpose. It was based on the assumption that the data structure is an array of array of hashes.
Just use #flatten (or #flatten! if you want this to be "inline")
flat = f_collections.flatten
Example
sub1 = [{a: 1}, {a: 2}]
sub2 = [{a: 3}, {a: 4}]
collection = [sub1, sub2]
flat = collection.flatten # returns a new collection
puts flat #> [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}]
# or use the "inplace"/"destructive" version
collection.flatten! # modifies existing collection
puts collection #> [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}]
Some recommendations for your existing code:
Do not use for n in 1..5, use Ruby-Style enumeration:
["some", "values"].each do |value|
puts value
end
Like this you do not need to hardcode the length (5) of the array (did not realize you removed the variables that specify these magic numbers). If you you want to detect the last iteration you can use each_with_index:
a = ["some", "home", "rome"]
a.each_with_index do |value, index|
if index == a.length - 1
puts "Last value is #{value}"
else
puts "Values before last: #{value}"
end
end
While #flatten will solve your problem you might want to see how DIY-solution could look like:
def flatten_recursive(collection, target = [])
collection.each do |item|
if item.is_a?(Array)
flatten_recursive(item, target)
else
target << item
end
end
target
end
Or an iterative solution (that is limited to two levels):
def flatten_iterative(collection)
target = []
collection.each do |sub|
sub.each do |item|
target << item
end
end
target
end
Using .flatten is a handy little trick to take an array of sub-arrays and turn it into a single array.
For example: [[1,3],2,[5,8]].flatten => [1,3,2,5,8]
You can even include nil [1,[2,nil],3].flatten will result in [1,2,nil,3].
This kind of method is very useful when nesting a .map method, but how would you account for an empty sub-array? For example: [1,[2,3],[],4].flatten would return [1,2,3,4]... but what if I need to keep track of the empty sub array maybe turn the result into [1,2,3,0,4] or [1,2,3,nil,4]
Is there any elegant way to do this? Or would I need to write some method to iterate through each individual sub-array and check it one by one?
If you don't need to recursively check nested sub-arrays:
[1,[2,3],[],4].map { |a| a == [] ? nil : a }.flatten
First map the empty arrays into nils, then flatten
[1,2,[1,2,3],[]].map{|x| if x.is_a? Array and x.empty? then nil else x end}.flatten
I have two arrays
main_array = [[0,[1,4,5]], [1,[4,6,8]], [2,[5,6,7,8]], [3,[9,8]], [4,[7,2]]]
other_array = [[1,4,5],[9,8]]
What I want to do is to delete those elements in main_array if it can be found in other_array. (Meaning I will be left with [[1,[4,6,8]], [2,[5,6,7,8]],[4,[7,2]]] ) So I did the following:
other_array.each { |x|
for i in 0..main_array.length-1
main_array.delete(x)
}
It didn't work. Any clues on how I should approach this?
main_array.reject { |_,a| other_array.include?(a) }
#=> [[1,[4,6,8]], [2,[5,6,7,8]], [4,[7,2]]]
You can use Enumerable#reject (this is a module that is included in almost all collections in Ruby, such as Arrays, Hashes, Sets and etc.), it returns new array with removed elements, based on some condition:
main_array.reject { |item| other_array.include? item[1] }
Here item is every item in main_array. You can also "unwrap" your item if you want to control its elements individually:
main_array.reject { |(key, val)| other_array.include? val }
I'd recommend you to go and check the link above, there are also lots of interesting things over there.
I have two arrays of hashes
sent_array = [{:sellersku=>"0421077128", :asin=>"B00ND80WKY"},
{:sellersku=>"0320248102", :asin=>"B00WTEF9FG"},
{:sellersku=>"0324823180", :asin=>"B00HXZLB4E"}]
active_array = [{:price=>39.99, :asin1=>"B00ND80WKY"},
{:price=>7.99, :asin1=>"B00YSN9QOG"},
{:price=>10, :asin1=>"B00HXZLB4E"}]
I want to loop through sent_array, and find where the value in :asin is equal to the value in :asin1 in active_array, then copy the key & value of :price to sent_array. Resulting in this:
final_array = [{:sellersku=>"0421077128", :asin=>"B00ND80WKY", :price=>39.99},
{:sellersku=>"0320248102", :asin=>"B00WTEF9FG"},
{:sellersku=>"0324823180", :asin=>"B00HXZLB4E", :price=>10}]
I tried this, but I get a TypeError - no implicit conversion of Symbol into Integer (TypeError)
sent_array.each do |x|
x.detect { |key, value|
if value == active_array[:asin1]
x[:price] << active_array[:price]
end
}
end
For reasons of both efficiency and readability, it makes sense to first construct a lookup hash on active_array:
h = active_array.each_with_object({}) { |g,h| h[g[:asin1]] = g[:price] }
#=> {"B00ND80WKY"=>39.99, "B00YSN9QOG"=>7.99, "B00HXZLB4E"=>10}
We now merely step through sent_array, updating the hashes:
sent_array.each { |g| g[:price] = h[g[:asin]] if h.key?(g[:asin]) }
#=> [{:sellersku=>"0421077128", :asin=>"B00ND80WKY", :price=>39.99},
# {:sellersku=>"0320248102", :asin=>"B00WTEF9FG"},
# {:sellersku=>"0324823180", :asin=>"B00HXZLB4E", :price=>10}]
Retrieving a key-value pair from a hash (h) is much faster, of course, than searching for a key-value pair in an array of hashes.
This does the trick. Iterate over your sent array and attempt to find a record in your active_array that has that :asin. If you find something, set the price and you are done.
Your code I believe used detect/find incorrectly. What you want out of that method is the hash that matches and then do something with that. You were trying to do everything inside of detect.
sent_array.each do |sent|
item = active_array.find{ |i| i.has_value? sent[:asin] }
sent[:price] = item[:price] if item
end
=> [{:sellersku=>"0421077128", :asin=>"B00ND80WKY", :price=>39.99}, {:sellersku=>"0320248102", :asin=>"B00WTEF9FG"}, {:sellersku=>"0324823180", :asin=>"B00HXZLB4E", :price=>10}]
I am assuming second element of both sent_array and active_array has B00WTEF9FG as asin and asin1 respectively. (seeing your final result)
Now:
a = active_array.group_by{|a| a[:asin1]}
b = sent_array.group_by{|a| a[:asin]}
a.map { |k,v|
v[0].merge(b[k][0])
}
# => [{:price=>39.99, :asin1=>"B00ND80WKY", :sellersku=>"0421077128", :asin=>"B00ND80WKY"}, {:price=>7.99, :asin1=>"B00WTEF9FG", :sellersku=>"0320248102", :asin=>"B00WTEF9FG"}, {:price=>10, :asin1=>"B00HXZLB4E", :sellersku=>"0324823180", :asin=>"B00HXZLB4E"}]
Why were you getting TypeError?
You are doing active_array[:asin1]. Remember active_array itself is an Array. Unless you iterate over it, you cannot look for keys.
Another issue with your approach is, you are using Hash#detect
find is implemented in terms of each. And each, when called on a
Hash, returns key-value pairs in form of arrays with 2 elements
each. That's why find returns an array.
source
Same is true for detect. So x.detect { |key, value| .. } is not going to work as you are expecting it to.
Solution without assumption
a.map { |k,v|
b[k] ? v[0].merge(b[k][0]) : v[0]
}.compact
# => [{:price=>39.99, :asin1=>"B00ND80WKY", :sellersku=>"0421077128", :asin=>"B00ND80WKY"}, {:price=>7.99, :asin1=>"B00YSN9QOG"}, {:price=>10, :asin1=>"B00HXZLB4E", :sellersku=>"0324823180", :asin=>"B00HXZLB4E"}]
Here since asin1 => "B00ND80WKY" has no match, it cannot get sellersku from other hash.
I have two arrays like this:
a = [{'one'=>1, 'two'=>2},{'uno'=>1, 'dos'=>2}]
b = ['english', 'spanish']
I need to add a key-value pair to each hash in a to get this:
a = [{'one'=>1, 'two'=>2, 'language'=>'english'},{'uno'=>1, 'dos'=>2, 'language'=>'spanish'}]
I attempted this:
(0..a.length).each {|c| a[c]['language']=b[c]}
and it does not work. With this:
a[1]['language']=b[1]
(0..a.length).each {|c| puts c}
an error is shown:
NoMethodError (undefined method '[]=' for nil:NilClass)
How can I fix this?
a.zip(b){|h, v| h["language"] = v}
a # => [
# {"one"=>1, "two"=>2, "language"=>"english"},
# {"uno"=>1, "dos"=>2, "language"=>"spanish"}
# ]
When the each iterator over your Range reaches the last element (i.e. a.length), you will attempt to access a nonexisting element of a.
In your example, a.length is 2, so on the last iteration of your each, you will attempt to access a[2], which doesn't exist. (a only contains 2 elements wich indices 0 and 1.) a[2] evaluates to nil, so you will now attempt to call nil['language']=b[2], which is syntactic sugar for nil.[]=('language', b[2]), and since nil doesn't have a []= method, you get a NoMethodError.
The immediate fix is to not iterate off the end of a, by using an exclusive Range:
(0...a.length).each {|c| a[c]['language'] = b[c] }
By the way, the code you posted:
(0..a.length).each {|c| puts c }
should clearly have shown you that you iterate till 2 instead of 1.
That's only the immediate fix, however. The real fix is to simply never iterate over a datastructure manually. That's what iterators are for.
Something like this, where Ruby will keep track of the index for you:
a.each_with_index do |hsh, i| hsh['language'] = b[i] end
Or, without fiddling with indices at all:
a.zip(b.zip(['language'].cycle).map(&:reverse).map(&Array.method(:[])).map(&:to_h)).map {|x, y| x.merge!(y) }
[Note: this last one doesn't mutate the original Arrays and Hashes unlike the other ones.]
The problem you're having is that your (0..a.length) is inclusive. a.length = 2 so you want to modify it to be 0...a.length which is exclusive.
On a side note, you could use Array#each_with_index like this so you don't have to worry about the length and so on.
a.each_with_index do |hash, index|
hash['language'] = b[index]
end
Here is another method you could use
b.each_with_index.with_object(a) do |(lang,i),obj|
obj[i]["language"] = lang
obj
end
#=>[
{"one"=>1, "two"=>2, "language"=>"english"},
{"uno"=>1, "dos"=>2, "language"=>"spanish"}
]
What this does is creates an Enumerator for b with [element,index] then it calls with_object using a as the object. It then iterates over the Enumerator passing in each language and its index along with the a object. It then uses the index from b to find the proper index in a and adds a language key to the hash that is equal to the language.
Please know this is a destructive method where the objects in a will mutate during the process. You could make it non destructive using with_object(a.map(&:dup)) this will dup the hashes in a and the originals will remain untouched.
All that being said I think YAML would be better suited for a task like this but I am not sure what your constraints are. As an example:
yml = <<YML
-
one: 1
two: 2
language: "english"
-
uno: 1
dos: 2
language: "spanish"
YML
require 'yaml'
YAML.load(yml)
#=>[
{"one"=>1, "two"=>2, "language"=>"english"},
{"uno"=>1, "dos"=>2, "language"=>"spanish"}
]
Although using YAML I would change the structure for numbers to be more like language => Array of numbers by index e.g. {"english" => ["zero","one","two"]}. That way you can can access them like ["english"][0] #=> "zero"