Ruby match elements from the first with second array - arrays

I have two arrays. The first one will be an array of string with a name and the amount. The second is an array of letters.
a1 = ["ASLK 50", "BSKD 150", "ZZZZ 100", "BSDF 50"]
a2 = ["B", "Z"]
I want to create a third array to sort the contents from a1 based off a2 and return the number based on the information from first array. Since a2 has "B" and "Z", I need to scan first array for all entry starting with letter B and Z and add up the numbers.
My ultimate goal is to return the sum on third array, something like this:
a3 = ["B = 200", "Z = 100"]
Since "A" was not on a2, it is not counted.
I was able to extract the information from a1:
arr = a1.map{|el| el[0] + " : " + el.gsub(/\D/, '\1')}
#=> ["A : 50", "B : 150", "Z : 100", "B : 50"]
I am having trouble comparing a1 with a2. I have tried different methods, such as:
a1.find_all{|i| i[0] == a2[0]} #=> only returns the first element of a2. How can I iterate through a2?
alternatively,
i = 0
arr_result = []
while i < (arr.length + 1)
#(append to arr_result the number from a1 if a1[i][0] matches a2[i])
I think either would solve it, but I can't turn either idea into working code. How can I implement either method? Is there a more efficient way to do it?

Running with your requirements, that you want to turn this:
a1 = ["ASLK 50", "BSKD 150", "ZZZZ 100", "BSDF 50"]
a2 = ["B", "Z"]
into this: a3 = ["B = 200", "Z = 100"]
a3 = a2.map do |char|
sum = a1.reduce(0) do |sum, item|
name, price = item.split(" ")
sum += price.to_i if name[0].eql?(char)
sum
end
"#{char} = #{sum}"
end

Here is how I would do this:
a1 = ["ASLK 50", "BSKD 150", "ZZZZ 100", "BSDF 50"]
a2 = ["B", "Z"]
a3 = a1.each_with_object(Hash.new(0)) do |a,obj|
obj[a[0]] += a.split.last.to_i if a2.include?(a[0])
end.map {|a| a.join(" = ")}
#=>["B = 200", "Z = 100"]
First step adds them all into a Hash by summing the values by each first letter that is contained in the second Array.
Second step provides the desired output
If you want a Hash instead just take off the last call to map and you'll have.
{"B" => 200, "Z" => 100}

Rather than defining an Array like ["B = 200", "Z = 100"], it would be more sensible to define this as a mapping - i.e. the following Hash object:
{"B" => 200, "Z" => 100}
As for the implementation, there are many ways to do it. Here is just one approach:
a1 = ["ASLK 50", "BSKD 150", "ZZZZ 100", "BSDF 50"]
a2 = ["B", "Z"]
result = a2.map do |letter|
[
letter,
a1.select {|str| str[0] == letter}
.inject(0) {|sum, str| sum += str[/\d+/].to_i}
]
end.to_h
puts result # => {"B"=>200, "Z"=>100}
Explanation:
At the top level, I've used Array#to_h to convert the array of pairs: [["B", 200], ["Z", 100]] into a Hash: {"B" => 200, "Z" => 100}.
a1.select {|str| str[0] == letter} selects only the elements from a1 whose first letter is that of the hash key.
inject(0) {|sum, str| sum += str[/\d+/].to_i} adds up all the numbers, with safe-guards to default to zero (rather than having nil thrown around unexpectedly).
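As a rough sketch of the same select-then-add idea, the variant below uses String#start_with? and Enumerable#sum (available from Ruby 2.4) instead of str[0] == letter and inject. The names totals and matching are only illustrative, and the final map back to "B = 200" style strings is included for the case where the array form from the question is still wanted.

```ruby
a1 = ["ASLK 50", "BSKD 150", "ZZZZ 100", "BSDF 50"]
a2 = ["B", "Z"]

# Build [letter, total] pairs, then turn them into a Hash.
totals = a2.map do |letter|
  # Keep only the entries whose name starts with this letter.
  matching = a1.select { |str| str.start_with?(letter) }
  # Add up the first run of digits in each entry; sum of an empty list is 0.
  [letter, matching.sum { |str| str[/\d+/].to_i }]
end.to_h

# Optional: the string form asked for in the question.
a3 = totals.map { |letter, total| "#{letter} = #{total}" }
```

A letter in a2 with no matching entries ends up with a total of 0 here rather than being left out.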

This is quite similar to @engineersmnky's answer.
r = /
\A[A-Z] # match an upper case letter at the beginning of the string
| # or
\d+ # match one or more digits
/x # free-spacing regex definition mode
a1.each_with_object(Hash.new(0)) do |s,h|
start_letter, value = s.scan(r)
h[start_letter] += value.to_i if a2.include?(start_letter)
end.map { |k,v| "#{k} = #{v}" }
#=> ["B = 200", "Z = 100"]
Hash.new(0) is often referred to as a "counting hash". See the doc for the class method Hash::new for an explanation.
The regex matches the first letter of the string or one or more digits. For example,
"ASLK 50".scan(r)
#=> ["A", "50"]
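To make the "counting hash" remark concrete, here is a small sketch comparing Hash.new(0) with a plain {}. The variable names are made up for the illustration.

```ruby
# With a default of 0, a missing key reads as 0, so += works on first use.
counts = Hash.new(0)
%w[B Z B].each { |letter| counts[letter] += 1 }

# Reading a missing key returns the default but does not store the key.
missing = counts["Q"]

# With a plain hash, a missing key reads as nil, and nil + 1 raises.
plain = {}
error_class = begin
  plain["B"] += 1
rescue NoMethodError => e
  e.class
end
```

This is why the each_with_object(Hash.new(0)) answers can accumulate totals without first checking whether a letter has been seen.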

Related

Working with Transpose functions result in error

consider the following array
arr = [["Locator", "Test1", "string1","string2","string3","string4"],
["$LogicalName", "Create Individual Contact","value1","value2"]]
Desired result:
[Test1=>{"string1"=>"value1","string2"=>"value2","string3"=>"","string4"=>""}]
When I do transpose, it gives me the error by saying second element of the array is not the length of the first element in the array,
Uncaught exception: element size differs (2 should be 4)
Is there any way to add an empty string wherever an element is missing, so that I can perform the transpose and then create the hash shown above? The array may contain many inner arrays of different lengths. Each of them should be padded with empty strings up to the size of the first inner array, and then I can do the transpose. Is there any way to do this?
It sounds like you might want Enumerable#zip:
headers, *data_rows = input_data
headers.zip(*data_rows)
# => [["Locator", "$LogicalName"], ["Test1", "Create Individual Contact"],
# ["string1", "value1"], ["string2", "value2"], ["string3", nil], ["string4", nil]]
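One possible way to get from the zip output to the hash in the question is sketched below. It assumes the first two columns ("Locator" and the "Test1" column) are always the label columns, and it relies on nil.to_s returning "" to turn the missing values into empty strings. The names headers, values and pairs are illustrative.

```ruby
arr = [["Locator", "Test1", "string1", "string2", "string3", "string4"],
       ["$LogicalName", "Create Individual Contact", "value1", "value2"]]

headers, values = arr

# zip pads the shorter row with nil; drop the two label columns,
# then convert each nil to "" via to_s.
pairs = headers.zip(values).drop(2).map { |key, value| [key, value.to_s] }

result = { headers[1] => pairs.to_h }
```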
If you wish to transpose an array of arrays, each element of the array must be the same size. Here you would need to do something like the following.
arr = [["Locator", "Test1", "string1","string2","string3","string4"],
["$LogicalName", "Create Individual Contact","value1","value2"]]
keys, vals = arr
#=> [["Locator", "Test1", "string1", "string2", "string3", "string4"],
# ["$LogicalName", "Create Individual Contact", "value1", "value2"]]
idx = keys.index("Test1") + 1
#=> 2
{ "Test1" => [keys[idx..-1],
vals[idx..-1].
concat(['']*(keys.size - vals.size))].
transpose.
to_h }
#=> {"Test1"=>{"string1"=>"value1", "string2"=>"value2", "string3"=>"", "string4"=>""}}
It is not strictly necessary to define the variables keys and vals, but that avoids the need to create those arrays multiple times. It reads better as well, in my opinion.
The steps are as follows. Note keys.size #=> 6 and vals.size #=> 4.
a = vals[idx..-1]
#=> vals[2..-1]
#=> ["value1", "value2"]
b = [""]*(keys.size - vals.size)
#=> [""]*(6 - 4)
#=> ["", ""]
c = a.concat(b)
#=> ["value1", "value2", "", ""]
d = keys[idx..-1]
#=> ["string1", "string2", "string3", "string4"]
e = [d, c].transpose
#=> [["string1", "value1"], ["string2", "value2"], ["string3", ""], ["string4", ""]]
f = e.to_h
#=> {"string1"=>"value1", "string2"=>"value2", "string3"=>"", "string4"=>""}
g = { "Test1" => f }
#=> {"Test1"=>{"string1"=>"value1", "string2"=>"value2", "string3"=>"", "string4"=>""}}
Find the longest Element in your Array and make sure every other element has the same length - loop and add maxLength - element(i).length amount of "" elements.
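The padding idea described above could look roughly like this. It is only a sketch: rows, width and padded are hypothetical names, and it pads to the longest row rather than specifically to the first one.

```ruby
rows = [["Locator", "Test1", "string1", "string2", "string3", "string4"],
        ["$LogicalName", "Create Individual Contact", "value1", "value2"]]

# Length of the longest inner array.
width = rows.map(&:size).max

# Append as many "" elements as each row is missing.
padded = rows.map { |row| row + [""] * (width - row.size) }

# All rows now have the same size, so transpose no longer raises.
columns = padded.transpose
```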

Get the longest prefix in arrays Ruby

I have an array of arrays. Within each subarray, if two or more elements share a prefix of length eight or more, then I want to replace those elements with their longest common prefix. For this array:
m = [
["A", "97455589955", "97455589920", "97455589921"],
["B", "2348045101518", "2348090001559"]
]
I expect an output like this:
n = [
["A", "974555899"],
["B", "2348045101518", "2348090001559"]
]
For first subarray in m, the longest prefix is "974555899" of length nine.
974555899-55
974555899-20
974555899-21
For the second subarray, the longest prefix is "23480" of length five, and that is shorter than eight. In this case, the second subarray is left as is.
23480-45101518
23480-90001559
For this input:
m = [
["A", "2491250873330", "249111222333", "2491250872214", "2491250872213"],
["B", "221709900000"],
["C", "6590247968", "6590247969", "6598540040", "65985400217"]
]
The output should be like this:
[
["A", "2491250873330", "249111222333", "249125087221"],
["B", "221709900000"],
["C", "659024796", "65985400"]
]
For array m[0], there is no prefix long enough shared by all four numbers, but there is a prefix "249125087221" of length twelve between m[0][3] and m[0][4]. For array m[2], there is a prefix "659024796" of length nine between m[2][1] and m[2][2], and there is another prefix "65985400" of length eight between m[2][3] and m[2][4].
I constructed the code below:
m.map{|x, *y|
[x, y.map{|z| z[0..7]}.uniq].flatten
}
With my code with the first input, I get this output.
[
["A", "97455589"],
["B", "23480451", "23480900"]
]
I'm stuck on how to get dynamically the common prefix without setting a fixed length.
Code
def doit(arr, min_common_length)
arr.map do |label, *values|
[label, values.group_by { |s| s[0, min_common_length] }.
map { |_,a| a.first[0, nbr_common_digits(a, min_common_length)] }]
end
end
def nbr_common_digits(a, min_common_length)
max_digits = a.map(&:size).min
return max_digits if max_digits == min_common_length + 1
(min_common_length..max_digits).find { |i|
a.map { |s| s[i] }.uniq.size > 1 } || max_digits
end
Example
arr = [["A","2491250873330","249111222333","2491250872214","2491250872213"],
["B","221709900000"],
["C","6590247968","6590247969","6598540040","65985400217"]]
doit(arr, 8)
#=> [["A", ["249125087", "249111222333"]],
# ["B", ["221709900000"]],
# ["C", ["659024796", "65985400"]]]
Explanation
Let's first consider the helper method, nbr_common_digits. Suppose
a = ["123467", "12345", "1234789"]
min_common_length = 2
then the steps are as follows.
max_digits = a.map(&:size).min
#=> 5 (the length of "12345")
max_digits == min_common_length + 1
#=> 5 == 2 + 1
#=> false, so do not return max_digits
b = (min_common_length..max_digits).find { |i| a.map { |s| s[i] }.uniq.size > 1 }
#=> (2..5).find { |i| a.map { |s| s[i] }.uniq.size > 1 }
#=> 4
At this point we must consider the possibility that b will equal nil, which occurs when the first 5 characters of all strings are equal. In that case we should return max_digits, which is why we require the following.
b || max_digits
#=> 4
In doit the steps are as follows.
min_common_length = 8
Firstly, we use Enumerable#group_by to group values by their first min_common_length digits.
arr.map { |label, *values| [label,
values.group_by { |s| s[0, min_common_length] }] }
#=> [["A", {"24912508"=>["2491250873330", "2491250872214", "2491250872213"],
# "24911122"=>["249111222333"]}],
# ["B", {"22170990"=>["221709900000"]}],
# ["C", {"65902479"=>["6590247968", "6590247969"],
# "65985400"=>["6598540040", "65985400217"]}]]
The second step is to compute the longest common lengths and replace values as required.
arr.map do |label, *values| [label,
values.group_by { |s| s[0, min_common_length] }.
map { |_,a| a.first[0, nbr_common_digits(a, min_common_length)] }]
end
#=> [["A", ["249125087", "249111222333"]],
# ["B", ["221709900000"]],
# ["C", ["659024796", "65985400"]]]
The first block variable in the second map's block (whose value is a string of min_common_length characters, group_by's grouping criterion) is represented by an underscore (a legitimate local variable) to signify that it is not used in the block calculation.
This is an interesting problem. Here's my solution:
def lcn(lim, *arr)
# compute all substrings of lengths >= lim and build a lookup by length
lookup = lcn_explode(lim, arr)
# first pass: look for largest common number among all elements
res, = lcn_filter(arr, lookup) { |size| size == arr.size }
return res unless res.empty?
# second pass: look for largest common number among some elements
res, rem = lcn_filter(arr, lookup) { |size| size > 1 }
# append remaining candidates with no matches
res.concat(rem)
end
def lcn_explode(lim, arr)
memo = Hash.new { |h, k| h[k] = Array.new }
arr.uniq.each do |n|
lim.upto([n.size, lim].max) do |i|
memo[i] << [n[0, i], n]
end
end
memo
end
def lcn_filter(arr, lookup)
memo = []
lookup.keys.sort!.reverse_each do |i|
break if arr.empty?
matches = Hash.new { |h, k| h[k] = Array.new }
lookup[i].each do |m, n|
matches[m] << n if arr.include?(n)
end
matches.each_pair do |m, v|
next unless yield v.size
memo << m
# remove elements from input array so they won't be reused
arr -= v
end
end
return memo, arr
end
You use it like so:
p lcn(8, "97455589955", "97455589920", "97455589921") # => ["974555899"]
Or:
m.each do |key, *arr|
p [key, *lcn(8, *arr)]
end
Which prints:
["A", "249125087221", "2491250873330", "249111222333"]
["B", "221709900000"]
["C", "659024796", "65985400"]
Your task can be split into two parts: calculating the Largest Common Number (LCN) and modifying the original array.
The Largest Common Number operates on arrays, so it should be a method of Array.
After calculating the LCN you can just compare its length with the limit (i.e. 8).
class Array
def lcn
first.length.times do |index|
numb = first[0..index]
# numb is the first prefix that is NOT shared by every element,
# so return the prefix that is one character shorter
return first[0...index] unless self[1..-1].all? { |n| n.start_with?(numb) }
end
first
end
end
def task(m, limit = 8)
m.map { |i,*n| [i, n.lcn.length >= limit ? n.lcn : n].flatten }
end
task(m) # => [["A", "974555899"], ["B", "2348045101518", "2348090001559"]]
In your solution you do not actually implement lcn finding and filtering output.
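For comparison, here is a compact sketch of the group-then-shorten idea without a monkey patch. The method names common_prefix and shorten are hypothetical. It assumes that numbers belong together only when their first min_length characters are identical, so for row "A" of the second input it gives "249125087" (like the doit answer) and not the "249125087221" from the question's expected output. It also relies on the fact that the common prefix of a whole group equals the common prefix of its lexicographically smallest and largest members.

```ruby
# Longest prefix shared by every string in the list.
def common_prefix(strings)
  return "" if strings.empty?
  smallest, largest = strings.minmax
  # Count leading positions where the two extremes agree.
  length = smallest.each_char.with_index.take_while { |ch, i| largest[i] == ch }.size
  smallest[0, length]
end

# Replace each group of numbers sharing their first min_length
# characters with the group's longest common prefix.
def shorten(row, min_length = 8)
  label, *numbers = row
  groups = numbers.group_by { |s| s[0, min_length] }.values
  [label, *groups.flat_map { |g| g.size > 1 ? [common_prefix(g)] : g }]
end

m = [
  ["A", "97455589955", "97455589920", "97455589921"],
  ["B", "2348045101518", "2348090001559"]
]
n = m.map { |row| shorten(row) }
```

Numbers shorter than min_length are grouped by their full value, so duplicates of such a number would collapse into one entry.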

Ruby array += vs push

I have an array of arrays and want to append elements to the sub-arrays. += does what I want, but I'd like to understand why push does not.
Behavior I expect (and works with +=):
b = Array.new(3,[])
b[0] += ["apple"]
b[1] += ["orange"]
b[2] += ["frog"]
b #=> [["apple"], ["orange"], ["frog"]]
With push I get the pushed element appended to EACH sub-array (why?):
a = Array.new(3,[])
a[0].push("apple")
a[1].push("orange")
a[2].push("frog")
a #=> [["apple", "orange", "frog"], ["apple", "orange", "frog"], ["apple", "orange", "frog"]]
Any help on this much appreciated.
The issue here is b = Array.new(3, []) uses the same object as the base value for all the array cells:
b = Array.new(3, [])
b[0].object_id #=> 28424380
b[1].object_id #=> 28424380
b[2].object_id #=> 28424380
So when you use b[0].push, it adds the item to "each" sub-array because they are all, in fact, the same array.
So why does b[0] += ["value"] work? Well, looking at the ruby docs:
ary + other_ary → new_ary
Concatenation — Returns a new array built by concatenating the two arrays together to produce a third array.
[ 1, 2, 3 ] + [ 4, 5 ] #=> [ 1, 2, 3, 4, 5 ]
a = [ "a", "b", "c" ]
c = a + [ "d", "e", "f" ]
c #=> [ "a", "b", "c", "d", "e", "f" ]
a #=> [ "a", "b", "c" ]
Note that
x += y
is the same as
x = x + y
This means that it produces a new array. As a consequence, repeated use of += on arrays can be quite inefficient.
So when you use +=, it replaces the array entirely, meaning the array in b[0] is no longer the same as b[1] or b[2].
As you can see:
b = Array.new(3, [])
b[0].push("test")
b #=> [["test"], ["test"], ["test"]]
b[0].object_id #=> 28424380
b[1].object_id #=> 28424380
b[2].object_id #=> 28424380
b[0] += ["foo"]
b #=> [["test", "foo"], ["test"], ["test"]]
b[0].object_id #=> 38275912
b[1].object_id #=> 28424380
b[2].object_id #=> 28424380
If you're wondering how to ensure each array is unique when initializing an array of arrays, you can do so like this:
b = Array.new(3) { [] }
This different syntax lets you pass a block of code which gets run for each cell to calculate its original value. Since the block is run for each cell, a separate array is created each time.
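The difference between the two forms can be checked side by side. The following is a small sketch with made-up variable names (shared, separate):

```ruby
# Default-object form: every cell refers to the same array.
shared = Array.new(3, [])
shared[0].push("apple")

# Block form: the block runs once per cell, creating a new array each time.
separate = Array.new(3) { [] }
separate[0].push("apple")

# Number of distinct objects held by each outer array.
shared_objects   = shared.map(&:object_id).uniq.size
separate_objects = separate.map(&:object_id).uniq.size
```

After this runs, shared shows "apple" in all three cells, while separate has it only in the first.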
It's because in the second code section you're selecting the sub-array and pushing to it. If you want an array of arrays, you need to push each array onto the main array:
a = Array.new(3,[])
a.push(["apple"])
a.push(["orange"])
a.push(["frog"])
to get the same result as the first one.
EDIT: I forgot to mention that, because you initialize the array with blank arrays as elements, you will have three empty elements in front of the pushed elements.

How can I generate a percentage for a regex string match in Ruby?

I'm trying to build a simple method to look at about 100 entries in a database for a last name and pull out all the ones that match above a specific percentage of letters. My current approach is:
1. Pull all 100 entries from the database into an array.
2. Iterate through them while performing the following actions.
3. Split the last name into an array of letters.
4. Subtract that array from another array that contains the letters of the name I am trying to match, which leaves only the letters that weren't matched.
5. Take the size of the result and divide by the original size of the array from step 3 to get a percentage.
6. If the percentage is above a predefined threshold, push that database object into a results array.
This works, but I feel like there must be some cool ruby/regex/active record method of doing this more efficiently. I have googled quite a bit but can't find anything.
To comment on the merit of the measure you suggested would require speculation, which is out-of-bounds at SO. I therefore will merely demonstrate how you might implement your proposed approach.
Code
First define a helper method:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
In short, if
a = [3,1,2,3,4,3,2,2,4]
b = [2,3,4,4,3,4]
then
a - b #=> [1]
whereas
a.difference(b) #=> [1, 3, 2, 2]
This method is elaborated in my answer to this SO question. I've found so many uses for it that I've proposed it be added to the Ruby Core.
The following method produces a hash whose keys are the elements of names (strings) and whose values are the fractions of the letters in the target string that are contained in each string in names.
def target_fractions(names, target)
target_arr = target.downcase.scan(/[a-z]/)
target_size = target_arr.size
names.each_with_object({}) do |s,h|
s_arr = s.downcase.scan(/[a-z]/)
target_remaining = target_arr.difference(s_arr)
h[s] = (target_size-target_remaining.size)/target_size.to_f
end
end
Example
target = "Jimmy S. Bond"
and the names you are comparing are given by
names = ["Jill Dandy", "Boomer Asad", "Josefine Simbad"]
then
target_fractions(names, target)
#=> {"Jill Dandy"=>0.5, "Boomer Asad"=>0.5, "Josefine Simbad"=>0.8}
Explanation
For the above values of names and target,
target_arr = target.downcase.scan(/[a-z]/)
#=> ["j", "i", "m", "m", "y", "s", "b", "o", "n", "d"]
target_size = target_arr.size
#=> 10
Now consider
s = "Jill Dandy"
h = {}
then
s_arr = s.downcase.scan(/[a-z]/)
#=> ["j", "i", "l", "l", "d", "a", "n", "d", "y"]
target_remaining = target_arr.difference(s_arr)
#=> ["m", "m", "s", "b", "o"]
h[s] = (target_size-target_remaining.size)/target_size.to_f
#=> (10-5)/10.0 => 0.5
h #=> {"Jill Dandy"=>0.5}
The calculations are similar for Boomer and Josefine.
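To tie this back to the threshold in the question, here is a self-contained sketch that computes the same fraction without patching Array and then keeps only the names at or above a cutoff. The method name letter_match_fraction and the 0.75 threshold are assumptions made for the illustration.

```ruby
# Fraction of the target's letters that can be matched, one-for-one,
# against the letters of name (case-insensitive, a-z only).
def letter_match_fraction(name, target)
  target_letters = target.downcase.scan(/[a-z]/)
  # Count how many of each letter the candidate name offers.
  available = name.downcase.scan(/[a-z]/)
                  .each_with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
  # Use up one available letter for each target letter that matches.
  matched = target_letters.count { |ch| available[ch] > 0 && (available[ch] -= 1) }
  matched.fdiv(target_letters.size)
end

target = "Jimmy S. Bond"
names  = ["Jill Dandy", "Boomer Asad", "Josefine Simbad"]

threshold = 0.75 # hypothetical cutoff
matches = names.select { |name| letter_match_fraction(name, target) >= threshold }
```

A target with no letters at all would divide by zero and yield NaN, so real code would probably want to guard against that case.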

I want to return the string that has the highest sum in an array for Ruby

So far I am able to return the sums of the strings:
puts "Please enter strings of #'s to find the number that has the greatest sum."
puts "Use commas to separate #'s."
user_input = gets.chomp
array = user_input.split(",")
array.map do |num_string|
num_string.chars.map(&:to_i).inject(:+)
end
But I wish to return the string that adds up to the highest value. For instance, if I have array = ["123","324","644"], the sums of the digits are 6, 9, and 14 respectively, since 1+2+3=6 and so on. I have gotten this far, but now I need to return "644" as the answer, since it is the string that sums to the highest value.
I suggest you use Enumerable#max_by:
arr = ["123","324","644"]
arr.max_by { |s| s.each_char.reduce(0) { |t,c| t+c.to_i } }
#=> "644"
Let's see how this works. Enumerable#max_by computes a value for each element of arr and returns the element of arr whose computed value is greatest. The calculation of the value for each element is done by max_by's block.
enum0 = arr.max_by
#=> #<Enumerator: ["123", "324", "644"]:max_by>
You can see the three elements of this enumerator. Sometimes it's not so obvious, but you can always see what they are by converting the enumerator to an array:
enum0.to_a
#=> ["123", "324", "644"]
Elements of enum0 are passed to the block by the method Enumerator#each (which in turn calls Array#each). You would find that:
enum0.each { |s| s.each_char.reduce(0) { |t,c| t+c.to_i } }
returns "644".
The first element of the enumerator ("123") is passed to the block by each and assigned to the block variable s. We can simulate that with the method Enumerator#next:
s = enum0.next
#=> "123"
Within the block we have another enumerator:
enum1 = s.each_char
#=> #<Enumerator: "123":each_char>
enum1.to_a
#=> ["1", "2", "3"]
enum1.reduce(0) { |t,c| t+c.to_i }
#=> 6
This last statement is equivalent to:
0 # initial value of t
0 + "1".to_i #=> 1 (new value of t)
1 + "2".to_i #=> 3 (new value of t)
3 + "3".to_i #=> 6 (new value of t)
6 is then returned by reduce.
For the next element of enum0:
s = enum0.next
#=> "324"
s.each_char.reduce(0) { |t,c| t+c.to_i }
#=> 9
and for the last element of enum0:
s = enum0.next
#=> "644"
s.each_char.reduce(0) { |t,c| t+c.to_i }
#=> 14
Since 14 is the largest integer in [6, 9, 14], max_by returns the last element of arr, "644".
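If the individual sums are also of interest, a slightly different sketch is to compute them once and then pick the maximum. This assumes Ruby 2.6 or later for Array#to_h with a block and uses Enumerable#sum in place of reduce. The name digit_sums is illustrative.

```ruby
arr = ["123", "324", "644"]

# Map each string to the sum of its digits.
digit_sums = arr.to_h { |s| [s, s.each_char.sum(&:to_i)] }

# Pick the string whose digit sum is greatest.
best = arr.max_by { |s| digit_sums[s] }
```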
yourArray = ["123","324","644"]
highest = -1
pos = -1
yourArray.each_with_index{ |str, i|
sum = str.split("").map(&:to_i).reduce(:+)
highest = [highest,sum].max
pos = i if highest == sum
}
puts "highest value is " + yourArray[pos]
Let's look at each step. each_with_index lets us iterate over the array while also giving us the index of each element:
yourArray.each_with_index{ |str, i| ... }
Here we're taking our strings in the array and splitting them into arrays of ["1","2","3"] for example, then the to_i portion converts this to an array such as [1,2,3] then finally reduce gives us our sum:
sum = str.split("").map(&:to_i).reduce(:+)
Here we're keeping track throughout the loop of the highest sum we've seen so far:
highest = [highest,sum].max
If the current sum is the highest so far, store its position in the array:
pos = i if highest == sum
finally, use the stored array position to print out whatever exists there in the original array (the positions line up):
puts "highest value is " + yourArray[pos]

Resources