Difference Between Arrays Preserving Duplicate Elements in Ruby - arrays

I'm quite new to Ruby, and was hoping to get the difference between two arrays.
I am aware of the usual method:
a = [...]
b = [...]
difference = (a-b)+(b-a)
But the problem is that this is computing the set difference, because in ruby, the statement (a-b) defines the set compliment of a, relative to b.
This means [1,2,2,3,4,5,5,5,5] - [5] = [1,2,2,3,4], because it takes out all of occurrences of 5 in the first set, not just one, behaving like a filter on the data.
I want it to remove differences only once, so for example, the difference of [1,2,2,3,4,5,5,5,5], and [5] should be [1,2,2,3,4,5,5,5], removing just one 5.
I could do this iteratively:
a = [...]
b = [...]
complimentAbyB = a.dup
complimentBbyA = b.dup
b.each do |bValue|
complimentAbyB.delete_at(complimentAbyB.index(bValue) || complimentAbyB.length)
end
a.each do |aValue|
complimentBbyA.delete_at(complimentBbyA.index(aValue) || complimentBbyA.length)
end
difference = complimentAbyB + complimentBbyA
But this seems awfully verbose and inefficient. I have to imagine there is a more elegant solution than this. So my question is basically, what is the most elegant way of finding the difference of two arrays, where if one array has more occurrences of a single element then the other, they will not all be removed?

I recently proposed that such a method, Ruby#difference, be added to Ruby's core. For your example, it would be written:
a = [1,2,2,3,4,5,5,5,5]
b = [5]
a.difference b
#=> [1,2,2,3,4,5,5,5]
The example I've often given is:
a = [3,1,2,3,4,3,2,2,4]
b = [2,3,4,4,3,4]
a.difference b
#=> [1, 3, 2, 2]
I first suggested this method in my answer here. There you will find an explanation and links to other SO questions where I proposed use of the method.
As shown at the links, the method could be written as follows:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end

.....
ha = a.group_by(&:itself).map{|k, v| [k, v.length]}.to_h
hb = b.group_by(&:itself).map{|k, v| [k, v.length]}.to_h
ha.merge(hb){|_, va, vb| (va - vb).abs}.inject([]){|a, (k, v)| a + [k] * v}
ha and hb are hashes with the element in the original array as the key and the number of occurrences as the value. The following merge puts them together and creates a hash whose value is the difference of the number of occurrences in the two arrays. inject converts that to an array that has each element repeated by the number given in the hash.
Another way:
ha = a.group_by(&:itself)
hb = b.group_by(&:itself)
ha.merge(hb){|k, va, vb| [k] * (va.length - vb.length).abs}.values.flatten

Related

merge the array of array in ruby on rails

I have one array like below
[["GJ","MP"],["HR","MH"],["MP","KL"],["KL","HR"]]
And I want result like below
"GJ, MP, KL, HR, MH"
First element of array ["GJ","MP"]
Added is in the answer_string = "GJ, MP"
Now Find MP which is the last element of this array in the other where is should be first element like this ["MP","KL"]
after this I have to add KL in to the answer_string = "GJ, MP, KL"
This is What I want as output
Given
ary = [["GJ","MP"],["HR","MH"],["MP","KL"],["KL","HR"]]
(where each element is in fact an edge in a simple graph that you need to traverse) your task can be solved in a quite straightforward way:
acc = ary.first.dup
ary.size.times do
# Find an edge whose "from" value is equal to the latest "to" one
next_edge = ary.find { |a, _| a == acc.last }
acc << next_edge.last if next_edge
end
acc
#=> ["GJ", "MP", "KL", "HR", "MH"]
Bad thing here is its quadratic time (you search through the whole array on each iteration) that would hit you badly if the initial array is large enough. It would be faster to use some auxiliary data structure with the faster lookup (hash, for instance). Smth. like
head, *tail = ary
edges = tail.to_h
tail.reduce(head.dup) { |acc, (k, v)| acc << edges[acc.last] }
#=> ["GJ", "MP", "KL", "HR", "MH"]
(I'm not joining the resulting array into a string but this is kinda straightforward)
d = [["GJ","MP"],["HR","MH"],["MP","KL"],["KL","HR"]]
o = [] # List for output
c = d[0][0] # Save the current first object
loop do # Keep looping through until there are no matching pairs
o.push(c) # Push the current first object to the output
n = d.index { |a| a[0] == c } # Get the index of the first matched pair of the current `c`
break if n == nil # If there are no found index, we've essentially gotten to the end of the graph
c = d[n][1] # Update the current first object
end
puts o.join(',') # Join the results
Updated as the question was dramatically changed. Essentially, you navigating a graph.
I use arr.size.times to loop
def check arr
new_arr = arr.first #new_arr = ["GJ","MP"]
arr.delete_at(0) # remove the first of arr. arr = [["HR","MH"],["MP","KL"],["KL","HR"]]
arr.size.times do
find = arr.find {|e| e.first == new_arr.last}
new_arr << find.last if find
end
new_arr.join(',')
end
array = [["GJ","MP"],["HR","MH"],["MP","KL"],["KL","HR"]]
p check(array)
#=> "GJ,MP,KL,HR,MH"
Assumptions:
a is an Array or a Hash
a is in the form provided in the Original Post
For each element b in a b[0] is unique
First thing I would do is, if a is an Array, then convert a to Hash for faster easier lookup up (this is not technically necessary but it simplifies implementation and should increase performance)
a = [["GJ","MP"],["HR","MH"],["MP","KL"],["KL","HR"]]
a.to_h
#=> {"GJ"=>"MP", "HR"=>"MH", "MP"=>"KL", "KL"=>"HR"}
UPDATE
If the path will always be from first to end of the chain and the elements are always a complete chain, then borrowing from #KonstantinStrukov's inspiration: (If you prefer this option then please given him the credit ✔️)
a.to_h.then {|edges| edges.reduce { |acc,_| acc << edges[acc.last] }}.join(",")
#=> "GJ,MP,KL,HR,MH"
Caveat: If there are disconnected elements in the original this result will contain nil (represented as trailing commas). This could be solved with the addition of Array#compact but it will also cause unnecessary traversals for each disconnected element.
ORIGINAL
We can use a recursive method to lookup the path from a given key to the end of the path. Default key is a[0][0]
def navigate(h,from:h.keys.first)
return unless h.key?(from)
[from, *navigate(h,from:h[from]) || h[from]].join(",")
end
Explanation:
navigation(h,from:h.keys.first) - Hash to traverse and the starting point for traversal
return unless h.key?(key) if the Hash does not contain the from key return nil (end of the chain)
[from, *navigate(h,from:h[from]) || h[from]].join(",") - build a Array of from key and the recursive result of looking up the value for that from key if the recursion returns nil then append the last value. Then simply convert the Array to a String joining the elements with a comma.
Usage:
a = [["GJ","MP"],["HR","MH"],["MP","KL"],["KL","HR"]].to_h
navigate(a)
#=> "GJ,MP,KL,HR,MH"
navigate(a,from: "KL")
#=> "KL,HR,MH"
navigate(a,from: "X")
#=> nil

Find difference between two arrays considering duplicates [duplicate]

[1,2,3,3] - [1,2,3] produces the empty array []. Is it possible to retain duplicates so it returns [3]?
I am so glad you asked. I would like to see such a method added to the class Array in some future version of Ruby, as I have found many uses for it:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
A description of the method and links to some of its applications are given here.
By way of example:
a = [1,2,3,4,3,2,4,2]
b = [2,3,4,4,4]
a - b #=> [1]
a.difference b #=> [1,2,3,2]
Ruby v2.7 gave us the method Enumerable#tally, allowing us to replace the first line of the method with
h = other.tally
As far as I know, you can't do this with a built-in operation. Can't see anything in the ruby docs either. Simplest way to do this would be to extend the array class like this:
class Array
def difference(array2)
final_array = []
self.each do |item|
if array2.include?(item)
array2.delete_at(array2.find_index(item))
else
final_array << item
end
end
end
end
For all I know there's a more efficient way to do this, also
EDIT:
As suggested by user2864740 in question comments, using Array#slice! is a much more elegant solution
def arr_sub(a,b)
a = a.dup #if you want to preserve the original array
b.each {|del| a.slice!(a.index(del)) if a.include?(del) }
return a
end
Credit:
My original answer
def arr_sub(a,b)
b = b.each_with_object(Hash.new(0)){ |v,h| h[v] += 1 }
a = a.each_with_object([]) do |v, arr|
arr << v if b[v] < 1
b[v] -= 1
end
end
arr_sub([1,2,3,3],[1,2,3]) # a => [3]
arr_sub([1,2,3,3,4,4,4],[1,2,3,4,4]) # => [3, 4]
arr_sub([4,4,4,5,5,5,5],[4,4,5,5,5,5,6,6]) # => [4]

Creating an array from variables without nil values in Ruby

What's the most idiomatic way to create an array from several variables without nil values?
Given these variables:
a = 1
b = nil
c = 3
I would like to create an array ary:
ary #=> [1, 3]
I could use Array#compact:
ary = [a, b, c].compact
ary #=> [1, 3]
But putting everything in an array just to remove certain elements afterwards doesn't feel right.
Using if statements on the other hand produces more code:
ary = []
ary << a if a
ary << b if b
ary << c if c
ary #=> [1, 3]
Is there another or a preferred way or are there any advantages or drawbacks using either of the above?
PS: false doesn't necessarily have to be considered. The variables are either truthy (numbers / strings / arrays / hashes) or nil.
If you are concerned about performance, best way would be probably to use destructive #compact! to avoid allocating memory for second array.
I was hoping for a way to somehow "skip" the nil values during array creation. But after thinking about this for a while, I realized that this can't be achieved because of Ruby's way to handle multiple values. There's no concept of a "list" of values, multiple values are always represented as an array.
If you assign multiple values, Ruby creates an array:
ary = 1, nil, 3
#=> [1, nil, 3]
Same for a method taking a variable number of arguments:
def foo(*args)
args
end
foo(1, nil, 3)
#=> [1, nil, 3]
So even if I would patch Array with a class method new_without_nil, I would end up with:
def Array.new_without_nil(*values)
values.compact!
values
end
This just moves the code elsewhere.
Everything is an object
From an OO point of view, there's nothing special about nil - it's an object like any other. Therefore, removing nil's is not different from removing 1's.
Using a bunch of if statements on the other hand is something I'm trying to avoid when writing object oriented code. I prefer sending messages to objects.
Regarding "advantages or drawbacks":
[...] with compact / compact!
creates full array and shrinks it as needed
short code, often fits in one line
is easily recognized
evaluates each item once
faster (compiled C code)
[...] with << and if statements
creates empty array and grows it as needed
long code, one line per item
purpose might not be as obvious
items can easily be commented / uncommented
evaluates each item twice
slower (interpreted Ruby code)
Verdict:
I'll use compact, might have been obvious.
Here is a solution that uses a hash.
With these values put in an array:
a = 1; b = nil; c = 3; d = nil; e = 10;
ary = [a, b, c, d, e]
There are two nil items in the result which would require a compact to remove both "nil" items.
However the same variables added to a hash:
a = 1; b = nil; c = 3; d = nil; e = 10;
hash = {a => nil, b => nil, c => nil, d => nil, e => nil}
There is just one "nil" item in the result which can easily be removed by hash.delete(nil).

Counting matching elements in an array

Given two arrays of equal size, how can I find the number of matching elements disregarding the position?
For example:
[0,0,5] and [0,5,5] would return a match of 2 since there is one 0 and one 5 in common;
[1,0,0,3] and [0,0,1,4] would return a match of 3 since there are two matches of 0 and one match of 1;
[1,2,2,3] and [1,2,3,4] would return a match of 3.
I tried a number of ideas, but they all tend to get rather gnarly and convoluted. I'm guessing there is some nice Ruby idiom, or perhaps a regex that would be an elegant answer to this solution.
You can accomplish it with count:
a.count{|e| index = b.index(e) and b.delete_at index }
Demonstration
or with inject:
a.inject(0){|count, e| count + ((index = b.index(e) and b.delete_at index) ? 1 : 0)}
Demonstration
or with select and length (or it's alias – size):
a.select{|e| (index = b.index(e) and b.delete_at index)}.size
Demonstration
Results:
a, b = [0,0,5], [0,5,5] output: => 2;
a, b = [1,2,2,3], [1,2,3,4] output: => 3;
a, b = [1,0,0,3], [0,0,1,4] output => 3.
(arr1 & arr2).map { |i| [arr1.count(i), arr2.count(i)].min }.inject(0, &:+)
Here (arr1 & arr2) return list of uniq values that both arrays contain, arr.count(i) counts the number of items i in the array.
Another use for the mighty (and much needed) Array#difference, which I defined in my answer here. This method is similar to Array#-. The difference between the two methods is illustrated in the following example:
a = [1,2,3,4,3,2,4,2]
b = [2,3,4,4,4]
a - b #=> [1]
a.difference b #=> [1, 3, 2, 2]
For the present application:
def number_matches(a,b)
left_in_b = b
a.reduce(0) do |t,e|
if left_in_b.include?(e)
left_in_b = left_in_b.difference [e]
t+1
else
t
end
end
end
number_matches [0,0,5], [0,5,5] #=> 2
number_matches [1,0,0,3], [0,0,1,4] #=> 3
number_matches [1,0,0,3], [0,0,1,4] #=> 3
Using the multiset gem:
(Multiset.new(a) & Multiset.new(b)).size
Multiset is like Set, but allows duplicate values. & is the "set intersection" operator (return all things that are in both sets).
I don't think this is an ideal answer, because it's a bit complex, but...
def count(arr)
arr.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
end
def matches(a1, a2)
m = 0
a1_counts = count(a1)
a2_counts = count(a2)
a1_counts.each do |e, c|
m += [a1_counts, a2_counts].min
end
m
end
Basically, first write a method that creates a hash from an array of the number of times each element appears. Then, use those to sum up the smallest number of times each element appears in both arrays.

Methods of List comprehension, join a list with a different string at set intervals. (language agnostic)

I'm interested in possible ways that different languages can Join an array, but rather than using a single join string, using a different join string at given intervals.
For example (hypothetical language):
Array.modJoin([mod, char], ...)
e.g. [1,2,3,4,5].modJoin( [1, ","], [2, ":"] )
Where the arguments specify an array or object containing a modulo and a join char, the implementation would check which order of modulo took precedence (the latest one), and apply the join char. (requiring that the [mod,char]'s were provided in ascending mod order)
i.e.
if (index % (nth mod) == 0)
append (join char)
continue to next index
else
(nth mod)-- repeat
when complete join with ""
For example, I've come up with the following in Ruby, but I suspect better / more elegant methods exist, and that's what I'd like to see.
#Create test Array
#9472 - 9727 as HTML Entities (unicode box chars)
range = (9472..9727).to_a.map{|u| "&##{u};" }
Assuming we have a list of mod's and join chars, we have to stipulate that mods increase in value as the list progresses.
mod_joins = [{m:1, c:",", m:12, c:"<br/>"]
Now process the range with mod_joins
processed = range.each_with_index.map { |e, i|
m = nil
mods.reverse_each {|j|
m = j and break if i % j[:m] == 0
}
# use the lowest mod for index 0
m = mods[0] if i == 0
m = nil ? e : "#{e}#{m[:c]}"
}
#output the result string
puts processed.join ""
From this we have a list of htmlEntities, separated by , unless it's index is a 12th modulo in which case it's a <br/>
So, I'm interested for ways this can be done more elegantly, primarily in functional languages like Haskell, F#, Common Lisp (Scheme, Clojure) etc. but also cool ways to achieve this in general purpose languages that have list comprehension extensions such as C# with Linq, Ruby and Python or even Perl.
Here's a pure-functional version written in Python. I'm sure it can be adapted easily enough to other languages.
import itertools
def calcsep(num, sepspecs):
'''
num: current character position
sepspecs: dict containing modulus:separator entries
'''
mods = reversed(sorted(sepspecs))
return sepspecs[next(x for x in mods if num % x == 0)]
vector = [str(x) for x in range(12)]
result = [y for ix, el in enumerate(vector)
for y in (calcsep(ix, {1:'.', 3:',', 5:';'}), el)]
print ''.join(itertools.islice(result, 1, None))
Here’s a simpler and more readable solution in Ruby
array = (9472..9727).map{|u|"&##{u};"}
array.each_slice(12).collect{|each|each.join(",")}.join("<br/>")
or for the general case
module Enumerable
def fancy_join(instructions)
return self.join if instructions.empty?
separator = instructions.delete(mod = instructions.keys.max)
self.each_slice(mod).collect{|each|each.fancy_join(instructions.dup)}.join(separator)
end
end
range = (9472..9727).map{|u|"&##{u};"}
instructions = {1=>",",12=>"<br/>"}
puts array.fancy_join(instructions)

Resources