Ruby possible combination of array values - performance - arrays

I need to quickly determine the possible unique combinations of elements in an array, subject to a condition.
The elements have the following structure:
[[id,parent_id]]
I have no problems with smaller arrays. When all the parent_ids are unique, for example:
a = (1..6).to_a.map{ |a| [a,a] }
=> [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6]]
a.combination(3).size # => 20
the answer comes back immediately.
If I have ids with reoccurring parent_ids I can still use combination and iterate through all combinations.
a = (1..7).to_a.map{ |a| [a,a] };a[6] = [7,6]
=> [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], [7, 6]]
a.combination(3).size # => 35
valid_combos = a.combination(3).to_a.select { |c| c.map(&:last).uniq.size == c.size }.size # => 30
This is still quick on small arrays. But if the array has 33 entries with 1 recurring parent_id, I have to check 1166803110 combinations, which is of course slow.
Any ideas or hints on how to solve this quickly and efficiently are welcome.
I like the combination method of the Array class, but I would use a Hash or a Set too.
Also there could be arrays like:
a = [[1, 1], [2, 1], [3, 1], [4, 2], [5, 2], [6, 2], [7, 3], [8, 3]]
a.combination(3).size #=> 56
But only 18 are "valid".
Any help is appreciated.
EDIT:
Valid input with no recurring parent_ids:
[[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
Valid output with combination of 4 each (5 uniq combos):
[[[1, 1], [2, 2], [3, 3], [4, 4]], [[1, 1], [2, 2], [3, 3], [5, 5]], [[1, 1], [2, 2], [4, 4], [5, 5]], [[1, 1], [3, 3], [4, 4], [5, 5]], [[2, 2], [3, 3], [4, 4], [5, 5]]]
Valid input with 1 recurring parent_id:
[[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6,5]]
Valid output with combination of 4 each (9 uniq combos):
[[[1, 1], [2, 2], [3, 3], [4, 4]], [[1, 1], [2, 2], [3, 3], [5, 5]], [[1, 1], [2, 2], [3, 3], [6, 5]], [[1, 1], [2, 2], [4, 4], [5, 5]], [[1, 1], [2, 2], [4, 4], [6, 5]], [[1, 1], [3, 3], [4, 4], [5, 5]], [[1, 1], [3, 3], [4, 4], [6, 5]], [[2, 2], [3, 3], [4, 4], [5, 5]], [[2, 2], [3, 3], [4, 4], [6, 5]]]
These are the invalid combos, because [5,5] and [6,5] aren't allowed together:
[[[1, 1], [2, 2], [5, 5], [6, 5]], [[1, 1], [3, 3], [5, 5], [6, 5]], [[1, 1], [4, 4], [5, 5], [6, 5]], [[2, 2], [3, 3], [5, 5], [6, 5]], [[2, 2], [4, 4], [5, 5], [6, 5]], [[3, 3], [4, 4], [5, 5], [6, 5]]]

If I understand correctly, you want all possible combinations of ids where the ids don't share a parent id. I had a go at something different, just for fun, with no real idea if performance will improve.
x = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6,5]]
First, let's flip it into a hash that maps each parent ID to its child IDs.
hash = x.each_with_object({}) { |pair, h| (h[pair.last] ||= []).push(pair.first) }
#=> {1=>[1], 2=>[2], 3=>[3], 4=>[4], 5=>[5, 6]}
Now we get all possible combinations of the parent IDs.
parents = hash.keys.combination(4).to_a
#=> [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 4, 5], [1, 3, 4, 5], [2, 3, 4, 5]]
Now we map each parent ID to its child IDs.
children = parents.map do |array|
  array.map {|parent| hash[parent]}
end
#=> [[[1], [2], [3], [4]], [[1], [2], [3], [5, 6]], [[1], [2], [4], [5, 6]], [[1], [3], [4], [5, 6]], [[2], [3], [4], [5, 6]]]
We're knee deep in arrays at this point. Now, we take the product of each sub-array to get all possible combinations, and we don't even need to uniq them.
children.map {|array| array.first.product *array.drop(1)}.flatten(1)
#=> [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 4, 5], [1, 2, 4, 6], [1, 3, 4, 5], [1, 3, 4, 6], [2, 3, 4, 5], [2, 3, 4, 6]]
Now you have all combinations of ids, and could use those to look up parent ids if you still need them using the opposite of the hash table.
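As a minimal sketch of that reverse lookup, assuming ids are unique so that each id maps to exactly one parent id (the variable names parent_of and ids are illustrative, not from the answer):

```ruby
# Input copied from the example above.
x = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 5]]

# id => parent_id lookup, the opposite direction of the grouping hash.
# Assumes every id appears only once in x.
parent_of = x.to_h

# One of the id combinations produced above.
ids = [1, 2, 3, 6]

# Rebuild the [id, parent_id] pairs for that combination.
pairs = ids.map { |id| [id, parent_of[id]] }
p pairs
```

Array#to_h works here because every element is already a two-element [key, value] pair.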
What about performance? I benchmarked by running this file.
With 50 entries, 25 repeated, and combination of 4:
3957124
Original: 8.719000 0.110000 8.829000 ( 8.860909)
3957124
Simons: 4.875000 0.094000 4.969000 ( 6.458309)
So it looks quicker in theory. But, with 125 entries, 25 repeated, and combination of 4:
9811174
Original: 22.875000 0.281000 23.156000 ( 23.213483)
9811174
Simons: 20.703000 0.391000 21.094000 ( 21.232167)
Which is not much faster. This is because, for so many combinations, Ruby spends the majority of its time doing memory allocation (try watching in Task Manager or top), which in Ruby is dog-slow. There's not really any helpful way to allocate the memory up-front, so beyond a certain point you hit a hard limit.
But this only happens because you're forcing Ruby to collect all of the array items together at once. If your specific use case allows you to deal with each combination individually, you can avoid most of the memory allocation. By calling yield with every child array (this file):
9811174
Simons: 8.485000 0.000000 8.485000 ( 8.476653)
Much quicker. You will also observe the memory usage remains constant. It's still gonna take a while though. However, if you have multiple cores you could in principle parallelise because once you have the hash each combination can be worked on independently of the others. I'll leave that for you to try :)
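The benchmarked file itself is not reproduced here, so the following is only a sketch of what a yield-based variant could look like. The method name each_valid_combo is hypothetical, and it assumes size is at least 1.

```ruby
# Hypothetical helper: yields each valid combination of ids one at a time
# instead of collecting them all into one large array.
# Assumes size >= 1.
def each_valid_combo(pairs, size)
  return enum_for(:each_valid_combo, pairs, size) unless block_given?

  # parent_id => [child ids]
  children = pairs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(id, parent), h|
    h[parent] << id
  end

  # Array#combination with a block yields each combination without building a list.
  children.keys.combination(size) do |parents|
    first, *rest = parents.map { |parent| children[parent] }
    # Array#product with a block likewise yields each tuple as it is generated.
    first.product(*rest) { |combo| yield combo }
  end
end

x = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 5]]
count = 0
each_valid_combo(x, 4) { |_combo| count += 1 }
puts count
```

Because nothing is accumulated, each combination can be garbage-collected as soon as the caller's block is done with it.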

You can do that as follows.
Code
def combos(pairs, group_size)
  pairs.group_by(&:last).
        values.
        combination(group_size).
        flat_map { |a| a.shift.product(*a) }
end
Examples
pairs = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6,5]]
combos(pairs, 4)
#=> [[[1, 1], [2, 2], [3, 3], [4, 4]],
# [[1, 1], [2, 2], [3, 3], [5, 5]],
# [[1, 1], [2, 2], [3, 3], [6, 5]],
# [[1, 1], [2, 2], [4, 4], [5, 5]],
# [[1, 1], [2, 2], [4, 4], [6, 5]],
# [[1, 1], [3, 3], [4, 4], [5, 5]],
# [[1, 1], [3, 3], [4, 4], [6, 5]],
# [[2, 2], [3, 3], [4, 4], [5, 5]],
# [[2, 2], [3, 3], [4, 4], [6, 5]]]
combos(pairs, 5)
#=> [[[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]],
# [[1, 1], [2, 2], [3, 3], [4, 4], [6, 5]]]
combos(pairs, 1).size #=> 6
combos(pairs, 2).size #=> 14
combos(pairs, 3).size #=> 16
combos(pairs, 4).size #=> 9
combos(pairs, 5).size #=> 2
Explanation
For the array pairs used in the examples, and
group_size = 4
we perform the following calculations. First we group the elements of pairs by the last element of each pair (i.e., parent_id):
h = pairs.group_by(&:last)
#=> {1=>[[1, 1]], 2=>[[2, 2]], 3=>[[3, 3]], 4=>[[4, 4]], 5=>[[5, 5], [6, 5]]}
We only need the values from this hash:
b = h.values
#=> [[[1, 1]], [[2, 2]], [[3, 3]], [[4, 4]], [[5, 5], [6, 5]]]
We now obtain combinations of the elements of b:
enum = b.combination(group_size)
#=> b.combination(4)
#=> #<Enumerator: [[[1, 1]], [[2, 2]], [[3, 3]], [[4, 4]],
# [[5, 5], [6, 5]]]:combination(4)>
We can view the (5) elements of this enumerator by converting it to an array:
enum.to_a
#=> [[[[1, 1]], [[2, 2]], [[3, 3]], [[4, 4]]],
# [[[1, 1]], [[2, 2]], [[3, 3]], [[5, 5], [6, 5]]],
# [[[1, 1]], [[2, 2]], [[4, 4]], [[5, 5], [6, 5]]],
# [[[1, 1]], [[3, 3]], [[4, 4]], [[5, 5], [6, 5]]],
# [[[2, 2]], [[3, 3]], [[4, 4]], [[5, 5], [6, 5]]]]
The last step is to map each element of enum to the product of its elements (each element of enum being an array of pairs). We use Enumerable#flat_map so we don't have to subsequently do any flattening:
enum.flat_map { |a| a.shift.product(*a) }
returns the array given in the examples for group_size = 4.
Let's look more carefully at what is happening in the last statement:
enum1 = enum.flat_map
#=> #<Enumerator: #<Enumerator: [[[1, 1]], [[2, 2]], [[3, 3]], [[4, 4]],
# [[5, 5], [6, 5]]]:combination(4)>:flat_map>
You might want to think of enum1 as a "compound enumerator". The elements of enum1 are passed into its block by Enumerator#each (which will call Array#each) and assigned to the block variable a. Let's look at the second value passed to the block.
Skip the first:
a = enum1.next
#=> [[[1, 1]], [[2, 2]], [[3, 3]], [[4, 4]]]
Pass in the second:
a = enum1.next
#=> [[[1, 1]], [[2, 2]], [[3, 3]], [[5, 5], [6, 5]]]
We take the product of these four arrays as follows:
a[0].product(a[1], a[2], a[3])
#=> [[[1, 1], [2, 2], [3, 3], [5, 5]],
# [[1, 1], [2, 2], [3, 3], [6, 5]]]
which we could also write:
a[0].product(*a[1..-1])
or, as I have done:
a.shift.product(*a)
Note that, in the last expression, a of *a is what's left of a after a.shift is executed.
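If mutating the block argument with shift feels uncomfortable, a destructuring assignment gives the same result. The sketch below uses a hypothetical name, combos_pure, and cross-checks it against the brute-force filter from the question on the small example only.

```ruby
# Non-mutating variant of combos (combos_pure is an illustrative name).
def combos_pure(pairs, group_size)
  pairs.group_by(&:last).values.combination(group_size).flat_map do |groups|
    first, *rest = groups      # same split as a.shift / *a, without mutation
    first.product(*rest)
  end
end

pairs = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 5]]

# Brute-force filter from the question, used only as a reference result.
brute = pairs.combination(4).select { |c| c.map(&:last).uniq.size == c.size }

p combos_pure(pairs, 4).sort == brute.sort
```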

Related

Show only the combinations of two permutated arrays that have a sum less than or equal to target number

I have two arrays:
teams = [1,2,3] and drivers = [4,5,6]. Using permutations, I have managed to show all combinations of the two arrays and to define how many values to use from each array: from 'Teams' I have used 1 value and from 'Drivers' I have used two. I would like to show only the combinations where the sum is less than or equal to 10, and to remove any duplicates.
teams = [1,2,3]
drivers = [4,5,6]
team = teams.permutation(1).to_a
driver = drivers.permutation(2).to_a
array = team.product(driver)
target = 10
This is successfully outputting all combinations of the two arrays using 1 number from teams and 2 from drivers as follows:
[[1], [4, 5]], [[1], [4, 6]], [[1], [5, 4]], [[1], [5, 6]], [[1], [6, 4]], [[1], [6, 5]], [[2], [4, 5]], etc...
To only show values less than or equal to 10 my expected outcome would be: [[1], [4, 5]], [[1], [5, 4]],
and then no duplicates would leave me with just:
[[1], [4, 5]]
I have tried adding the below line of code but am getting an undefined method `<=' error:
#array = array[0].product(*array[1..-1]).select { |a| a.reduce(:+) <= target }
I have also tried this with no luck:
result = array.combination(1).select{|combi| combi.sum <= target}
#array = result
I'm guessing it's something to do with the permutation?
teams = [1,2,3]
drivers = [2,5,4,5,6,4,5,7]
max_driver_sum = 10
I have assumed that drivers can contain duplicate elements (as in my example), but I will explain at the end how the calculations would simplify if there are no duplicates.
As a first step let's partition drivers between values that are repeated and those that are not.
counts = drivers.tally
#=> {2=>1, 5=>3, 4=>2, 6=>1, 7=>1}
dup_drivers, uniq_drivers = counts.partition { |_d,n| n > 1 }
                                  .map { |arr| arr.map(&:first) }
#=> [[5, 4], [2, 6, 7]]
Therefore,
dup_drivers
#=> [5, 4]
uniq_drivers
#=> [2, 6, 7]
See Enumerable#tally and Enumerable#partition.
Here,
counts.partition { |_d,n| n > 1 }
#=> [[[5, 3], [4, 2]], [[2, 1], [6, 1], [7, 1]]]
First compute the unique combinations in which the two drivers are equal:
dup_combos = teams.each_with_object([]) do |t,arr|
  max_driver = (max_driver_sum - t)/2
  dup_drivers.each do |d|
    arr << [[t],[d,d]] if d <= max_driver
  end
end
#=> [[[1], [4, 4]], [[2], [4, 4]]]
Next, compute the unique combinations in which the two drivers are not equal:
all_uniq = uniq_drivers + dup_drivers
#=> [2, 6, 7, 5, 4]
all_uniq_combos = all_uniq.combination(2).to_a
#=> [[2, 6], [2, 7], [2, 5], [2, 4], [6, 7], [6, 5],
# [6, 4], [7, 5], [7, 4], [5, 4]]
uniq_combos = teams.each_with_object([]) do |t,arr|
  adj_driver_sum = max_driver_sum - t
  all_uniq_combos.each do |combo|
    arr << [[t],combo] if combo.sum <= adj_driver_sum
  end
end
#=> [[[1], [2, 6]], [[1], [2, 7]], [[1], [2, 5]], [[1], [2, 4]],
# [[1], [5, 4]], [[2], [2, 6]], [[2], [2, 5]], [[2], [2, 4]],
# [[3], [2, 5]], [[3], [2, 4]]]
See Array#combination.
The final step is to combine the two groups of combinations:
a1 = dup_combos + uniq_combos
#=> [[[1], [4, 4]], [[2], [4, 4]], [[1], [2, 6]], [[1], [2, 7]],
# [[1], [2, 5]], [[1], [2, 4]], [[1], [5, 4]], [[2], [2, 6]],
# [[2], [2, 5]], [[2], [2, 4]], [[3], [2, 5]], [[3], [2, 4]]]
Sorted, this result is as follows.
a1.sort
#=> [[[1], [2, 4]], [[1], [2, 5]], [[1], [2, 6]], [[1], [2, 7]],
# [[1], [4, 4]], [[1], [5, 4]],
# [[2], [2, 4]], [[2], [2, 5]], [[2], [2, 6]], [[2], [4, 4]],
# [[3], [2, 4]], [[3], [2, 5]]]
Notice that Array#uniq was not used in the foregoing. If desired, one could of course substitute out some of the variables above.
If drivers contains no duplicates the desired array is given by uniq_combos where all_uniq is replaced by drivers in the calculation of all_uniq_combos. If, for example,
teams = [1,2,3]
drivers = [2,5,4,6,7]
max_driver_sum = 10
then
all_uniq_combos = drivers.combination(2).to_a
#=> [[2, 5], [2, 4], [2, 6], [2, 7], [5, 4], [5, 6],
# [5, 7], [4, 6], [4, 7], [6, 7]]
combos = teams.each_with_object([]) do |t,arr|
  adj_driver_sum = max_driver_sum - t
  all_uniq_combos.each do |combo|
    arr << [[t],combo] if combo.sum <= adj_driver_sum
  end
end
#=> [[[1], [2, 5]], [[1], [2, 4]], [[1], [2, 6]], [[1], [2, 7]],
# [[1], [5, 4]], [[2], [2, 5]], [[2], [2, 4]], [[2], [2, 6]],
# [[3], [2, 5]], [[3], [2, 4]]]
combos.sort
#=> [[[1], [2, 4]], [[1], [2, 5]], [[1], [2, 6]], [[1], [2, 7]],
# [[1], [5, 4]],
# [[2], [2, 4]], [[2], [2, 5]], [[2], [2, 6]],
# [[3], [2, 4]], [[3], [2, 5]]]
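For comparison, here is a more compact sketch that does rely on Array#uniq, which the calculation above deliberately avoids. The helper name team_driver_combos and its signature are illustrative assumptions. Each driver pair is sorted, so [5, 4] appears as [4, 5].

```ruby
# Hypothetical helper; the name and signature are illustrative only.
def team_driver_combos(teams, drivers, max_sum)
  # Sorting each pair makes [5, 4] and [4, 5] identical, so uniq can drop repeats.
  driver_pairs = drivers.combination(2).map(&:sort).uniq
  teams.product(driver_pairs)
       .select { |team, pair| team + pair.sum <= max_sum }
       .map { |team, pair| [[team], pair] }
end

teams = [1, 2, 3]
drivers = [2, 5, 4, 5, 6, 4, 5, 7]
result = team_driver_combos(teams, drivers, 10)
p result.sort
```

This trades the tally/partition bookkeeping for a uniq pass over all driver pairs, which is simpler but does more work when drivers contains many repeated values.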
Here's an approach
teams = [1, 2, 3]
drivers = [2, 6, 5, 4]
team = teams.permutation(1).to_a
driver = drivers.permutation(2).to_a
array = team.product(driver)
target = 10
res = array.select {|i| i.map(&:sum).sum <= target}.compact
==> [[[1], [2, 6]], [[1], [2, 5]], [[1], [2, 4]], [[1], [6, 2]],
[[1], [5, 2]], [[1], [5, 4]], [[1], [4, 2]], [[1], [4, 5]],
[[2], [2, 6]], [[2], [2, 5]], [[2], [2, 4]], [[2], [6, 2]],
[[2], [5, 2]], [[2], [4, 2]], [[3], [2, 5]], [[3], [2, 4]],
[[3], [5, 2]], [[3], [4, 2]]]
Getting the unique items (modified to also work for values of teams > drivers)
t1 = res.map {|i| i[0]}
d2 = res.map {|i| i[1].flatten.sort}
t1.zip(d2).uniq
==> [[[1], [2, 6]], [[1], [2, 5]], [[1], [2, 4]], [[1], [4, 5]],
[[2], [2, 6]], [[2], [2, 5]], [[2], [2, 4]], [[3], [2, 5]],
[[3], [2, 4]]]

What is the best way to merge two arrays (element + element), if elements itself are arrays

I have two nested arrays with equal size:
Array1 =[[1, 2], [], [2, 3]]
Array2= [[1, 4], [8, 11], [3, 6]]
I need to merge them in one array, like this:
Array = [[1,2,1,4], [8,11], [2,3,3,6]],
so each elements of new Array[x] = Array1[x] + Array2[x]
I understand how to do it with a for/each loop, but I am sure Ruby has an elegant solution for that. It is also acceptable for the solution to modify Array1 in place.
Array1.each_index.map { |i| Array1[i] + Array2[i] }
#=> [[1,2,1,4], [8,11], [2,3,3,6]]
This has the advantage that it avoids the creation of a temporary array [Array1, Array2].transpose or Array1.zip(Array2).
[Array1, Array2].transpose.map(&:flatten)
=> [[1, 2, 1, 4], [8, 11], [2, 3, 3, 6]]
RubyGuides: "Turn Rows Into Columns With The Ruby Transpose Method"
Each step explained:
[Array1, Array2]
=> [[[1, 2], [], [2, 3]],
[[1, 4], [8, 11], [3, 6]]]
Create a grid-like array.
[Array1, Array2].transpose
=> [[[1, 2], [1, 4]], [[], [8, 11]], [[2, 3], [3, 6]]]
transpose switches rows and columns (close to what we want)
[Array1, Array2].transpose.map(&:flatten)
=> [[1, 2, 1, 4], [8, 11], [2, 3, 3, 6]]
flatten gets rid of the unnecessary nested arrays (here combined with map to access nested arrays)
I would do something like:
array1 =[[1, 2], [], [2, 3]]
array2= [[1, 4], [8, 11], [3, 6]]
array1.zip(array2).map(&:flatten)
# => [[1, 2, 1, 4], [8, 11], [2, 3, 3, 6]]
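One caveat with map(&:flatten): it flattens every level, so it also unpacks elements that are themselves arrays. A sketch of a pairwise concatenation that leaves deeper nesting alone (nested1 and nested2 are made-up inputs for illustration):

```ruby
array1 = [[1, 2], [], [2, 3]]
array2 = [[1, 4], [8, 11], [3, 6]]

# Concatenate pairwise; unlike map(&:flatten), deeper nesting is left intact.
merged = array1.zip(array2).map { |a, b| a + b }
p merged

# Where the difference shows up: sub-arrays whose elements are arrays too.
nested1 = [[[1, 2]], []]
nested2 = [[3], [[4]]]
kept      = nested1.zip(nested2).map { |a, b| a + b }
flattened = nested1.zip(nested2).map(&:flatten)
p kept
p flattened
```

For the flat integer sub-arrays in the question both forms give the same result.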

How to convert multidimensional array into a 2 dimensional array?

Having the following nested array
[[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5]], [[1, 0], [1, 1], [1, 2], [1, 3], [1, 4], [1, 5]], [[2, 0], [2, 1], [2, 2], [2, 3], [2, 4], [2, 5]], [[3, 0], [3, 1], [3, 2], [3, 3], [3, 4], [3, 5]], [[4, 0], [4, 1], [4, 2], [4, 3], [4, 4], [4, 5]], [[5, 0], [5, 1], [5, 2], [5, 3], [5, 4], [5, 5]]]
I'd like to remove subarray containers until it becomes a 2 dimensional array like:
[[0,0], [5,1], [5,4]...]
.flatten removes everything and I need to keep the groups of 2 within subarrays.
Array#flatten takes an optional depth argument. Also, next time you can try reading the documentation :)
a = [[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5]], [[1, 0], [1, 1], [1, 2], [1, 3], [1, 4], [1, 5]], [[2, 0], [2, 1], [2, 2], [2, 3], [2, 4], [2, 5]], [[3, 0], [3, 1], [3, 2], [3, 3], [3, 4], [3, 5]], [[4, 0], [4, 1], [4, 2], [4, 3], [4, 4], [4, 5]], [[5, 0], [5, 1], [5, 2], [5, 3], [5, 4], [5, 5]]]
a.flatten(1)
#=> [[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5], [1, 0], [1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [2, 0], [2, 1], [2, 2], [2, 3], [2, 4], [2, 5], [3, 0], [3, 1], [3, 2], [3, 3], [3, 4], [3, 5], [4, 0], [4, 1], [4, 2], [4, 3], [4, 4], [4, 5], [5, 0], [5, 1], [5, 2], [5, 3], [5, 4], [5, 5]]
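A smaller input makes the effect of the depth argument easier to see. This is just an illustrative sketch on a 2x2 grid:

```ruby
a = [[[0, 0], [0, 1]], [[1, 0], [1, 1]]]

all_levels = a.flatten                    # every level removed
one_level  = a.flatten(1)                 # only the outermost level removed
via_map    = a.flat_map { |row| row }     # same as flatten(1) for this input

p all_levels
p one_level
p via_map
```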

Joining two ranges into 2d array Ruby

How do I join two ranges into a 2d array as such in ruby? Using zip doesn't provide the result I need.
(0..2) and (0..2)
# should become => [[0,0],[0,1],[0,2], [1,0],[1,1],[1,2], [2,0],[2,1],[2,2]]
Ruby has a built in method for this: repeated_permutation.
(0..2).to_a.repeated_permutation(2).to_a
I'm puzzled. Here it is a day after the question was posted and nobody has suggested the obvious: Array#product:
[*0..2].product [*1..3]
#=> [[0, 1], [0, 2], [0, 3], [1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3]]
range_a = (0..2)
range_b = (5..8)
def custom_join(a, b)
  a.inject([]){|carry, a_val| carry += b.collect{|b_val| [a_val, b_val]}}
end
p custom_join(range_a, range_b)
Output:
[[0, 5], [0, 6], [0, 7], [0, 8], [1, 5], [1, 6], [1, 7], [1, 8], [2, 5], [2, 6], [2, 7], [2, 8]]
A straightforward solution:
range_a = (0..2)
range_b = (5..8)
def custom_join(a, b)
  # each (not map) is enough here, since only the side effect on result is used
  [].tap { |result| a.each { |i| b.each { |j| result << [i, j] } } }
end
p custom_join(range_a, range_b)
Output:
[[0, 5], [0, 6], [0, 7], [0, 8], [1, 5], [1, 6], [1, 7], [1, 8], [2, 5], [2, 6], [2, 7], [2, 8]]
Simply, this will do it:
a = (0..2).to_a
b = (0..2).to_a
result = []
a.each { |ae| b.each { |be| result << [ae, be] } }
p result
# => [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2]]
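The approaches in these answers can be compared side by side for the ranges in the question. This is only a sketch: repeated_permutation applies when both ranges are the same, while product and the nested flat_map/map form also work for two different ranges.

```ruby
range_a = (0..2)
range_b = (0..2)

via_product  = range_a.to_a.product(range_b.to_a)
# Ranges are Enumerable, so flat_map/map work without converting to arrays first.
via_flat_map = range_a.flat_map { |i| range_b.map { |j| [i, j] } }
# Only equivalent because range_a and range_b cover the same values.
via_repeated = range_a.to_a.repeated_permutation(2).to_a

p via_product
p via_product == via_flat_map
p via_product == via_repeated
```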

Combining a multidimensional Ruby array to get a two-dimensional array

I have a multidimensional array like:
[[1, 1, 4], [2],[2, 3]]
How can I get every combination of two elements, except the combinations of elements within the same array: [1, 1],[1, 4],[2, 3]
I want to get:
[1, 2],[1, 3],[4, 2],[4, 3],[2, 3]
Thanks.
Short answer is:
[[1, 1, 4], [2],[2, 3]].combination(2).flat_map {|x,y| x.product(y)}.uniq
# => [[1, 2], [4, 2], [1, 3], [4, 3], [2, 2], [2, 3]]
Step by step
step1 = [[1, 1, 4], [2],[2, 3]].combination(2).to_a
# => [[[1, 1, 4], [2]], [[1, 1, 4], [2, 3]], [[2], [2, 3]]]
step2 = step1.flat_map {|x,y| x.product(y)}
# => [[1, 2], [1, 2], [4, 2], [1, 2], [1, 3], [1, 2], [1, 3], [4, 2], [4, 3], [2, 2], [2, 3]]
result = step2.uniq
# => [[1, 2], [4, 2], [1, 3], [4, 3], [2, 2], [2, 3]]
Update
For full uniqueness you could use:
[[1, 1, 4], [2],[2, 3, 4]].combination(2).flat_map {|x,y| x.product(y)}.map(&:sort).uniq
arr = [[1, 1, 4], [2], [2, 3]]
a = arr.map(&:uniq)
(a.size-1).times.flat_map { |i| a[i].product(a[i+1..-1].flatten.uniq) }.uniq
#=> [[1,2],[1,3],[4,2],[4,3],[2,2],[2,3]]
Here's another way that uses the method Array#difference that I defined here:
arr.flatten.combination(2).to_a.difference(arr.flat_map { |a| a.combination(2).to_a }).uniq
Array#difference is similar to Array#-. The difference is illustrated in the following example:
a = [1,2,3,4,3,2,2,4]
b = [2,3,4,4,4]
a - b #=> [1]
a.difference b #=> [1, 3, 2, 2]
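The definition the answer links to is not included here, so the following is only a guess at its behaviour, reconstructed from the example above: each element of b removes one matching occurrence from a. It is written as a standalone method with a hypothetical name, multiset_difference, because Ruby 2.6 and later ship a built-in Array#difference that behaves like Array#- instead. The sketch uses Enumerable#tally, which requires Ruby 2.7 or later.

```ruby
# Hypothetical re-creation of the custom method described above.
# Each element of b cancels at most one occurrence of the same value in a.
def multiset_difference(a, b)
  remaining = b.tally                  # value => number of copies still to remove
  a.reject do |e|
    if remaining.fetch(e, 0) > 0
      remaining[e] -= 1
      true                             # drop this occurrence
    end                                # otherwise nil, so the element is kept
  end
end

a = [1, 2, 3, 4, 3, 2, 2, 4]
b = [2, 3, 4, 4, 4]
p a - b
p multiset_difference(a, b)
```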
