Yielding partitions of a multiset with Ruby

I would like to get all the possible partitions (disjoint subsets of a set whose union is the original set) of a multiset (a collection in which some elements are equal and indistinguishable from each other).
The simpler case is yielding the partitions of an ordinary set, in which no element has multiplicity; in other words, all elements are different. For this scenario I found this Ruby code on StackOverflow, which is very efficient because it does not store all the possible partitions but yields them to a block:
def partitions(set)
  yield [] if set.empty?
  (0 ... 2 ** set.size / 2).each do |i|
    parts = [[], []]
    set.each do |item|
      parts[i & 1] << item
      i >>= 1
    end
    partitions(parts[1]) do |b|
      result = [parts[0]] + b
      result = result.reject do |e|
        e.empty?
      end
      yield result
    end
  end
end
Example:
partitions([1,2,3]){|e| puts e.inspect}
outputs:
[[1, 2, 3]]
[[2, 3], [1]]
[[1, 3], [2]]
[[3], [1, 2]]
[[3], [2], [1]]
There are 5 different partitions of the set [1,2,3] (this count is the Bell number: https://en.wikipedia.org/wiki/Bell_number).
However, if the set is in fact a multiset, containing elements with multiplicity, then the above code of course doesn't work:
partitions([1,1,2]){|e| puts e.inspect}
outputs:
[[1, 1, 2]]
[[1, 2], [1]] *
[[1, 2], [1]] *
[[2], [1, 1]]
[[2], [1], [1]]
One can see two identical partitions, denoted with *, which should be yielded only once.
My question is: how can I modify the partitions() method to work with multisets too, or how can I filter out the duplicate partitions in an efficient way? Do identical partitions always come one right after the other?
My goal is to organize images with different aspect ratios into a montage, where the picture rows of the montage would be those set partitions. I would like to minimize the difference between the heights of the picture rows (or, equivalently, their standard deviation) among the possible partitions, but there are often pictures with the same aspect ratio, which is why I am trying to deal with a multiset.
Yielding not partitions but powersets (all possible subsets) of a multiset, filtering out the duplicates by simple memoization:
Montage optimization by backtracking on YouTube

You could put it in an array and use uniq:
arr = []
partitions([1,1,2]) { |e| arr << e }
puts arr.to_s
#-> [[[1, 1, 2]], [[1, 2], [1]], [[1, 2], [1]], [[2], [1, 1]], [[2], [1], [1]]]
puts arr.uniq.to_s
#-> [[[1, 1, 2]], [[1, 2], [1]], [[2], [1, 1]], [[2], [1], [1]]]
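If collecting every partition in an array first is too costly, the duplicates can also be filtered while streaming. The following is a minimal sketch, not a tuned solution: unique_partitions is a hypothetical wrapper name, it reuses the partitions method from the question, and it assumes the elements are mutually comparable so that each partition can be reduced to a sorted canonical key. The key also catches duplicates that differ only in the order of their parts, which plain uniq would treat as different.

```ruby
require 'set'

# The generator from the question, unchanged in behavior.
def partitions(set)
  yield [] if set.empty?
  (0...2**set.size / 2).each do |i|
    parts = [[], []]
    set.each do |item|
      parts[i & 1] << item
      i >>= 1
    end
    partitions(parts[1]) do |b|
      result = ([parts[0]] + b).reject(&:empty?)
      yield result
    end
  end
end

# Hypothetical wrapper: yields each distinct multiset partition once.
def unique_partitions(multiset)
  return enum_for(:unique_partitions, multiset) unless block_given?

  seen = Set.new
  partitions(multiset) do |parts|
    key = parts.map(&:sort).sort   # canonical form of this partition
    yield parts if seen.add?(key)  # add? returns nil if the key was already seen
  end
end

unique_partitions([1, 1, 2]) { |e| puts e.inspect }
# [[1, 1, 2]]
# [[1, 2], [1]]
# [[2], [1, 1]]
# [[2], [1], [1]]
```

Note that this still generates every candidate partition and keeps one key per distinct partition in memory, so it only removes the intermediate array, not the underlying work.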

Related

Find all array with second highest elements in a list

Assume that I have a list of arrays in Python 3.2, and I want to output an array that contains every array whose second element is the highest, together with its index position in the list. How can I achieve this in the most scalable way (i.e., without using a nested for-loop)?
Input
a = [[2,3], [1,4,5], [1,4,6,2], [3,3,5], [9,4]]
Expected Output
res = [[[1,4,5], 1], [[1, 4, 6,2], 2], [[9,4], 4]]
Can someone please help me do this without using a nested for-loop?

You could do:
b = max(a, key=lambda x:x[1])[1]
[[j, i] for i, j in enumerate(a) if j[1]==b]
Out[6]: [[[1, 4, 5], 1], [[1, 4, 6, 2], 2], [[9, 4], 4]]

Calculating the sum of integers in a nested array, Ruby

I'm fairly new to learning Ruby, so please bear with me. I am working on a 7 kyu Ruby coding challenge and I've been tasked with finding how many people are left on the bus (the first value represents people getting on, the second value people getting off). Please look at the comments in the code for more detail.
Below is a test example:
([[10, 0], [3, 5], [5, 8]]) # => should return 5
This is my solution so far:
def number(bus_stops)
bus_stops.each{ | on, off | on[0] -= off[1] }
end
bus_stops
# loop through the array
# for the first array in the nested array subtract second value from first
# add the sum of last nested array to first value of second array and repeat
# subtract value of last element in nested array and repeat
How can I approach this? any resources you would recommend?

There are many ways to achieve this. Here is one with inject:
arr.map { |inner_array| inner_array.inject(&:-) }.inject(&:+)
The map step iterates over the inner arrays and calculates, for each stop, the net change in the number of people on the bus (this can be a negative integer). For the example it returns
[10, -2, -3]
which corresponds to [10 on, none off], [3 on, 5 off], [5 on, 8 off].
The final inject then puts a + operator between the elements to sum them into the number of people left on the bus. This only works if you start counting from 0 people on the bus.
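The two steps described above can be written out separately. This is only an illustrative sketch; the names net_changes and remaining are made up here, and the explicit 0 passed to inject encodes the assumption that the bus starts empty.

```ruby
arr = [[10, 0], [3, 5], [5, 8]]

# Net change at each stop: people getting on minus people getting off.
net_changes = arr.map { |on, off| on - off }
# => [10, -2, -3]

# Sum of the net changes; 0 is the assumed starting head count.
remaining = net_changes.inject(0, :+)
# => 5
```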

Here are two other ways to compute the desired result.
arr = [[10, 0], [3, 5], [5, 8]]
Use Array#transpose
arr.transpose.map(&:sum).reduce(:-)
#=> 5
The steps are as follows.
a = arr.transpose
#=> [[10, 3, 5], [0, 5, 8]]
b = a.map(&:sum)
#=> [18, 13] ([total ons, total offs])
b.reduce(:-)
#=> 5
Use Matrix methods
require 'matrix'
(Matrix.row_vector([1] * arr.size) * Matrix[*arr] * Matrix.column_vector([1,-1]))[0,0]
#=> 5
The steps are as follows.
a = [1] * arr.size
#=> [1, 1, 1]
b = Matrix.row_vector(a)
#=> Matrix[[1, 1, 1]]
c = Matrix[*arr]
#=> Matrix[[10, 0], [3, 5], [5, 8]]
d = b * c
#=> Matrix[[18, 13]]
e = Matrix.column_vector([1,-1])
#=> Matrix[[1], [-1]]
f = d * e
#=> Matrix[[5]]
f[0,0]
#=> 5
See Matrix::[], Matrix::row_vector, Matrix::column_vector and Matrix#[]. Notice that the instance method [] is documented in Object.

sum takes a block, which is really simple in this case:
arr = [[10, 0], [3, 5], [5, 8]]
p arr.sum{|on, off| on - off} # => 5
So you were very close.
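Put back into the method from the question, that could look like the sketch below. It keeps the asker's method name number and assumes every inner array has exactly two entries, [on, off].

```ruby
# Possible completed version of the method from the question.
def number(bus_stops)
  # Destructure each stop into on/off and sum the net changes.
  bus_stops.sum { |on, off| on - off }
end

number([[10, 0], [3, 5], [5, 8]]) # => 5
```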

Algorithm Logic, Splitting Arrays

I'm not looking for a full solution, just pseudocode or logic that would help me derive an answer.
Given an array:
[1,2,3,4]
I want to split this into two arrays of varying lengths and contents whose combined length equals the length of the given array. Ideally there would be no repetition.
Example output:
[[1],[2, 3, 4]]
[[1, 2], [3, 4]]
[[1, 3], [2, 4]]
[[1, 4],[2, 3]]
[[1, 2, 3], [4]]
[[2], [1, 3, 4]]
[[2, 4], [1, 3]]
[[3], [1, 2, 4]]
More example:
[[1, 3, 4, 6, 8], [2, 5, 7]] // this is a possible combination of a 1 through 8 array
Intuitions:
First attempt involved pushing the starting number array[i] to the result array[0], the second loop moving the index for the third loop to start iterating as is grabbed sublists. Then fill the other list with remaining indices. Was poorly conceived...
Second idea is permutations. Write an algorithm that reorganizes the array into every possible combination. Then, perform the same split operation on those lists at different indexes keeping track of unique lists as strings in a dictionary.
[1,2,3,4,5,6,7,8]
^
split
[1,2,3,4,5,6,7,8]
^
split
[1,3,4,5,6,7,8,2]
^
split
I'm confident that this will produce the lists I'm looking for. However, I'm afraid it may be less efficient than I'd like, because checking for duplicates requires sorting and generating permutations is expensive in the first place.
Please respond with how you would approach this problem, and why.

Pseudocode. The idea is to start with an item in one of the bags, and then to place the next item once in the same bag, once in the other.
function f(A):
    // Recursive function to collect arrangements
    function g(l, r, i):
        // Base case: no more items
        if i == length(A):
            return [[l, r]]
        // Place the item in the left bag
        return g(l with A[i], r, i + 1)
            // Also return a version where the item
            // is placed in the right bag
            concatenated with g(l, r with A[i], i + 1)
    // Check that we have at least one item
    if A is empty:
        return []
    // Start the recursion with one item placed
    return g([A[0]], [], 1)
(PS see revisions for JavaScript code.)
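For readers who want to run the idea, here is one possible Ruby rendering of the pseudocode. It is a sketch rather than the answerer's own code: the names two_way_splits and place are invented for illustration. Like the pseudocode, it also returns the split whose right bag is empty; drop that entry if both bags must be non-empty.

```ruby
# Hypothetical Ruby rendering of f(A) / g(l, r, i) from the pseudocode.
def two_way_splits(items)
  return [] if items.empty?

  # place is the recursive helper g: it puts items[i] into the left bag,
  # then into the right bag, and concatenates both sets of results.
  place = lambda do |left, right, i|
    return [[left, right]] if i == items.length

    place.call(left + [items[i]], right, i + 1) +
      place.call(left, right + [items[i]], i + 1)
  end

  # Fix the first item in the left bag so mirrored splits are not repeated.
  place.call([items[0]], [], 1)
end

two_way_splits([1, 2, 3])
# => [[[1, 2, 3], []], [[1, 2], [3]], [[1, 3], [2]], [[1], [2, 3]]]
```

Because the first item is pinned to the left bag, an array of n items produces 2**(n - 1) splits and no sorting or duplicate check is needed.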

Prevent identical pairs when shuffling and slicing Ruby array

I'd like to prevent producing pairs with the same items when producing a random set of pairs in a Ruby array.
For example:
[1,1,2,2,3,4].shuffle.each_slice(2).to_a
might produce:
[[1, 1], [3, 4], [2, 2]]
I'd like to be able to ensure that it produces a result such as:
[[4, 1], [1, 2], [3, 2]]
Thanks in advance for the help!

arr = [1,1,2,2,3,4]
loop do
  sliced = arr.shuffle.each_slice(2).to_a
  break sliced if sliced.none? { |a| a.reduce(:==) }
end

Here are three ways to produce the desired result (not including the approach of sampling repeatedly until a valid sample is found). The following array will be used for illustration.
arr = [1,4,1,2,3,2,1]
Use Array#combination and Array#sample
If pairs sampled were permitted to have the same number twice, the sample space would be
arr.combination(2).to_a
#=> [[1, 4], [1, 1], [1, 2], [1, 3], [1, 2], [1, 1], [4, 1], [4, 2],
# [4, 3], [4, 2], [4, 1], [1, 2], [1, 3], [1, 2], [1, 1], [2, 3],
# [2, 2], [2, 1], [3, 2], [3, 1], [2, 1]]
The pairs containing the same value twice (here [1, 1] and [2, 2]) are not wanted, so they are simply removed from the above array.
sample_space = arr.combination(2).reject { |x,y| x==y }
#=> [[1, 4], [1, 2], [1, 3], [1, 2], [4, 1], [4, 2], [4, 3],
# [4, 2], [4, 1], [1, 2], [1, 3], [1, 2], [2, 3], [2, 1],
# [3, 2], [3, 1], [2, 1]]
We evidently are to sample arr.size/2 elements from sample_space. Depending on whether this is to be done with or without replacement we would write
sample_space.sample(arr.size/2)
#=> [[4, 3], [1, 2], [1, 3]]
for sampling without replacement and
Array.new(arr.size/2) { sample_space.sample }
#=> [[1, 3], [4, 1], [2, 1]]
for sampling with replacement.
Sample elements of each pair sequentially, Method 1
This method, like the next, can only be used to sample with replacement.
Let's first consider sampling a single pair. We could do that by selecting the first element of the pair randomly from arr, removing all instances of that element from arr and then sampling the second element from what's left of arr.
def sample_one_pair(arr)
  first = arr.sample
  [first, (arr - [first]).sample]
end
To draw a sample of arr.size/2 pairs we then execute the following.
Array.new(arr.size/2) { sample_one_pair(arr) }
#=> [[1, 2], [4, 3], [1, 2]]
Sample elements of each pair sequentially, Method 2
This method is a very fast way of sampling large numbers of pairs with replacement. Like the previous method, it cannot be used to sample without replacement.
First, compute the cdf (cumulative distribution function) for drawing an element of arr at random.
counts = arr.group_by(&:itself).transform_values { |v| v.size }
#=> {1=>3, 4=>1, 2=>2, 3=>1}
def cdf(sz, counts)
  frac = 1.0/sz
  counts.each_with_object([]) { |(k,v),a|
    a << [k, frac * v + (a.empty? ? 0 : a.last.last)] }
end
cdf_first = cdf(arr.size, counts)
#=> [[1, 0.429], [4, 0.571], [2, 0.857], [3, 1.0]]
This means that there is a probability of 0.429 (rounded) of randomly drawing a 1, 0.571 of drawing a 1 or a 4, 0.857 of drawing a 1, 4 or 2 and 1.0 of drawing one of the four numbers. We can therefore randomly sample a number from arr by obtaining a (pseudo-) random number between zero and one (p = rand) and then determining the first element [n, q] of cdf_first for which p <= q:
def draw_random(cdf)
  p = rand
  cdf.find { |n,q| p <= q }.first
end
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 4
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 2
draw_random(cdf_first) #=> 3
In simulation models, incidentally, this is the standard way of generating pseudo-random variates from discrete probability distributions.
Before drawing the second random number of the pair we need to modify cdf_first to reflect the fact that the first number cannot be drawn again. Assuming there will be many pairs to generate randomly, it is most efficient to construct a hash cdf_second whose keys are the first values drawn randomly for the pair and whose values are the corresponding cdf's.
cdf_second = counts.keys.each_with_object({}) { |n, h|
  h[n] = cdf(arr.size - counts[n], counts.reject { |k,_| k==n }) }
#=> {1=>[[4, 0.25], [2, 0.75], [3, 1.0]],
# 4=>[[1, 0.5], [2, 0.833], [3, 1.0]],
# 2=>[[1, 0.6], [4, 0.8], [3, 1.0]],
# 3=>[[1, 0.5], [4, 0.667], [2, 1.0]]}
If, for example, a 2 is drawn for the first element of the pair, the probability is 0.6 of drawing a 1 for the second element, 0.8 of drawing a 1 or 4 and 1.0 of drawing a 1, 4, or 3.
We can then sample one pair as follows.
def sample_one_pair(cdf_first, cdf_second)
  first = draw_random(cdf_first)
  [first, draw_random(cdf_second[first])]
end
As before, to sample arr.size/2 pairs with replacement, we execute
Array.new(arr.size/2) { sample_one_pair(cdf_first, cdf_second) }
#=> [[2, 1], [3, 2], [1, 2]]
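For convenience, the pieces of Method 2 can be gathered into one runnable script. This is a consolidated sketch of the steps above rather than new behavior, with one assumption added: draw_random falls back to the last cdf entry in case floating-point rounding leaves the final cumulative value slightly below 1.0.

```ruby
arr = [1, 4, 1, 2, 3, 2, 1]
counts = arr.group_by(&:itself).transform_values(&:size)

# Cumulative distribution as [value, cumulative probability] pairs.
def cdf(sz, counts)
  frac = 1.0 / sz
  counts.each_with_object([]) do |(k, v), a|
    a << [k, frac * v + (a.empty? ? 0 : a.last.last)]
  end
end

# Inverse-transform draw: first entry whose cumulative probability
# reaches the random number (with the rounding fallback noted above).
def draw_random(cdf)
  p = rand
  (cdf.find { |_, q| p <= q } || cdf.last).first
end

cdf_first = cdf(arr.size, counts)

# One conditional cdf per possible first value, excluding that value.
cdf_second = counts.keys.each_with_object({}) do |n, h|
  h[n] = cdf(arr.size - counts[n], counts.reject { |k, _| k == n })
end

def sample_one_pair(cdf_first, cdf_second)
  first = draw_random(cdf_first)
  [first, draw_random(cdf_second[first])]
end

pairs = Array.new(arr.size / 2) { sample_one_pair(cdf_first, cdf_second) }
```

Since the second cdf never contains the first value, no pair can hold the same number twice, and no retry loop is required.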

With replacement, you may get results like:
unique_pairs([1, 1, 2, 2, 3, 4]) # => [[4, 1], [1, 2], [1, 3]]
Note that 1 gets chosen three times, even though it's only in the original array twice. This is because the 1 is "replaced" each time it's chosen. In other words, it's put back into the collection to potentially be chosen again.
Here's a version of Cary's excellent sample_one_pair solution without replacement:
def unique_pairs(arr)
  dup = arr.dup
  Array.new(dup.size / 2) do
    dup.shuffle!
    first = dup.pop
    second_index = dup.rindex { |e| e != first }
    raise StopIteration unless second_index
    second = dup.delete_at(second_index)
    [first, second]
  end
rescue StopIteration
  retry
end
unique_pairs([1, 1, 2, 2, 3, 4]) # => [[4, 3], [1, 2], [2, 1]]
This works by creating a copy of the original array and deleting elements out of it as they're chosen (so they can't be chosen again). The rescue/retry is in there in case it becomes impossible to produce the correct number of pairs. For example, if [1, 3] is chosen first, and [1, 4] is chosen second, it becomes impossible to make three unique pairs because [2, 2] is all that's left; the sample space is exhausted.
I expected this to be slower than Cary's solution (with replacement) but faster (on average) than the posted solutions (without replacement) that require looping and retrying. Welp, chalk up another point for "always benchmark!" I was wrong about most of my assumptions. Here are the results on my machine with an array of 16 numbers ([1, 1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7, 8, 9, 9, 10]):
cary_with_replacement
93.737k (± 2.9%) i/s - 470.690k in 5.025734s
mwp_without_replacement
187.739k (± 3.3%) i/s - 943.415k in 5.030774s
mudasobwa_without_replacement
129.490k (± 9.4%) i/s - 653.150k in 5.096761s
EDIT: I've updated the above solution to address Stefan's numerous concerns. In hindsight, the errors are obvious and embarrassing! On the plus side, the revised solution is now faster than mudasobwa's solution, and I've confirmed that the two solutions have the same biases.
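One caveat with an unconditional retry is that it never terminates when no valid pairing exists at all (for example [1, 1, 1, 2]). The sketch below is a hypothetical variant, not the answer's code: unique_pairs_bounded and max_attempts are names invented here, it shuffles once per attempt instead of before every draw, and its sampling biases have not been compared with the versions benchmarked above.

```ruby
# Hypothetical bounded-retry variant: returns nil instead of looping forever.
def unique_pairs_bounded(arr, max_attempts: 100)
  max_attempts.times do
    pool = arr.shuffle
    pairs = []
    while pool.size >= 2
      first = pool.pop
      # Index of some remaining element that differs from the first pick.
      idx = pool.rindex { |e| e != first }
      break unless idx # dead end: only copies of `first` are left

      pairs << [first, pool.delete_at(idx)]
    end
    return pairs if pairs.size == arr.size / 2
  end
  nil # no valid pairing found within max_attempts
end

unique_pairs_bounded([1, 1, 1, 2]) # => nil
```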

You can check whether there are any matches and shuffle again:
a = [1,1,2,2,3,4]
# first time shuffle
sliced = a.shuffle.each_slice(2).to_a
# checking if there are matches and shuffle if there are
while sliced.combination(2).any? { |a, b| a.sort == b.sort } do
  sliced = a.shuffle.each_slice(2).to_a
end
It is unlikely, but be aware of the possibility of an infinite loop.

Determining if a collection has more than one max value

Right now I'm doing this, and it works:
groups = [[1, 1, 1], [2, 2]]
groups.select { |g| g.size == groups.max.size }.size
# => 1 # a clear majority
groups = [[1, 1], [2, 2]]
groups.select { |g| g.size == groups.max.size }.size
# => 2 # needs to be passed to another filter
but I have a suspicion there's a cleaner way.

You can use this snippet:
groups.group_by(&:size)[groups.max.size].size
Let me quickly explain what this does. I apologise in advance for the bad wording as "group" is a rather overloaded term here...
What it does, is first to group the arrays by size. This returns a hash:
groups = [[1, 1, 1], [2, 2]]
grouped = groups.group_by(&:size)
# => {3=>[[1, 1, 1]], 2=>[[2, 2]]}
Then, you take the array of group arrays containing exactly as many elements as the largest group
largest_list = grouped[groups.max.size]
# => [[2, 2]]
Now, you can simply get the size of this array to get the number of groups which have this length:
largest_list.size
# => 1
Your approach is rather slow because it recalculates groups.max.size for every element inside the select block.
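One thing to watch for: groups.max compares the inner arrays element by element (via Array#<=>), not by length, so for [[1, 1, 1], [2, 2]] it returns [2, 2] and groups.max.size is 2 rather than 3. The counts in the examples happen to come out as expected anyway. A minimal sketch that compares sizes directly is shown below; largest_group_count is a hypothetical helper name.

```ruby
# Hypothetical helper: how many groups share the largest size.
def largest_group_count(groups)
  sizes = groups.map(&:size)
  # Count how often the maximum size occurs among all group sizes.
  sizes.count(sizes.max)
end

largest_group_count([[1, 1, 1], [2, 2]])            # => 1
largest_group_count([[1, 1], [2, 2]])               # => 2
largest_group_count([[2, 2], [1, 1, 1], [3, 3, 3]]) # => 2
```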
