Prevent identical pairs when shuffling and slicing Ruby array

I'd like to prevent producing pairs with the same items when producing a random set of pairs in a Ruby array.
For example:
[1,1,2,2,3,4].shuffle.each_slice(2).to_a
might produce:
[[1, 1], [3, 4], [2, 2]]
I'd like to be able to ensure that it produces a result such as:
[[4, 1], [1, 2], [3, 2]]
Thanks in advance for the help!

arr = [1,1,2,2,3,4]
loop do
sliced = arr.shuffle.each_slice(2).to_a
break sliced if sliced.none? { |a| a.reduce(:==) }
end

Here are three ways to produce the desired result (not including the approach of sampling repeatedly until a valid sample is found). The following array will be used for illustration.
arr = [1,4,1,2,3,2,1]
Use Array#combination and Array#sample
If pairs sampled were permitted to have the same number twice, the sample space would be
arr.combination(2).to_a
#=> [[1, 4], [1, 1], [1, 2], [1, 3], [1, 2], [1, 1], [4, 1], [4, 2],
# [4, 3], [4, 2], [4, 1], [1, 2], [1, 3], [1, 2], [1, 1], [2, 3],
# [2, 2], [2, 1], [3, 2], [3, 1], [2, 1]]
The pairs containing the same value twice--here [1, 1] and [2, 2]--are not wanted, so they are simply removed from the above array.
sample_space = arr.combination(2).reject { |x,y| x==y }
#=> [[1, 4], [1, 2], [1, 3], [1, 2], [4, 1], [4, 2], [4, 3],
# [4, 2], [4, 1], [1, 2], [1, 3], [1, 2], [2, 3], [2, 1],
# [3, 2], [3, 1], [2, 1]]
We evidently are to sample arr.size/2 elements from sample_space. Depending on whether this is to be done with or without replacement we would write
sample_space.sample(arr.size/2)
#=> [[4, 3], [1, 2], [1, 3]]
for sampling without replacement and
Array.new(arr.size/2) { sample_space.sample }
#=> [[1, 3], [4, 1], [2, 1]]
for sampling with replacement.
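As a rough sketch, the two variants above could be wrapped in one helper. The method name sample_pairs and the keyword argument replace are made up for illustration and are not part of the original answer.

```ruby
# Hypothetical helper combining the two sampling variants shown above.
def sample_pairs(arr, replace: false)
  # All pairs of positions in arr whose values differ.
  sample_space = arr.combination(2).reject { |x, y| x == y }
  n = arr.size / 2
  if replace
    Array.new(n) { sample_space.sample } # the same pair may be drawn twice
  else
    sample_space.sample(n)               # each entry of sample_space drawn at most once
  end
end

arr = [1, 4, 1, 2, 3, 2, 1]
without = sample_pairs(arr)
with    = sample_pairs(arr, replace: true)
```

Note that "without replacement" here refers to entries of sample_space, not to elements of arr: a value can still appear in more pairs than it has occurrences in arr, as the 3 does in the sample output above.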
Sample elements of each pair sequentially, Method 1
This method, like the next, can only be used to sample with replacement.
Let's first consider sampling a single pair. We could do that by selecting the first element of the pair randomly from arr, removing all instances of that element from arr, and then sampling the second element from what's left of arr.
def sample_one_pair(arr)
first = arr.sample
[first, second = (arr-[first]).sample]
end
To draw a sample of arr.size/2 pairs we then execute the following.
Array.new(arr.size/2) { sample_one_pair(arr) }
#=> [[1, 2], [4, 3], [1, 2]]
Sample elements of each pair sequentially, Method 2
This method is a very fast way of sampling large numbers of pairs with replacement. Like the previous method, it cannot be used to sample without replacement.
First, compute the cdf (cumulative distribution function) for drawing an element of arr at random.
counts = arr.group_by(&:itself).transform_values { |v| v.size }
#=> {1=>3, 4=>1, 2=>2, 3=>1}
def cdf(sz, counts)
frac = 1.0/sz
counts.each_with_object([]) { |(k,v),a|
a << [k, frac * v + (a.empty? ? 0 : a.last.last)] }
end
cdf_first = cdf(arr.size, counts)
#=> [[1, 0.429], [4, 0.571], [2, 0.857], [3, 1.0]]
This means that there is a probability of 0.429 (rounded) of randomly drawing a 1, 0.571 of drawing a 1 or a 4, 0.857 of drawing a 1, 4 or 2 and 1.0 of drawing one of the four numbers. We can therefore randomly sample a number from arr by obtaining a (pseudo-) random number between zero and one (p = rand) and then finding the first element [n, q] of cdf_first for which p <= q:
def draw_random(cdf)
p = rand
cdf.find { |n,q| p <= q }.first
end
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 4
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 2
draw_random(cdf_first) #=> 3
In simulation models, incidentally, this is the standard way of generating pseudo-random variates from discrete probability distributions.
Before drawing the second random number of the pair we need to modify cdf_first to reflect the fact that the first number cannot be drawn again. Assuming there will be many pairs to generate randomly, it is most efficient to construct a hash cdf_second whose keys are the first values drawn randomly for the pair and whose values are the corresponding cdf's.
cdf_second = counts.keys.each_with_object({}) { |n, h|
h[n] = cdf(arr.size - counts[n], counts.reject { |k,_| k==n }) }
#=> {1=>[[4, 0.25], [2, 0.75], [3, 1.0]],
# 4=>[[1, 0.5], [2, 0.833], [3, 1.0]],
# 2=>[[1, 0.6], [4, 0.8], [3, 1.0]],
# 3=>[[1, 0.5], [4, 0.667], [2, 1.0]]}
If, for example, a 2 is drawn for the first element of the pair, the probability is 0.6 of drawing a 1 for the second element, 0.8 of drawing a 1 or 4 and 1.0 of drawing a 1, 4, or 3.
We can then sample one pair as follows.
def sample_one_pair(cdf_first, cdf_second)
first = draw_random(cdf_first)
[first, draw_random(cdf_second[first])]
end
As before, to sample arr.size/2 pairs with replacement, we execute
Array.new(arr.size/2) { sample_one_pair(cdf_first, cdf_second) }
#=> [[2, 1], [3, 2], [1, 2]]
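For reference, the pieces of Method 2 can be assembled into one script, roughly as follows. This is a sketch rather than the answer's exact code: the `|| cdf.last` fallback in draw_random is an addition, included on the assumption that floating-point rounding could leave the final cumulative value slightly below 1.0.

```ruby
# Build [value, cumulative probability] pairs from a hash of value => count.
def cdf(sz, counts)
  frac = 1.0 / sz
  counts.each_with_object([]) do |(k, v), a|
    a << [k, frac * v + (a.empty? ? 0 : a.last.last)]
  end
end

# Return the first value whose cumulative probability reaches a random draw.
def draw_random(cdf)
  p = rand
  # Added safeguard (not in the original): fall back to the last entry
  # if rounding left the final cumulative value just under p.
  (cdf.find { |_, q| p <= q } || cdf.last).first
end

# Draw the first value, then the second from the cdf that excludes it.
def sample_one_pair(cdf_first, cdf_second)
  first = draw_random(cdf_first)
  [first, draw_random(cdf_second[first])]
end

arr = [1, 4, 1, 2, 3, 2, 1]
counts = arr.group_by(&:itself).transform_values(&:size)
cdf_first = cdf(arr.size, counts)
cdf_second = counts.keys.each_with_object({}) do |n, h|
  h[n] = cdf(arr.size - counts[n], counts.reject { |k, _| k == n })
end

pairs = Array.new(arr.size / 2) { sample_one_pair(cdf_first, cdf_second) }
```

Because cdf_first and cdf_second are computed once, each additional pair costs only two calls to rand and two short linear scans.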

With replacement, you may get results like:
unique_pairs([1, 1, 2, 2, 3, 4]) # => [[4, 1], [1, 2], [1, 3]]
Note that 1 gets chosen three times, even though it's only in the original array twice. This is because the 1 is "replaced" each time it's chosen. In other words, it's put back into the collection to potentially be chosen again.
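One way to tell the two cases apart is to compare how often each value is used with how often it occurs in the original array. The helper names below (occurrences, drawn_without_replacement?) are invented for this sketch.

```ruby
# Count how often each value occurs in an array.
def occurrences(values)
  values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
end

# True if no value is used in the pairs more often than it occurs in arr.
def drawn_without_replacement?(arr, pairs)
  available = occurrences(arr)
  occurrences(pairs.flatten).all? { |value, used| used <= available[value] }
end

arr = [1, 1, 2, 2, 3, 4]
with_replacement    = drawn_without_replacement?(arr, [[4, 1], [1, 2], [1, 3]]) #=> false
without_replacement = drawn_without_replacement?(arr, [[4, 3], [1, 2], [2, 1]]) #=> true
```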
Here's a version of Cary's excellent sample_one_pair solution without replacement:
def unique_pairs(arr)
dup = arr.dup
Array.new(dup.size / 2) do
dup.shuffle!
first = dup.pop
second_index = dup.rindex { |e| e != first }
raise StopIteration unless second_index
second = dup.delete_at(second_index)
[first, second]
end
rescue StopIteration
retry
end
unique_pairs([1, 1, 2, 2, 3, 4]) # => [[4, 3], [1, 2], [2, 1]]
This works by creating a copy of the original array and deleting elements out of it as they're chosen (so they can't be chosen again). The rescue/retry is in there in case it becomes impossible to produce the correct number of pairs. For example, if [1, 3] is chosen first, and [1, 4] is chosen second, it becomes impossible to make three unique pairs because [2, 2] is all that's left; the sample space is exhausted.
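The retry only terminates if at least one valid pairing exists. A guard along these lines could be run first; it is a sketch that relies on the observation that arr.size / 2 pairs of differing values exist exactly when the most frequent value fills at most half of the array (rounded up). The name pairable? is hypothetical.

```ruby
# True if arr can be split into arr.size / 2 pairs of differing values.
# With an odd number of elements one element is left over, hence the
# rounding up in (arr.size + 1) / 2.
def pairable?(arr)
  return true if arr.size < 2 # zero pairs are requested, nothing can fail

  max_count = arr.group_by(&:itself).values.map(&:size).max
  max_count <= (arr.size + 1) / 2
end

ok  = pairable?([1, 1, 2, 2, 3, 4]) #=> true
bad = pairable?([1, 1, 1, 2])       #=> false, some pair must be [1, 1]
```

Calling unique_pairs on an input for which pairable? returns false would loop forever, so raising an ArgumentError up front is one option.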
This should be slower than Cary's solution (with replacement) but faster (on average) than the posted solutions (without replacement) that require looping and retrying. Welp, chalk up another point for "always benchmark!" I was wrong about almost all of my assumptions. Here are the results on my machine with an array of 16 numbers ([1, 1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7, 8, 9, 9, 10]):
cary_with_replacement
93.737k (± 2.9%) i/s - 470.690k in 5.025734s
mwp_without_replacement
187.739k (± 3.3%) i/s - 943.415k in 5.030774s
mudasobwa_without_replacement
129.490k (± 9.4%) i/s - 653.150k in 5.096761s
EDIT: I've updated the above solution to address Stefan's numerous concerns. In hindsight, the errors are obvious and embarrassing! On the plus side, the revised solution is now faster than mudasobwa's solution, and I've confirmed that the two solutions have the same biases.

You can check whether there are any matches and shuffle again:
a = [1,1,2,2,3,4]
# first time shuffle
sliced = a.shuffle.each_slice(2).to_a
# checking if there are matches and shuffle if there are
while sliced.combination(2).any? { |a, b| a.sort == b.sort } do
sliced = a.shuffle.each_slice(2).to_a
end
It is unlikely, but be aware of the possibility of an infinite loop.

Related

Accessing values in nested array

I'm learning Rust and I wanted to write a program that would take a (randomly generated) bowling sheet and generate the score. I now know that I will have to use Rust's Vecs instead of arrays, but I got stuck at accessing the values from the nested arrays, so I would like to find the solution before I rewrite it.
What I wanted to do is access the individual values and run some logic on them, but I got stuck at the "accessing values" part, this is what I came up with (but it doesn't work):
let sheet: [[u32; 2]; 10] = [[1, 3], [0, 6], [9, 0], [0, 5], [5, 3], [4, 2], [1, 4], [2, 3], [3, 0], [4, 4]];
for frame in 0..sheet.len() {
for score in 0..sheet[frame].len() {
println!("{}", sheet[frame[score]]);
}
}
You should clarify what exactly you mean by 'accessing individual values', but from your code I'd assume that you just want to iterate over every score. Here's how you do it with for loops:
let sheet = [[1, 3], [0, 6], [9, 0], [0, 5], [5, 3], [4, 2], [1, 4], [2, 3], [3, 0], [4, 4]];
for frame in sheet {
// On the first iteration frame will be == [1, 3], then [0, 6], etc
for score in frame {
// on first iteration score will be == 1, then 3, then 0, etc
println!("{}", score);
}
}

get array index from sort in Ruby

I have an array
array_a1 = [9,43,3,6,7,0]
which I'm trying to get the sort indices out of, i.e. the answer should be
array_ordered = [6, 3, 4, 5, 1, 2]
I want to do this as a function, so that
def order (array)
will return array_ordered
I have tried implementing advice from Find the index by current sort order of an array in ruby but I don't see how I can do what they did for an array :(
if there are identical values in the array, e.g.
array_a1 = [9,43,3,6,7,7]
then the result should look like:
array_ordered = [3, 4, 5, 6, 1, 2]
(all indices should be 0-based, but these are 1-based)
You can do it this way:
[9,43,3,6,7,0].
each_with_index.to_a. # [[9, 0], [43, 1], [3, 2], [6, 3], [7, 4], [0, 5]]
sort_by(&:first). # [[0, 5], [3, 2], [6, 3], [7, 4], [9, 0], [43, 1]]
map(&:last)
#=> [5, 2, 3, 4, 0, 1]
First you add index to each element, then you sort by the element and finally you pick just indices.
Note that arrays are zero-indexed in Ruby, so each result is one less than in your spec.
You should be able to just map over the sorted array and lookup the index of that number in the original array.
arr = [9,43,3,6,7,0]
arr.sort.map { |n| arr.index(n) } #=> [5, 2, 3, 4, 0, 1]
Or if you really want it 1 indexed, instead of zero indexed, for some reason:
arr.sort.map { |n| arr.index(n) + 1 } #=> [6, 3, 4, 5, 1, 2]
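One caveat with the lookup approach: Array#index returns the first matching position, so duplicate values all map to the same index. The comparison below is a small sketch using the question's second array; the variable names are illustrative.

```ruby
arr = [9, 43, 3, 6, 7, 7]

# Looking up each sorted value finds the first 7 (index 4) both times.
by_lookup = arr.sort.map { |n| arr.index(n) }
#=> [2, 3, 4, 4, 0, 1]

# Sorting the indices themselves keeps every position distinct.
by_index = arr.each_index.sort_by { |i| [arr[i], i] }
#=> [2, 3, 4, 5, 0, 1]
```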
array_a1 = [9,43,3,6,7,0]
array_a1.each_index.sort_by { |i| array_a1[i] }
#=> [5, 2, 3, 4, 0, 1]
If array_a1 may contain duplicates and ties are to be broken by the indices of the elements (the element with the smaller index first), you may modify the calculation as follows.
array_a1 = [9,43,3,6,7,7]
array_a1.each_index.sort_by { |i| [array_a1[i], i] }
#=> [2, 3, 4, 5, 0, 1]
Enumerable#sort_by compares two elements with the spaceship operator, <=>. Here, as pairs of arrays are being compared, it is the method Array#<=> that is used. See especially the third paragraph of that doc.
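To illustrate how Array#<=> orders the [value, index] pairs used as sort keys above, here are a few comparisons (the result names are only for this sketch):

```ruby
# Elements are compared left to right; the first difference decides.
tie_on_value = [7, 4] <=> [7, 5] #=> -1, values tie so the smaller index sorts first
value_decides = [6, 9] <=> [7, 0] #=> -1, values differ so the indices are never consulted
identical     = [7, 5] <=> [7, 5] #=> 0
```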

How to find indices of max n elements in array in stable order

I have a number and an array:
n = 4
a = [0, 1, 2, 3, 3, 4]
I want to find the indices corresponding to the maximal n elements of a in the reverse order of the element size, and in stable order when the element sizes are equal. The expected output is:
[5, 3, 4, 2]
This code:
a.each_with_index.max(n).map(&:last)
# => [5, 4, 3, 2]
gives the right indices, but changes the order.
Code
def max_with_order(arr, n)
arr.each_with_index.max_by(n) { |x,i| [x,-i] }.map(&:last)
end
Examples
a = [0,1,2,3,3,4]
max_with_order(a, 1) #=> [5]
max_with_order(a, 2) #=> [5, 3]
max_with_order(a, 3) #=> [5, 3, 4]
max_with_order(a, 4) #=> [5, 3, 4, 2]
max_with_order(a, 5) #=> [5, 3, 4, 2, 1]
max_with_order(a, 6) #=> [5, 3, 4, 2, 1, 0]
Explanation
For n = 3 the steps are as follows.
b = a.each_with_index
#=> #<Enumerator: [0, 1, 2, 3, 3, 4]:each_with_index>
We can convert b to an array to see the (six) values it will generate and pass to the block.
b.to_a
#=> [[0, 0], [1, 1], [2, 2], [3, 3], [3, 4], [4, 5]]
Continuing,
c = b.max_by(n) { |x,i| [x,-i] }
#=> [[4, 5], [3, 3], [3, 4]]
c.map(&:last)
#=> [5, 3, 4]
Note that the elements of arr need not be numeric, merely comparable.
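As a quick illustration with non-numeric elements, the same method works on strings. The words array is made up for this example.

```ruby
def max_with_order(arr, n)
  arr.each_with_index.max_by(n) { |x, i| [x, -i] }.map(&:last)
end

words = %w[pear apple plum plum]
top_three = max_with_order(words, 3)
#=> [2, 3, 0]   ("plum" at 2, "plum" at 3, then "pear" at 0)
```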
You can supply a block to max to make the determination more specific like so
a.each_with_index.max(n) do |a,b|
if a[0] == b[0] # the numbers are the same
b[1] <=> a[1] # compare the indexes in reverse
else
a[0] <=> b[0] # compare the numbers themselves
end
end.map(&:last)
#=> [5,3,4,2]
The block passed to max is expected to return a comparison result (-1, 0 or 1). Here, if the numbers are the same, the indexes are compared in reverse order, e.g. when a is [3, 4] and b is [3, 3], 3 <=> 4 #=> -1. The -1 indicates that [3, 4] is the lesser value, so index 4 is placed after index 3.
Also, to expand on @CarySwoveland's answer (which I am a bit jealous I did not think of): since you only care about returning the indices, we could implement it as follows without a secondary map
a.each_index.max_by(n) { |x| [a[x],-x] }
#=> [5,3,4,2]
@compsy you wrote without changing order, so it would be:
a = [0,1,2,3,3,4]
n = a.max
i = 0
a.each do |x|
break if x == n
i += 1
end
I use the variable i as the index. When x (the value being analyzed) equals n, we use break to stop the each loop, keeping the last value of i, which corresponds to the position of the max value in the array. Be aware that the value of i is one less than the natural position in the array, because the first element of an array has index 0, not 1.
I break the each because there is no need to keep checking all the other values of the array after we found the position of the value.
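For comparison, the same position (the first occurrence of the largest value) can be obtained with built-in methods. This is only a compact sketch of the loop above, not a solution for the top n indices.

```ruby
a = [0, 1, 2, 3, 3, 4]

# Array#index returns the position of the first element equal to its argument.
first_max_index = a.index(a.max) #=> 5

# With ties, the earliest position wins, just as the break in the loop ensures.
tied = [4, 1, 4]
tied_index = tied.index(tied.max) #=> 0
```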

How to group two value arrays to n value arrays in the below example?

I already have many two value arrays for example in the below
ary = [[1, 2], [2, 3], [1, 3], [4, 5], [5, 6], [4, 7], [7, 8], [4, 8]]
I want to group them into
[1, 2, 3], [4, 5], [5, 6], [4, 7, 8]
The meaning is that 1 and 2 are related, 2 and 3 are related, and 1 and 3 are related, so 1, 2 and 3 are all related to each other.
How can I do this with a Ruby library or an algorithm?
Here is a Ruby implementation of the basic Bron–Kerbosch algorithm:
class Graph
def initialize(edges)
@edges = edges
end
def find_maximum_cliques
@cliques ||= []
bron_kerbosch([], nodes, []) if @cliques.empty?
@cliques
end
private
def nodes
@nodes ||= @edges.flatten.uniq
end
def neighbours
@neighbours ||= nodes.map do |node|
node_neighbours =
@edges.select { |edge| edge.include? node }.flatten - [node]
[node, node_neighbours]
end.to_h
end
def bron_kerbosch(re, pe, xe)
@cliques << re if pe.empty? && xe.empty?
pe.each do |ve|
bron_kerbosch(re | [ve], pe & neighbours[ve], xe & neighbours[ve])
pe -= [ve]
xe |= [ve]
end
end
end
edges = [[1, 2], [2, 3], [1, 3], [4, 5], [5, 6], [4, 7], [7, 8], [4, 8]]
Graph.new(edges).find_maximum_cliques # => [[1, 2, 3], [4, 5], [4, 7, 8], [5, 6]]
There is an optimization that can get it to O(3^(n/3)).
Your array can be seen as a Graph (e.g. Node 1 and Node 2 are connected by an Edge, as well as Node 2 and 3, ...)
You're looking for an Array of all the maximum cliques. A clique cover is an array of maximum cliques which contains every Node exactly once. This problem is hard.
A clique is a subset of your graph in which every node is connected to each other. If a node isn't connected to any other, it forms a clique with just one node, itself. A maximum clique is a clique which cannot be enlarged by adding another node.
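The two definitions above can be checked directly on the question's edge list. The following is a minimal sketch; the method names clique? and cannot_be_enlarged? are chosen for illustration and do not come from any gem.

```ruby
require 'set'

# True if every pair of distinct nodes in subset is joined by an edge.
# A single node trivially passes, matching the definition above.
def clique?(subset, edges)
  edge_set = edges.map(&:sort).to_set
  subset.combination(2).all? { |pair| edge_set.include?(pair.sort) }
end

# True if subset is a clique and no other node can be added to it.
def cannot_be_enlarged?(subset, edges)
  return false unless clique?(subset, edges)

  others = edges.flatten.uniq - subset
  others.none? { |node| clique?(subset + [node], edges) }
end

edges = [[1, 2], [2, 3], [1, 3], [4, 5], [5, 6], [4, 7], [7, 8], [4, 8]]
triangle    = clique?([1, 2, 3], edges)          #=> true
not_clique  = clique?([4, 5, 6], edges)          #=> false, 4 and 6 are not connected
extendable  = cannot_be_enlarged?([1, 2], edges) #=> false, 3 can still be added
full_clique = cannot_be_enlarged?([4, 5], edges) #=> true
```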
This gem might be able to help you, with the all_max_cliques method. Here's a script you could write at the root of the Clique project :
require_relative 'src/graph.rb'
require_relative 'src/bron.rb'
require_relative 'src/max_clique.rb'
require_relative 'src/util.rb'
require 'set'
ary = [[1,2],[2,3],[1,3],[4,5],[5,6],[4,7],[7,8],[4,8]]
graph = Graph.new
graph = ary.each_with_object(Graph.new){|(n1,n2),graph| graph.insert(n1, [n2])}
all_max_cliques(graph.graph)
It outputs :
intersections after sort!: {3=>[2], 2=>[3]}
cliqueeee[3, 2, 1]
intersections after sort!: {3=>[1], 1=>[3]}
...
...
cliqueeee[6, 5]
intersections after sort!: {5=>[]}
cliqueeee[5, 6]
largest clique!: [6, 5]
[[3, 2, 1], [8, 7, 4], [6, 5]]
Note that if you want a clique cover, (i.e. a partition), every node should appear exactly once. 5 and 6 form a maximum clique, and 4 is already inside [4,7,8], so there's no need for [4,5].
Here's a very basic bruteforce solution. Don't use it for anything big!
require 'set'
class Graph
attr_reader :nodes, :edges
def initialize(edges)
@nodes = edges.flatten.sort.uniq
@edges = edges.map(&:sort).to_set
end
def minimum_clique_cover
partitions.select{ |p| all_cliques?(p) }.min_by(&:size)
end
private
def partitions(array = nodes)
if array.length == 1
[[array]]
else
*head, tail = array
partitions(head).inject([]) do |result, partition|
result + (0..partition.length).collect do |index_to_add_at|
new_partition = partition.dup
new_partition[index_to_add_at] = (new_partition[index_to_add_at] || []) + [tail]
new_partition
end
end
end
end
def all_cliques?(partition)
partition.all?{ |subset| clique?(subset) }
end
def clique?(subset)
subset.permutation(2).select{ |n1, n2| n2 > n1 }.all?{ |edge| edges.include?(edge) }
end
end
p Graph.new([[1, 2], [2, 3], [1, 3], [4, 5], [5, 6], [4, 7], [7, 8], [4, 8]]).minimum_clique_cover
# => [[4, 7, 8], [5, 6], [1, 2, 3]]
It returns a minimum clique cover, which is harder than just an Array of maximum cliques. Don't ask me about the complexity of this script, and I won't have to lie.

How to transpose an array in Python 3?

I've been scanning the forums and haven't found an answer yet that I can apply to my situation. I need to be able to take an n by n array and transpose it in Python-3. The example given is that I have this list input into the function:
[[4, 2, 1], ["a", "a", "a"], [-1, -2, -3]] and it needs to be transposed to read:
[[4, 'a', -1], [2, 'a', -2], [1, 'a', -3]] So basically reading vertically instead of horizontally.
I CANNOT use things like zip or numpy, I have to make my own function.
Been rattling my brain at this for two nights and it's a huge headache. If anyone could help and then provide an explanation so I can learn it, I'd be grateful.
Edit:
I should add for reference sake that the argument variable is M. The function we're supposed to write is trans(M):
A one-liner:
def trans(M):
return [[M[j][i] for j in range(len(M))] for i in range(len(M[0]))]
result:
>>> M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> trans(M)
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
# or for a non-square matrix:
>>> N = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
>>> trans(N)
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
Additional Note: If you look up the tutorial on list comprehension, one of the examples is in fact transposition of a matrix array.
A variant that should work for matrices with irregular row lengths:
m=[[3, 2, 1],
[0, 1],
[2, 1, 0]]
m_T = [ [row[c] for row in m if c < len(row)] for c in range(0, max([len(row) for row in m])) ]
Here is an in place solution that works for square matrices:
def trans(M):
n = len(M)
for i in range(n - 1):
for j in range(i + 1, n):
M[i][j], M[j][i] = M[j][i], M[i][j]
Example Usage:
def print_matrix(M):
for row in M:
for ele in row:
print(ele, end='\t')
print()
M = [[4, 2, 1], ["a", "a", "a"], [-1, -2, -3]]
print('Original Matrix:')
print_matrix(M)
trans(M)
print('Transposed Matrix:')
print_matrix(M)
Output:
Original Matrix:
4 2 1
a a a
-1 -2 -3
Transposed Matrix:
4 a -1
2 a -2
1 a -3
y=([1,2], [3,4], [5,6])
transpose=[[row[i] for row in y] for i in range(len(y[0]))]
the output is
[[1, 3, 5], [2, 4, 6]]
You can also use the function in numpy to transpose - if you need the answer as a list it is straightforward to convert back using tolist:
from numpy import transpose
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transpose(M).tolist()
the output is
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
Haven't timed it (no time!) but I strongly suspect this will be a lot faster than iterators for large arrays, especially if you don't need to convert back to a list.

Resources