Consider the following code fragment:
import numpy as np
mask = np.array([True, True, False, True, True, False])
val = np.array([9, 3])
arr = np.random.randint(1, 10, size=(5, len(mask)))  # upper bound is exclusive, so 10 yields 1 to 9
As expected, we get an array of random integers, 1 to 9, with 5 rows and 6 columns as below. The val array has not been used yet.
[[2, 7, 6, 9, 7, 5],
[7, 2, 9, 7, 8, 3],
[9, 1, 3, 5, 7, 3],
[5, 7, 4, 4, 5, 2],
[7, 7, 9, 6, 9, 8]]
Now I'll introduce val = [9, 3].
Where mask = True, I want the row element to be taken randomly from 1 to 9.
Where mask = False, I want the row element to be taken randomly from 1 to 3.
How can this be done efficiently? A sample output is shown below.
[[2, 7, 2, 9, 7, 1],
[7, 2, 1, 7, 8, 3],
[9, 1, 3, 5, 7, 3],
[5, 7, 1, 4, 5, 2],
[7, 7, 2, 6, 9, 1]]
One idea is to sample uniformly from [0, 1), multiply by 9 or 3 depending on mask, truncate to an integer, and finally add 1 to shift the range.
rand = np.random.rand(5,len(mask))
is3 = (1-mask).astype(int)
# out is random from 0-8 or 0-2 depending on `is3`
out = (rand*val[is3]).astype(int)
# move out by `1`:
out = (out + 1)
Output:
array([[4, 9, 3, 6, 2, 1],
[1, 8, 2, 7, 1, 3],
[8, 2, 1, 2, 3, 2],
[4, 3, 2, 2, 3, 2],
[5, 8, 1, 5, 6, 1]])
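As an alternative to scaling uniform samples, NumPy's Generator API accepts an array of upper bounds directly. The following is a minimal sketch, assuming a NumPy version that provides np.random.default_rng; the names high and rng and the seed are illustrative choices, not part of the question.

```python
import numpy as np

mask = np.array([True, True, False, True, True, False])
val = np.array([9, 3])

# Per-column inclusive upper bound: 9 where mask is True, 3 where it is False.
high = np.where(mask, val[0], val[1])

# The seed is arbitrary; it only makes this sketch repeatable.
rng = np.random.default_rng(0)

# `high` has shape (6,) and broadcasts against the requested (5, 6) output.
# endpoint=True makes the upper bound inclusive.
out = rng.integers(1, high, size=(5, len(mask)), endpoint=True)
print(out)
```

Every column j of out then holds integers between 1 and high[j], which avoids the float-to-int conversion step.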
Consider two given arrays (in this sample, both are based on n = 5).
Given: array m has shape (n, 2n). When n = 5, each row of m holds a random arrangement of integers 0,0,1,1,2,2,3,3,4,4.
import numpy as np
m= np.array([[4, 2, 2, 3, 0, 1, 3, 1, 0, 4],
[2, 4, 0, 4, 3, 2, 0, 1, 1, 3],
[0, 2, 3, 1, 3, 4, 2, 1, 4, 0],
[2, 1, 2, 4, 3, 0, 0, 4, 3, 1],
[2, 0, 1, 0, 3, 4, 4, 3, 2, 1]])
Given: array t has shape (n^2, 4). When n = 5, the first two columns (m_row, val) hold all 25 ordered pairs of the integers 0 to 4.
The 1st column refers to rows of array m. The 2nd column refers to values in array m.
For now, the last two columns hold dummy value 99 that will be replaced.
t = np.array([[0, 0, 99, 99],
[0, 1, 99, 99],
[0, 2, 99, 99],
[0, 3, 99, 99],
[0, 4, 99, 99],
[1, 0, 99, 99],
[1, 1, 99, 99],
[1, 2, 99, 99],
[1, 3, 99, 99],
[1, 4, 99, 99],
[2, 0, 99, 99],
[2, 1, 99, 99],
[2, 2, 99, 99],
[2, 3, 99, 99],
[2, 4, 99, 99],
[3, 0, 99, 99],
[3, 1, 99, 99],
[3, 2, 99, 99],
[3, 3, 99, 99],
[3, 4, 99, 99],
[4, 0, 99, 99],
[4, 1, 99, 99],
[4, 2, 99, 99],
[4, 3, 99, 99],
[4, 4, 99, 99]])
PROBLEM: I want to replace the dummy values in the last two columns of t, as follows:
Consider t row [1, 3, 99, 99]. In row 1 of m, I find the indices of the two columns that hold the value 3. These are columns (4, 9), so the t row is updated to [1, 3, 4, 9].
In the same way, t row [4, 2, 99, 99] becomes [4, 2, 0, 8].
I currently do this by looping through each column i of array m, looking for the two instances where m[m_row, i] = val, then updating array t. (slow!)
Is there a way to speed up this process, perhaps using vectorization or broadcasting?
Use the following code:
import itertools
# First 2 columns
t = np.array(list(itertools.product(range(m.shape[0]), repeat=2)))
# Add columns - indices of "wanted" elements
t = np.hstack((t, np.apply_along_axis(
    lambda row, arr: np.nonzero(arr[row[0]] == row[1])[0], 1, t, m)))
The result, for your data sample (m array), is:
array([[0, 0, 4, 8],
[0, 1, 5, 7],
[0, 2, 1, 2],
[0, 3, 3, 6],
[0, 4, 0, 9],
[1, 0, 2, 6],
[1, 1, 7, 8],
[1, 2, 0, 5],
[1, 3, 4, 9],
[1, 4, 1, 3],
[2, 0, 0, 9],
[2, 1, 3, 7],
[2, 2, 1, 6],
[2, 3, 2, 4],
[2, 4, 5, 8],
[3, 0, 5, 6],
[3, 1, 1, 9],
[3, 2, 0, 2],
[3, 3, 4, 8],
[3, 4, 3, 7],
[4, 0, 1, 3],
[4, 1, 2, 9],
[4, 2, 0, 8],
[4, 3, 4, 7],
[4, 4, 5, 6]], dtype=int64)
Edit
The above code relies on the fact that each row in m contains
exactly 2 "wanted" values.
To make the code robust when some row contains too many or too few
"wanted" values (or none at all):
Define a function returning the indices of the "wanted" elements as:
def inds(row, arr):
    ind = np.nonzero(arr[row[0]] == row[1])[0]
    # Pad with 99 so that there are always at least 2 entries, then keep the first 2
    return np.pad(ind, (0, 2), constant_values=99)[0:2]
Change the second instruction to:
t = np.hstack((t, np.apply_along_axis(inds, 1, t, m)))
To test this variant, change the first row of m to:
[4, 2, 2, 3, 5, 5, 3, 1, 5, 4]
i.e. a row that contains no 0 and only a single 1.
Then the initial part of the result is:
array([[ 0, 0, 99, 99],
[ 0, 1, 7, 99],
so that the missing indices in the result are filled with 99.
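If every row of m really does hold each value exactly twice (the situation in the question, and an assumption of this sketch), the per-row function call can be avoided altogether with a stable argsort. The names cols, rows and vals are illustrative.

```python
import numpy as np

m = np.array([[4, 2, 2, 3, 0, 1, 3, 1, 0, 4],
              [2, 4, 0, 4, 3, 2, 0, 1, 1, 3],
              [0, 2, 3, 1, 3, 4, 2, 1, 4, 0],
              [2, 1, 2, 4, 3, 0, 0, 4, 3, 1],
              [2, 0, 1, 0, 3, 4, 4, 3, 2, 1]])
n = m.shape[0]

# A stable argsort lists, per row, the columns holding 0, 0, 1, 1, ..., n-1, n-1,
# with the two columns of each value in increasing order.
cols = np.argsort(m, axis=1, kind="stable").reshape(n * n, 2)

rows = np.repeat(np.arange(n), n)  # 0, 0, 0, 0, 0, 1, 1, ...
vals = np.tile(np.arange(n), n)    # 0, 1, 2, 3, 4, 0, 1, ...
t = np.column_stack((rows, vals, cols))
print(t[8])   # the row for (m_row=1, val=3)
```

This variant does not handle missing or surplus values; for such data the padded inds function above remains the safer choice.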
Consider the small sample of a 6-column integer array:
import numpy as np
J = np.array([[1, 3, 1, 3, 2, 5],
[2, 6, 3, 4, 2, 6],
[1, 7, 2, 5, 2, 5],
[4, 2, 8, 3, 8, 2],
[0, 3, 0, 3, 0, 3],
[2, 2, 3, 3, 2, 3],
[4, 3, 4, 3, 3, 4]])
I want to remove, from J:
a) all rows where the first and second PAIRS of elements are exact matches
(this removes rows like [1,3, 1,3, 2,5])
b) all rows where the second and third PAIRS of elements are exact matches
(this removes rows like [1,7, 2,5, 2,5])
Matches between any other pairs are OK.
I have a solution, below, but it is handled in two steps. If there is a more direct, cleaner, or more readily extendable approach, I'd be very interested.
K = J[~(np.logical_and(J[:,0] == J[:,2], J[:,1] == J[:,3]))]
L = K[~(np.logical_and(K[:,2] == K[:,4], K[:,3] == K[:,5]))]
K removes the 1st, 5th, and 7th rows from J, leaving
K = [[2, 6, 3, 4, 2, 6],
[1, 7, 2, 5, 2, 5],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]]
L removes the 2nd row from K, giving the final outcome.
L = [[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]]
I'm hoping for an efficient solution because, learning from this problem, I need to extend these ideas to 8-column arrays where
I eliminate rows having exact matches between the 1st and 2nd PAIRS, the 2nd and 3rd PAIRS, and the 3rd and 4th PAIRS.
Since we are checking adjacent pairs for equality, differencing on the data reshaped to 3D is one way to get a cleaner vectorized solution -
# a is input array
In [117]: b = a.reshape(a.shape[0],-1,2)
In [118]: a[~(np.diff(b,axis=1)==0).all(2).any(1)]
Out[118]:
array([[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
If you are going for performance, skip the differencing and go for sliced equality -
In [142]: a[~(b[:,:-1] == b[:,1:]).all(2).any(1)]
Out[142]:
array([[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
Generic number of columns
This extends just as well to a generic number of columns -
In [156]: a
Out[156]:
array([[1, 3, 1, 3, 2, 5, 1, 3, 1, 3, 2, 5],
[2, 6, 3, 4, 2, 6, 2, 6, 3, 4, 2, 6],
[1, 7, 2, 5, 2, 5, 1, 7, 2, 5, 2, 5],
[4, 2, 8, 3, 8, 2, 4, 2, 8, 3, 8, 2],
[0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3],
[2, 2, 3, 3, 2, 3, 2, 2, 3, 3, 2, 3],
[4, 3, 4, 3, 3, 4, 4, 3, 4, 3, 3, 4]])
In [158]: b = a.reshape(a.shape[0],-1,2)
In [159]: a[~(b[:,:-1] == b[:,1:]).all(2).any(1)]
Out[159]:
array([[4, 2, 8, 3, 8, 2, 4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3, 2, 2, 3, 3, 2, 3]])
Of course, we are assuming the number of columns is even, so that the columns can be paired.
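The sliced-equality idea can be wrapped in a reusable function. This is a minimal sketch; the function name drop_adjacent_pair_matches and the explicit check for an even column count are additions for illustration, not part of the answer above.

```python
import numpy as np

def drop_adjacent_pair_matches(a: np.ndarray) -> np.ndarray:
    """Return the rows of `a` in which no two neighbouring column pairs are equal."""
    if a.shape[1] % 2:
        raise ValueError("number of columns must be even")
    pairs = a.reshape(a.shape[0], -1, 2)                 # (rows, n_pairs, 2)
    same = (pairs[:, :-1] == pairs[:, 1:]).all(axis=2)   # (rows, n_pairs - 1)
    return a[~same.any(axis=1)]

J = np.array([[1, 3, 1, 3, 2, 5],
              [2, 6, 3, 4, 2, 6],
              [1, 7, 2, 5, 2, 5],
              [4, 2, 8, 3, 8, 2],
              [0, 3, 0, 3, 0, 3],
              [2, 2, 3, 3, 2, 3],
              [4, 3, 4, 3, 3, 4]])

result = drop_adjacent_pair_matches(J)
print(result)
```

The same call works unchanged for the 8-column case mentioned in the question, because the number of pairs is derived from the array's shape.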
What you have is quite reasonable. Here's what I would write:
def eliminate_pairs(x: np.ndarray) -> np.ndarray:
    # Pair 1 is columns 0-1, pair 2 is columns 2-3, pair 3 is columns 4-5
    first_second = (x[:, 0] == x[:, 2]) & (x[:, 1] == x[:, 3])
    second_third = (x[:, 2] == x[:, 4]) & (x[:, 3] == x[:, 5])
    return x[~(first_second | second_third)]
You could also apply De Morgan's laws and eliminate the extra not operation, but that's less important than clarity.
Let's try a loop:
mask = False
for i in range(0,3,2):
    mask = (J[:,i:i+2]==J[:,i+2:i+4]).all(1) | mask
J[~mask]
Output:
array([[2, 6, 3, 4, 2, 6],
[4, 2, 8, 3, 8, 2],
[2, 2, 3, 3, 2, 3]])
I have a 9x9 two-dimensional array that represents a sudoku game. I need to break it into its nine 3x3 components. How would this be done? I have absolutely no idea where to begin here.
game = [
[1, 3, 2, 5, 7, 9, 4, 6, 8],
[4, 9, 8, 2, 6, 1, 3, 7, 5],
[7, 5, 6, 3, 8, 4, 2, 1, 9],
[6, 4, 3, 1, 5, 8, 7, 9, 2],
[5, 2, 1, 7, 9, 3, 8, 4, 6],
[9, 8, 7, 4, 2, 6, 5, 3, 1],
[2, 1, 4, 9, 3, 5, 6, 8, 7],
[3, 6, 5, 8, 1, 7, 9, 2, 4],
[8, 7, 9, 6, 4, 2, 1, 5, 3]
]
Split into chunks, it becomes
chunk_1 = [
[1, 3, 2],
[4, 9, 8],
[7, 5, 6]
]
chunk_2 = [
[5, 7, 9],
[2, 6, 1],
[3, 8, 4]
]
...and so on
That was a fun exercise!
Answer
game.each_slice(3).map{|stripe| stripe.transpose.each_slice(3).map{|chunk| chunk.transpose}}.flatten(1)
It would be cumbersome and unnecessary to define separate variables chunk_1, chunk_2, and so on.
If you want chunk_2, you can use extract_chunks(game)[1] (extract_chunks is defined below).
It outputs [chunk_1, chunk_2, chunk_3, ..., chunk_9], so it's an Array of Arrays of Arrays:
1 3 2
4 9 8
7 5 6
5 7 9
2 6 1
3 8 4
4 6 8
3 7 5
2 1 9
6 4 3
5 2 1
...
You can define a method to check if this grid is valid (it is):
def extract_chunks(game)
  game.each_slice(3).map{|stripe| stripe.transpose.each_slice(3).map{|chunk| chunk.transpose}}.flatten(1)
end
class Array # NOTE: Use refinements if you don't want to patch Array
  def has_nine_unique_elements?
    self.flatten(1).uniq.size == 9
  end
end
def valid?(game)
  game.has_nine_unique_elements? &&
    game.all?{|row| row.has_nine_unique_elements? } &&
    # transpose turns the columns into rows so they can be checked the same way
    game.transpose.all?{|column| column.has_nine_unique_elements? } &&
    extract_chunks(game).all?{|chunk| chunk.has_nine_unique_elements? }
end
puts valid?(game) #=> true
Theory
The big grid can be sliced into 3 stripes, each containing 3 rows of 9 cells.
The first stripe will contain chunk_1, chunk_2 and chunk_3.
We need to cut the stripe vertically into 3 chunks. To do so, we:
transpose the stripe,
cut it horizontally with each_slice,
transpose each piece back again.
We do the same for stripes #2 and #3.
To avoid returning an Array of Stripes of Chunks of Rows of Cells, we use flatten(1) to remove one level and return an Array of Chunks of Rows of Cells. :)
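For comparison with the NumPy material elsewhere on this page, the same stripe-then-chunk decomposition can be sketched in Python. This is not part of the Ruby answer; it assumes NumPy is available, and the name chunks is illustrative.

```python
import numpy as np

game = np.array([
    [1, 3, 2, 5, 7, 9, 4, 6, 8],
    [4, 9, 8, 2, 6, 1, 3, 7, 5],
    [7, 5, 6, 3, 8, 4, 2, 1, 9],
    [6, 4, 3, 1, 5, 8, 7, 9, 2],
    [5, 2, 1, 7, 9, 3, 8, 4, 6],
    [9, 8, 7, 4, 2, 6, 5, 3, 1],
    [2, 1, 4, 9, 3, 5, 6, 8, 7],
    [3, 6, 5, 8, 1, 7, 9, 2, 4],
    [8, 7, 9, 6, 4, 2, 1, 5, 3],
])

# Split the row axis into (stripe, row-in-stripe) and the column axis into
# (block, column-in-block), then move the two block axes next to each other.
chunks = game.reshape(3, 3, 3, 3).swapaxes(1, 2).reshape(9, 3, 3)

print(chunks[1])  # the second chunk: rows 0-2, columns 3-5
```

Here swapaxes plays the role of the two transposes in the Ruby version, and the final reshape corresponds to flatten(1).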
The method Matrix#minor is tailor-made for this:
require 'matrix'
def sub3x3(game, i, j)
Matrix[*game].minor(3*i, 3, 3*j, 3).to_a
end
chunk1 = sub3x3(game, 0, 0)
#=> [[1, 3, 2], [4, 9, 8], [7, 5, 6]]
chunk2 = sub3x3(game, 0, 1)
#=> [[5, 7, 9], [2, 6, 1], [3, 8, 4]]
chunk3 = sub3x3(game, 0, 2)
#=> [[4, 6, 8], [3, 7, 5], [2, 1, 9]]
chunk4 = sub3x3(game, 1, 0)
#=> [[6, 4, 3], [5, 2, 1], [9, 8, 7]]
...
chunk9 = sub3x3(game, 2, 2)
#=> [[6, 8, 7], [9, 2, 4], [1, 5, 3]]
Ruby has no concept of "rows" and "columns" of arrays. For convenience, therefore, I will refer to the 3x3 "subarray" of game, at offsets i and j (i = 0,1,2, j = 0,1,2), as the 3x3 submatrix of m = Matrix[*game] whose upper left value is at row offset 3*i and column offset 3*j of m, converted to an array.
This is relatively inefficient as a new matrix is created for the calculation of each "chunk". Considering the size of the array, this is not a problem, but rather than making that more efficient you really need to rethink the overall design. Creating nine local variables (rather than, say, an array of nine arrays) is not the way to go.
Here's a suggestion for checking the validity of game (that uses the method sub3x3 above) once all the open cells have been filled. Note that I've used the Wiki description of the game, in which the only valid entries are the digits 1-9, and I have assumed the code enforces that requirement when players enter values into cells.
def invalid_vector_index(game)
  game.index { |vector| vector.uniq.size < 9 }
end
def sub3x3_invalid?(game, i, j)
  sub3x3(game, i, j).flatten.uniq.size < 9
end
def valid?(game)
  i = invalid_vector_index(game)
  return [:ROW_ERR, i] if i
  j = invalid_vector_index(game.transpose)
  return [:COL_ERR, j] if j
  (0..2).each do |i|
    (0..2).each do |j|
      return [:SUB_ERR, i, j] if sub3x3_invalid?(game, i, j)
    end
  end
  true
end
valid?(game)
#=> true
Notice this either returns true, meaning game is valid, or an array that both signifies that the solution is not valid and contains information that can be used to inform the player of the reason.
Now try
game[5], game[6] = game[6], game[5]
so
game
#=> [[1, 3, 2, 5, 7, 9, 4, 6, 8],
# [4, 9, 8, 2, 6, 1, 3, 7, 5],
# [7, 5, 6, 3, 8, 4, 2, 1, 9],
# [6, 4, 3, 1, 5, 8, 7, 9, 2],
# [5, 2, 1, 7, 9, 3, 8, 4, 6],
# [2, 1, 4, 9, 3, 5, 6, 8, 7],
# [9, 8, 7, 4, 2, 6, 5, 3, 1],
# [3, 6, 5, 8, 1, 7, 9, 2, 4],
# [8, 7, 9, 6, 4, 2, 1, 5, 3]]
valid?(game)
#=> [:SUB_ERR, 1, 0]
The rows and columns are obviously still valid, but this return value indicates that at least one 3x3 subarray is invalid and the array
[[6, 4, 3],
[5, 2, 1],
[2, 1, 4]]
was the first found to be invalid.
You could create a method that generates a single 3x3 chunk from a given index. Since the sudoku board has 9 rows, mapping over the indices 0 to 8 produces all nine 3x3 chunks (each flattened into a 9-element array). See below.
# Steps:
# loop through each index of the board (0 to 8)
# to get the x (row) offset,
# integer-divide the index by 3 and multiply by 3
# to get the y (column) offset,
# take the remainder of the index divided by 3 and multiply by 3
# for each of the 3 x values, collect the 3 y values
# this gives a single 3x3 box from one index
def three_by3(index, sudoku)
  # row offset of the box
  x=(index/3)*3
  # column offset of the box
  y=(index%3)*3
  (x...x+3).each_with_object([]) do |x,arr|
    (y...y+3).each do |y|
      arr<<sudoku[x][y]
    end
  end
end
sudoku = [ [1,2,3,4,5,6,7,8,9],
[2,3,4,5,6,7,8,9,1],
[3,4,5,6,7,8,9,1,2],
[1,2,3,4,5,6,7,8,9],
[2,3,4,5,6,7,8,9,1],
[3,4,5,6,7,8,9,1,2],
[1,2,3,4,5,6,7,8,9],
[2,3,4,5,6,7,8,9,1],
[3,4,5,6,7,8,9,1,2]]
p((0...sudoku.length).map {|i| three_by3(i,sudoku)})
#output:
#[[1, 2, 3, 2, 3, 4, 3, 4, 5],
# [4, 5, 6, 5, 6, 7, 6, 7, 8],
# [7, 8, 9, 8, 9, 1, 9, 1, 2],
# [1, 2, 3, 2, 3, 4, 3, 4, 5],
# [4, 5, 6, 5, 6, 7, 6, 7, 8],
# [7, 8, 9, 8, 9, 1, 9, 1, 2],
# [1, 2, 3, 2, 3, 4, 3, 4, 5],
# [4, 5, 6, 5, 6, 7, 6, 7, 8],
# [7, 8, 9, 8, 9, 1, 9, 1, 2]]
I am trying to generate all possible arrays of length 15, built from certain values, whose elements add up to 50.
$a = [3, 4, 1, 2, 5]
print $a.repeated_permutation(15).to_a
In this case,
[2,2,2,2,4,4,4,4,4,4,4,4,4,3,3]
[2,2,2,4,2,4,4,4,4,4,4,4,4,3,3]
[2,2,4,2,2,4,4,4,4,4,4,4,4,3,3]
are all possible answers.
After some investigation I realize the code to do this is a bit over my head, but I will leave the question up if it might help someone else.
For some reference as to what I am working on, Project Euler, problem 114. It's pretty difficult, and so I am attempting to solve only a single case where my 50-space-long grid is filled only with 3-unit-long blocks. The blocks must be separated by at least one blank, so I am counting the blocks as 4. This (with some tweaking, which I have left out as this is confusing enough already) allows for twelve blocks plus three single blanks, or a maximum of fifteen elements.
Approach
I think recursion is the way to go here, where your recursive method looks like this:
def recurse(n,t)
where
n is the number of elements required; and
t is the required total.
If we let @arr be the array of integers you are given, recurse(n,t) returns an array of all permutations of n elements from @arr that sum to t.
Assumption
I have assumed that the elements of @arr are non-negative integers, sorted by size, but the method can be easily modified if it includes negative integers (though performance will suffer). Without loss of generality, we can assume the elements of @arr are unique, sorted by increasing magnitude.
Code
def recurse(n,t)
  if n == 1
    @arr.include?(t) ? [[t]] : nil
  else
    @arr.each_with_object([]) do |i,a|
      # as elements of @arr are non-decreasing; return what has been collected so far
      break a if i > t
      if (ret = recurse(n-1,t-i))
        ret.each { |b| a << [i,*b] }
      end
    end
  end
end
Examples
@arr = [3, 4, 1, 2, 5].sort
#=> [1, 2, 3, 4, 5]
recurse(1,4)
#=> [[4]]
recurse(2,6)
#=> [[1, 5], [2, 4], [3, 3], [4, 2], [5, 1]]
recurse(3,10)
#=> [[1, 4, 5], [1, 5, 4], [2, 3, 5], [2, 4, 4], [2, 5, 3],
# [3, 2, 5], [3, 3, 4], [3, 4, 3], [3, 5, 2], [4, 1, 5],
# [4, 2, 4], [4, 3, 3], [4, 4, 2], [4, 5, 1], [5, 1, 4],
# [5, 2, 3], [5, 3, 2], [5, 4, 1]]
recurse(3,50)
#=> []
Improvement
We can do better, however, by first computing all combinations, and then computing the permutations of each of those combinations.
def combo_recurse(n,t,last=0)
  ndx = @arr.index { |i| i >= last }
  return nil if ndx.nil?
  arr_above = @arr[ndx..-1]
  if n == 1
    arr_above.include?(t) ? [[t]] : nil
  else
    arr_above.each_with_object([]) do |i,a|
      # as elements of @arr are non-decreasing; return what has been collected so far
      break a if i > t
      if (ret = combo_recurse(n-1,t-i,i))
        ret.each { |b| a << [i,*b] }
      end
    end
  end
end
combo_recurse(1,4)
#=> [[4]]
combo_recurse(2,6)
#=> [[1, 5], [2, 4], [3, 3]]
combo_recurse(3,10)
#=> [[1, 4, 5], [2, 3, 5], [2, 4, 4], [3, 3, 4]]
combo_recurse(3,50)
#=> []
combo_recurse(15,50).size
#=> 132
combo_recurse(15,50).first(5)
#=> [[1, 1, 1, 1, 1, 1, 4, 5, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 2, 3, 5, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 2, 4, 4, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 3, 3, 4, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5]]
We can then compute the permutations from the combinations:
combo_recurse(2,6).flat_map { |a| a.permutation(a.size).to_a }.uniq
#=> [[1, 5], [5, 1], [2, 4], [4, 2], [3, 3]]
combo_recurse(3,10).flat_map { |a| a.permutation(a.size).to_a }.uniq
#=> [[1, 4, 5], [1, 5, 4], [4, 1, 5], [4, 5, 1], [5, 1, 4],
# [5, 4, 1], [2, 3, 5], [2, 5, 3], [3, 2, 5], [3, 5, 2],
# [5, 2, 3], [5, 3, 2], [2, 4, 4], [4, 2, 4], [4, 4, 2],
# [3, 3, 4], [3, 4, 3], [4, 3, 3]]
We can bound the number of permutations for (15,50) from above (the bound is loose because uniq is not applied, so orderings that differ only by swapping equal elements are counted more than once):
def factorial(n)
  (1..n).reduce :*
end
combo_recurse(15,50).reduce(0) { |t,a| t + factorial(a.size) }
#=> 172613016576000
That is, at most about 1.7 × 10^14 permutations (132 combinations, each with at most 15! orderings). What platform will you be running this on?
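If only the number of such arrays is needed, as is usual for a Project Euler problem, the arrays never have to be built. The following is a sketch in Python rather than Ruby, to match the other added examples on this page; count_sequences and VALUES are hypothetical names, and the recursion mirrors recurse(n,t) above but returns a count instead of a list.

```python
from functools import lru_cache

VALUES = (1, 2, 3, 4, 5)  # the allowed element values from the question

@lru_cache(maxsize=None)
def count_sequences(n: int, t: int) -> int:
    """Number of ordered length-n sequences over VALUES that sum to t."""
    if n == 0:
        return 1 if t == 0 else 0
    # Choose the first element v, then count completions of the remainder.
    return sum(count_sequences(n - 1, t - v) for v in VALUES if v <= t)

print(count_sequences(2, 6))    # matches the 5 results of recurse(2,6)
print(count_sequences(3, 10))   # matches the 18 results of recurse(3,10)
print(count_sequences(15, 50))  # exact count for the case in the question
```

Because results are memoized on (n, t), only a few hundred subproblems are evaluated, however many sequences there are.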