Looking for some help with numpy and building a 3D array from multiple 2D arrays. I want to write a loop such that on every iteration I make a new 2D array and add it as a new slice to an existing 3D array. Here's my code sample.
>>> import numpy as np
>>> import random
>>> import array
>>> a = np.random.randint(0, 9, size=(10, 10))  # make random 10x10 matrix
>>> b = a  # keep the first matrix
>>> a = np.random.randint(0, 9, size=(10, 10))  # make another random 10x10 matrix
>>> a.shape  # verify it's 10x10
(10, 10)
>>> b.shape  # verify it's 10x10
(10, 10)
>>> b = np.array([b, a])  # combine two 2D matrices into one 3D matrix
>>> b.shape  # verify it's a 3D matrix with two planes
(2, 10, 10)
>>> a = np.random.randint(0, 9, size=(10, 10))  # make new random 10x10 matrix
>>> b = np.array([b, a])  # add new 2D plane to the 3D matrix
>>> b.shape  # should be (3, 10, 10)
(2,)
Can anyone see what I'm doing wrong?
When you combine arrays with np.array([...]), they all have to be the same shape. Here b is (2, 10, 10) and a is (10, 10), so numpy does not treat them as numeric arrays to stack, but as two opaque objects. On older numpy versions you should have seen a warning when you ran the last b = np.array([b, a]) (numpy 1.24 and later raise a ValueError instead):
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
Instead, use np.stack:
b = np.stack([*b, a])
*b unpacks b into its 2D slices, so the above is equivalent to b = np.stack([b[0], b[1], a])
Or you can use np.vstack (vertical stack):
b = np.vstack([b, a[None]])
a[None] adds a new leading axis to a: a.shape == (10, 10), while a[None].shape == (1, 10, 10)
Both of the above produce the following:
>>> b.shape
(3, 10, 10)
>>> b
array([[[3, 8, 0, 2, 8, 0, 0, 5, 7, 7],
[0, 5, 2, 8, 8, 2, 1, 4, 5, 8],
[3, 2, 2, 4, 1, 8, 2, 0, 7, 5],
[5, 6, 5, 0, 8, 7, 4, 0, 4, 6],
[6, 2, 3, 7, 4, 3, 6, 6, 4, 8],
[2, 5, 1, 7, 1, 3, 0, 6, 0, 5],
[3, 4, 0, 7, 3, 4, 5, 0, 7, 4],
[0, 7, 2, 8, 7, 7, 4, 3, 2, 6],
[4, 6, 2, 5, 5, 8, 5, 8, 0, 8],
[3, 4, 1, 0, 3, 7, 0, 6, 7, 3]],
[[4, 0, 6, 2, 4, 4, 7, 0, 7, 2],
[5, 8, 5, 8, 2, 8, 3, 7, 4, 6],
[2, 1, 2, 0, 4, 5, 6, 3, 0, 0],
[8, 7, 3, 0, 8, 8, 0, 4, 1, 4],
[0, 2, 5, 7, 5, 3, 0, 5, 1, 7],
[1, 5, 8, 0, 2, 6, 5, 0, 3, 2],
[4, 4, 4, 3, 3, 8, 6, 6, 5, 5],
[5, 3, 6, 8, 0, 3, 0, 8, 8, 3],
[4, 2, 6, 6, 6, 2, 0, 0, 6, 2],
[7, 3, 8, 0, 7, 1, 1, 8, 6, 2]],
[[6, 6, 1, 1, 6, 4, 6, 2, 6, 7],
[0, 5, 6, 7, 5, 0, 0, 5, 8, 2],
[6, 6, 1, 5, 2, 3, 2, 3, 3, 2],
[0, 3, 7, 6, 4, 5, 3, 1, 7, 2],
[7, 6, 3, 0, 1, 7, 8, 3, 8, 5],
[3, 1, 8, 6, 1, 5, 0, 8, 6, 1],
[1, 4, 8, 1, 7, 0, 1, 1, 5, 3],
[2, 1, 4, 8, 2, 3, 1, 6, 8, 7],
[8, 1, 1, 0, 6, 1, 0, 6, 1, 6],
[1, 8, 4, 7, 7, 5, 0, 3, 8, 6]]])
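Since the question is about adding a new slice on every loop iteration, one alternative worth considering is to collect the 2D arrays in a plain Python list and call np.stack once at the end. This is only a minimal sketch, not part of the answer above, and n_slices is a made-up example value. It avoids re-copying the growing 3D array on every pass:

```python
import numpy as np

n_slices = 5  # hypothetical number of loop iterations

planes = []
for _ in range(n_slices):
    a = np.random.randint(0, 9, size=(10, 10))  # new 10x10 matrix on each pass
    planes.append(a)

# Stack once along a new leading axis:
# n_slices arrays of shape (10, 10) -> one array of shape (n_slices, 10, 10)
b = np.stack(planes)
print(b.shape)
```

If the 3D array is needed inside the loop as well, the np.stack([*b, a]) form shown above still works on each iteration, at the cost of one full copy per pass.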
Consider the array sample A.
import numpy as np
A = np.array([[2, 3, 6, 7, 3, 6, 7, 2],
[2, 3, 6, 7, 3, 6, 7, 7],
[2, 4, 3, 4, 6, 4, 9, 4],
[4, 9, 0, 1, 2, 5, 3, 0],
[5, 5, 2, 5, 4, 3, 7, 5],
[7, 5, 4, 8, 0, 1, 2, 6],
[7, 5, 4, 7, 3, 8, 0, 7]])
PROBLEM: I want to identify rows that have a specified number of DISTINCT element copies. The code needs to be able to answer questions like "which rows of A have exactly 4 elements that appear twice?", or "which rows of A have exactly 1 element that appears three times?" The following code comes close:
r,c = A.shape
nCopies = 4
s = np.sort(A,axis=1)
out = A[((s[:,1:] != s[:,:-1]).sum(axis=1)+1 == c - nCopies)]
This produces 2 output rows, both having 4 copied elements.
The 1st row has copies of 2,3,6,7. The 2nd row has copies of 3,6,7,7:
array([[2, 3, 6, 7, 3, 6, 7, 2],
[2, 3, 6, 7, 3, 6, 7, 7]])
My problem is that I don't want the 2nd output row, because it only has 3 DISTINCT copies (i.e., 3, 6, 7).
How can the code be modified to identify only distinct copies?
If I understand correctly, you want the rows of A that have exactly 4 distinct values, each of which appears more than once. You can use np.unique with return_counts=True, which returns two arrays: the distinct values and the count of each value.
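# one (distinct_values, counts) tuple per row
counts = [np.unique(row, return_counts=True) for row in A]
# keep a row if every distinct value appears more than once and there are exactly 4 of them
valid_indices = [np.all(row[1] > 1) and row[0].shape[0] == 4 for row in counts]
valid_rows = A[valid_indices]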
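To cover the more general questions from the post ("exactly 1 element that appears three times"), the same np.unique idea can be parameterised. The sketch below is an assumption about what is wanted: it reads "appears twice" as "appears exactly twice", and the helper name rows_with_repeats and its parameters n_values and n_times are made up for illustration.

```python
import numpy as np

def rows_with_repeats(A, n_values, n_times):
    """Return the rows of A in which exactly n_values distinct values
    occur exactly n_times each (hypothetical helper, see assumptions above)."""
    keep = []
    for row in A:
        _, counts = np.unique(row, return_counts=True)
        # number of distinct values whose count equals n_times
        keep.append(np.count_nonzero(counts == n_times) == n_values)
    return A[np.array(keep, dtype=bool)]

A = np.array([[2, 3, 6, 7, 3, 6, 7, 2],
              [2, 3, 6, 7, 3, 6, 7, 7],
              [2, 4, 3, 4, 6, 4, 9, 4],
              [4, 9, 0, 1, 2, 5, 3, 0],
              [5, 5, 2, 5, 4, 3, 7, 5],
              [7, 5, 4, 8, 0, 1, 2, 6],
              [7, 5, 4, 7, 3, 8, 0, 7]])

print(rows_with_repeats(A, 4, 2))  # rows with exactly 4 values appearing twice
print(rows_with_repeats(A, 1, 3))  # rows with exactly 1 value appearing three times
```

With the sample A, the first call keeps only the first row, and the second call keeps the second and last rows (each has a single value, 7, that appears three times).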
Consider the following code fragment:
import numpy as np
mask = np.array([True, True, False, True, True, False])
val = np.array([9, 3])
arr = np.random.randint(1, 10, size=(5, len(mask)))  # high is exclusive, so 10 gives values 1 to 9
As expected, we get an array of random integers, 1 to 9, with 5 rows and 6 columns as below. The val array has not been used yet.
[[2, 7, 6, 9, 7, 5],
[7, 2, 9, 7, 8, 3],
[9, 1, 3, 5, 7, 3],
[5, 7, 4, 4, 5, 2],
[7, 7, 9, 6, 9, 8]]
Now I'll introduce val = [9, 3].
Where mask = True, I want the row element to be taken randomly from 1 to 9.
Where mask = False, I want the row element to be taken randomly from 1 to 3.
How can this be done efficiently? A sample output is shown below.
[[2, 7, 2, 9, 7, 1],
[7, 2, 1, 7, 8, 3],
[9, 1, 3, 5, 7, 3],
[5, 7, 1, 4, 5, 2],
[7, 7, 2, 6, 9, 1]]
One idea is to sample uniformly between 0 and 1, multiply by 9 or 3 depending on mask, truncate to an integer, and finally add 1 to shift the sample.
rand = np.random.rand(5,len(mask))
is3 = (1-mask).astype(int)
# out is random from 0-8 or 0-2 depending on `is3`
out = (rand*val[is3]).astype(int)
# move out by `1`:
out = (out + 1)
Output:
array([[4, 9, 3, 6, 2, 1],
[1, 8, 2, 7, 1, 3],
[8, 2, 1, 2, 3, 2],
[4, 3, 2, 2, 3, 2],
[5, 8, 1, 5, 6, 1]])
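Another possibility, offered as a sketch rather than as a replacement for the answer above, is to let numpy draw the integers directly with a per-column upper bound. Generator.integers accepts array-like bounds that broadcast against size. The name upper is made up here, and the mapping of val[0] to mask = True and val[1] to mask = False follows the question's description.

```python
import numpy as np

mask = np.array([True, True, False, True, True, False])
val = np.array([9, 3])

rng = np.random.default_rng()

# Per-column inclusive upper bound: val[0] where mask is True, val[1] where it is False
upper = np.where(mask, val[0], val[1])

# low=1 and high=upper broadcast against size; endpoint=True makes the upper bound inclusive
out = rng.integers(1, upper, size=(5, len(mask)), endpoint=True)
print(out)
```

Each column where mask is True then holds values from 1 to 9, and each column where it is False holds values from 1 to 3.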
We are given an array sample a, shown below, and a constant c.
import numpy as np
a = np.array([[1, 3, 1, 11, 9, 14],
[2, 12, 1, 10, 7, 6],
[6, 7, 2, 14, 2, 15],
[14, 8, 1, 3, -7, 2],
[0, -3, 0, 3, -3, 0],
[2, 2, 3, 3, 12, 13],
[3, 14, 4, 12, 1, 4],
[0, 13, 13, 4, 0, 3]])
c = 2
It is convenient, in this problem, to think of each array row as being composed of three pairs, so the 1st row is [1,3, 1,11, 9,14].
DEFINITION: d_min is the minimum difference between the elements of two consecutive pairs.
The PROBLEM: I want to retain rows of array a, where all consecutive pairs have d_min <= c. Otherwise, the rows should be eliminated.
In the 1st array row, the 1st pair (1,3) and the 2nd pair (1,11) have d_min = 1-1=0.
The 2nd pair (1,11) and the 3rd pair(9,14) have d_min = 11-9=2. (in both cases, d_min<=c, so we keep this row in a)
In the 2nd array row, the 1st pair (2,12) and the 2nd pair (1,10) have d_min = 2-1=1.
But, the 2nd pair (1,10) and the 3rd pair(7,6) have d_min = 10-7=3. (3 > c, so this row should be eliminated from array a)
Current efforts: I currently handle this problem with nested for-loops (2 deep).
The outer loop runs through the rows of array a, determining d_min between the first two pairs using:
for r in a:
    d_min = np.amin(np.abs(np.subtract.outer(r[:2], r[2:4])))
The inner loop uses the same method to determine the d_min between the last two pairs.
Further processing is done only when d_min <= c for both sets of consecutive pairs.
I'm really hoping there is a way to avoid the for-loops. I eventually need to deal with 8-column arrays, and my current approach would involve 3-deep looping.
In the example, there are 4 row eliminations. The final result should look like:
a = np.array([[1, 3, 1, 11, 9, 14],
[0, -3, 0, 3, -3, 0],
[3, 14, 4, 12, 1, 4],
[0, 13, 13, 4, 0, 3]])
Assume the number of elements in each row is always even:
import numpy as np
a = np.array([[1, 3, 1, 11, 9, 14],
[2, 12, 1, 10, 7, 6],
[6, 7, 2, 14, 2, 15],
[14, 8, 1, 3, -7, 2],
[0, -3, 0, 3, -3, 0],
[2, 2, 3, 3, 12, 13],
[3, 14, 4, 12, 1, 4],
[0, 13, 13, 4, 0, 3]])
c = 2
# separate the array as previous pairs and next pairs
sx, sy = a.shape
prev_shape = sx, (sy - 2) // 2, 1, 2
next_shape = sx, (sy - 2) // 2, 2, 1
prev_pairs = a[:, :-2].reshape(prev_shape)
next_pairs = a[:, 2:].reshape(next_shape)
# subtract which will effectively work as outer subtraction due to numpy broadcasting, and
# calculate the minimum difference for each pair
pair_diff_min = np.abs(prev_pairs - next_pairs).min(axis=(2, 3))
# calculate the filter condition as boolean array
to_keep = pair_diff_min.max(axis=1) <= c
print(a[to_keep])
#[[ 1 3 1 11 9 14]
# [ 0 -3 0 3 -3 0]
# [ 3 14 4 12 1 4]
# [ 0 13 13 4 0 3]]
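Because the question mentions 8-column arrays, it can be reassuring to check the broadcasting version against a plain loop built on np.subtract.outer. The sketch below wraps the code above in a function and compares it with such a loop on random data; the function names and the random test array are made up for illustration.

```python
import numpy as np

def keep_rows_vectorized(a, c):
    """Broadcasting approach from the answer, wrapped in a function."""
    sx, sy = a.shape
    n = (sy - 2) // 2  # number of consecutive pair-to-pair comparisons per row
    prev_pairs = a[:, :-2].reshape(sx, n, 1, 2)
    next_pairs = a[:, 2:].reshape(sx, n, 2, 1)
    d_min = np.abs(prev_pairs - next_pairs).min(axis=(2, 3))
    return a[d_min.max(axis=1) <= c]

def keep_rows_loop(a, c):
    """Reference version: np.subtract.outer on each two consecutive pairs."""
    keep = []
    for r in a:
        pairs = r.reshape(-1, 2)
        keep.append(all(
            np.abs(np.subtract.outer(p, q)).min() <= c
            for p, q in zip(pairs[:-1], pairs[1:])
        ))
    return a[np.array(keep, dtype=bool)]

rng = np.random.default_rng(0)
wide = rng.integers(-5, 16, size=(200, 8))  # made-up 8-column test data
print(np.array_equal(keep_rows_vectorized(wide, 2), keep_rows_loop(wide, 2)))
```

The vectorized function needs no change for wider rows, since the number of pair comparisons is derived from the column count.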
I have a 9x9 multidimensional array that represents a sudoku game. I need to break it into its nine 3x3 components. How would this be done? I have absolutely no idea where to begin here.
game = [
[1, 3, 2, 5, 7, 9, 4, 6, 8],
[4, 9, 8, 2, 6, 1, 3, 7, 5],
[7, 5, 6, 3, 8, 4, 2, 1, 9],
[6, 4, 3, 1, 5, 8, 7, 9, 2],
[5, 2, 1, 7, 9, 3, 8, 4, 6],
[9, 8, 7, 4, 2, 6, 5, 3, 1],
[2, 1, 4, 9, 3, 5, 6, 8, 7],
[3, 6, 5, 8, 1, 7, 9, 2, 4],
[8, 7, 9, 6, 4, 2, 1, 5, 3]
]
Split into chunks, it becomes
chunk_1 = [
[1, 3, 2],
[4, 9, 8],
[7, 5, 6]
]
chunk_2 = [
[5, 7, 9],
[2, 6, 1],
[3, 8, 4]
]
...and so on
That was a fun exercise!
Answer
game.each_slice(3).map{|stripe| stripe.transpose.each_slice(3).map{|chunk| chunk.transpose}}.flatten(1)
Defining every chunk_1, chunk_2, ... separately would be cumbersome and unnecessary.
If you want chunk_2, you can use extract_chunks(game)[1], where extract_chunks is the method defined further down.
It outputs [chunk_1, chunk_2, chunk_3, ..., chunk_9], so it's an Array of Arrays of Arrays :
1 3 2
4 9 8
7 5 6
5 7 9
2 6 1
3 8 4
4 6 8
3 7 5
2 1 9
6 4 3
5 2 1
...
You can define a method to check if this grid is valid (it is) :
def extract_chunks(game)
  game.each_slice(3).map{|stripe| stripe.transpose.each_slice(3).map{|chunk| chunk.transpose}}.flatten(1)
end
class Array # NOTE: Use refinements if you don't want to patch Array
  def has_nine_unique_elements?
    self.flatten(1).uniq.size == 9
  end
end
def valid?(game)
  game.has_nine_unique_elements? &&
    game.all?{|row| row.has_nine_unique_elements? } &&
    # transpose so that we iterate over columns rather than rows again
    game.transpose.all?{|column| column.has_nine_unique_elements? } &&
    extract_chunks(game).all?{|chunk| chunk.has_nine_unique_elements? }
end
puts valid?(game) #=> true
Theory
The big grid can be sliced in 3 stripes, each containing 3 rows of 9 cells.
The first stripe will contain chunk_1, chunk_2 and chunk_3.
We need to cut the stripe vertically into 3 chunks. To do so:
We transpose the stripe,
cut it horizontally with each_slice,
and transpose back again.
We do the same for stripes #2 and #3.
To avoid returning an Array of Stripes of Chunks of Rows of Cells, we use flatten(1) to remove one level and return an Array of Chunks of Rows of Cells. :)
The method Matrix#minor is tailor-made for this:
require 'matrix'
def sub3x3(game, i, j)
Matrix[*game].minor(3*i, 3, 3*j, 3).to_a
end
chunk1 = sub3x3(game, 0, 0)
#=> [[1, 3, 2], [4, 9, 8], [7, 5, 6]]
chunk2 = sub3x3(game, 0, 1)
#=> [[5, 7, 9], [2, 6, 1], [3, 8, 4]]
chunk3 = sub3x3(game, 0, 2)
#=> [[4, 6, 8], [3, 7, 5], [2, 1, 9]]
chunk4 = sub3x3(game, 1, 0)
#=> [[6, 4, 3], [5, 2, 1], [9, 8, 7]]
...
chunk9 = sub3x3(game, 2, 2)
#=> [[6, 8, 7], [9, 2, 4], [1, 5, 3]]
Ruby has no concept of "rows" and "columns" of arrays. For convenience, therefore, I will refer to the 3x3 "subarray" of game, at offsets i and j (i = 0,1,2, j = 0,1,2), as the 3x3 submatrix of m = Matrix[*game] whose upper left value is at row offset 3*i and column offset 3*j of m, converted to an array.
This is relatively inefficient as a new matrix is created for the calculation of each "chunk". Considering the size of the array, this is not a problem, but rather than making that more efficient you really need to rethink the overall design. Creating nine local variables (rather than, say, an array of nine arrays) is not the way to go.
Here's a suggestion for checking the validity of game (that uses the method sub3x3 above) once all the open cells have been filled. Note that I've used the Wiki description of the game, in which the only valid entries are the digits 1-9, and I have assumed the code enforces that requirement when players enter values into cells.
def invalid_vector_index(game)
  game.index { |vector| vector.uniq.size < 9 }
end
def sub3x3_invalid?(game, i, j)
  sub3x3(game, i, j).flatten.uniq.size < 9
end
def valid?(game)
  i = invalid_vector_index(game)
  return [:ROW_ERR, i] if i
  j = invalid_vector_index(game.transpose)
  return [:COL_ERR, j] if j
  (0..2).each do |i|
    (0..2).each do |j|
      return [:SUB_ERR, i, j] if sub3x3_invalid?(game, i, j)
    end
  end
  true
end
valid?(game)
#=> true
Notice this either returns true, meaning game is valid, or an array that both signifies that the solution is not valid and contains information that can be used to inform the player of the reason.
Now try
game[5], game[6] = game[6], game[5]
so
game
#=> [[1, 3, 2, 5, 7, 9, 4, 6, 8],
# [4, 9, 8, 2, 6, 1, 3, 7, 5],
# [7, 5, 6, 3, 8, 4, 2, 1, 9],
# [6, 4, 3, 1, 5, 8, 7, 9, 2],
# [5, 2, 1, 7, 9, 3, 8, 4, 6],
# [2, 1, 4, 9, 3, 5, 6, 8, 7],
# [9, 8, 7, 4, 2, 6, 5, 3, 1],
# [3, 6, 5, 8, 1, 7, 9, 2, 4],
# [8, 7, 9, 6, 4, 2, 1, 5, 3]]
valid?(game)
#=> [:SUB_ERR, 1, 0]
The rows and columns are obviously still valid, but this return value indicates that at least one 3x3 subarray is invalid and the array
[[6, 4, 3],
[5, 2, 1],
[2, 1, 4]]
was the first found to be invalid.
You could create a method that generates a single 3x3 chunk from a given index. Since the sudoku board has length 9, calling it for each index will produce the nine 3x3 chunks for you. See below.
#steps
#you'll loop through each index of the board
#to get the x value
#you divide the index by 3 and multiply by 3
#to get the y value
#you divide the index by 3, take remainder and multiply by 3
#for each x value, you can get 3 y values
#this will give you a single 3X3 box from one index so
def three_by3(index, sudoku)
  #to get x value
  x=(index/3)*3
  #to get y value
  y=(index%3)*3
  (x...x+3).each_with_object([]) do |x,arr|
    (y...y+3).each do |y|
      arr<<sudoku[x][y]
    end
  end
end
sudoku = [ [1,2,3,4,5,6,7,8,9],
[2,3,4,5,6,7,8,9,1],
[3,4,5,6,7,8,9,1,2],
[1,2,3,4,5,6,7,8,9],
[2,3,4,5,6,7,8,9,1],
[3,4,5,6,7,8,9,1,2],
[1,2,3,4,5,6,7,8,9],
[2,3,4,5,6,7,8,9,1],
[3,4,5,6,7,8,9,1,2]]
p (0...sudoku.length).map {|i| three_by3(i,sudoku)}
#output:
#[[1, 2, 3, 2, 3, 4, 3, 4, 5],
# [4, 5, 6, 5, 6, 7, 6, 7, 8],
# [7, 8, 9, 8, 9, 1, 9, 1, 2],
# [1, 2, 3, 2, 3, 4, 3, 4, 5],
# [4, 5, 6, 5, 6, 7, 6, 7, 8],
# [7, 8, 9, 8, 9, 1, 9, 1, 2],
# [1, 2, 3, 2, 3, 4, 3, 4, 5],
# [4, 5, 6, 5, 6, 7, 6, 7, 8],
# [7, 8, 9, 8, 9, 1, 9, 1, 2]]