Subtract from specific elements of a 2D NumPy array

Assume there is a matrix X, a mask and a vector y
>>> X
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> mask
array([[False, True, True, True],
[ True, False, True, True],
[ True, True, False, True],
[ True, True, True, False]], dtype=bool)
>>> y
[8, 9, 10]
I want to subtract y from the elements of each row of X where mask is True. I can get the result like this:
>>> X[mask].reshape(4,3)-y
array([[-7, -7, -7],
[-4, -3, -3],
[ 0, 0, 1],
[ 4, 4, 4]])
But I want to keep X as a 4x4 matrix. That is, where the mask is False, the value of X should not change. What should I do? Thanks.

Two approaches can be suggested for in-place edits.
Approach #1: Boolean-index into X and reshape the selected elements so that each row has as many elements as y. Subtract y from it, which broadcasts across the rows. Finally, index into X with the same mask and assign the flattened result:
X[mask] = (X[mask].reshape(X.shape[0],-1) - y).ravel()
Approach #2: Resize y so that it has as many elements as there are True elements in mask (np.resize repeats y cyclically), then simply subtract it from the masked places in X:
X[mask] -= np.resize(y,mask.sum())
Sample runs:
In [55]: X # Input array
Out[55]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
# Using approach #1
In [56]: X[mask] = (X[mask].reshape(X.shape[0],-1) - y).ravel()
In [57]: X # Changed input array
Out[57]:
array([[ 0, -7, -7, -7],
[-4, 5, -3, -3],
[ 0, 0, 10, 1],
[ 4, 4, 4, 15]])
In [59]: X # Input array
Out[59]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
# Using approach #2
In [60]: X[mask] -= np.resize(y,mask.sum())
In [61]: X # Changed input array
Out[61]:
array([[ 0, -7, -7, -7],
[-4, 5, -3, -3],
[ 0, 0, 10, 1],
[ 4, 4, 4, 15]])
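Both approaches rely on every row of mask containing exactly len(y) True entries; otherwise the subtracted values no longer line up with y. The following is a minimal self-contained sketch that rebuilds the sample inputs and checks that the two approaches agree. Building the mask with np.eye and working on copies named out1 and out2 are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

X = np.arange(16).reshape(4, 4)
mask = ~np.eye(4, dtype=bool)      # False on the diagonal, True elsewhere
y = np.array([8, 9, 10])

# Approach #1 on a copy, so X itself stays available for comparison
out1 = X.copy()
out1[mask] = (out1[mask].reshape(out1.shape[0], -1) - y).ravel()

# Approach #2 on another copy
out2 = X.copy()
out2[mask] -= np.resize(y, mask.sum())

print(np.array_equal(out1, out2))
print(out1)
```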

Related

Sort a numpy array using multiple index and different order

I have an array of size 300x5 and I am trying to sort it so that the column with index 4 is the primary key in ascending order, index 1 is the secondary key in descending order, and index 3 is the tertiary key in ascending order.
I have tried this using the following code:
idx = np.lexsort((arr[:,3],arr[:,1][::-1],arr[:,4]))
arr= arr[idx]
where arr is an array of size 300x5.
On executing this, the secondary key also gets sorted in ascending order instead of descending order. Can anyone help me with this?
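I think you want -arr[:,1] and not arr[:,1][::-1] as the secondary key. Slicing with [::-1] reverses the order of the rows in that column, so each value no longer belongs to its own row; it does not reverse the sort direction. Negating the values does.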
>>> import numpy as np
>>> arr = np.random.randint(0, 21, (300, 5))
>>> arr
array([[ 0, 19, 6, 19, 17],
[16, 2, 14, 17, 0],
[ 8, 17, 3, 17, 12],
...,
[ 4, 18, 18, 3, 8],
[10, 15, 4, 12, 4],
[ 9, 16, 12, 0, 12]])
>>> idx = np.lexsort((arr[:,3],-arr[:,1],arr[:,4]))
>>> arr = arr[idx]
>>> arr
array([[11, 20, 11, 18, 0],
[11, 16, 12, 2, 0],
[ 9, 16, 4, 8, 0],
...,
[20, 4, 5, 11, 20],
[ 9, 4, 0, 19, 20],
[ 9, 2, 4, 10, 20]])
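On a 300x5 random array the effect is hard to check by eye, so here is a small sketch on a hand-made 4x5 array (the values are invented for illustration). Note that np.lexsort treats the last key as the primary one, and that negation is only a safe way to get descending order for signed numeric data; unsigned integers would wrap around.

```python
import numpy as np

# Hypothetical data: column 4 is the primary key, column 1 the secondary
# (descending), column 3 the tertiary key.
arr = np.array([[0, 1, 5, 2, 7],
                [0, 3, 5, 1, 7],
                [0, 3, 5, 0, 7],
                [0, 2, 5, 9, 4]])

# Keys are listed from least to most significant.
idx = np.lexsort((arr[:, 3], -arr[:, 1], arr[:, 4]))
result = arr[idx]

print(idx)
print(result)
```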

Numpy argsort while distinguishing values of 0

I have a very large array but here I will show a simplified case:
a = np.array([[3, 0, 5, 0], [8, 7, 6, 10], [5, 4, 0, 10]])
array([[ 3, 0, 5, 0],
[ 8, 7, 6, 10],
[ 5, 4, 0, 10]])
I want to argsort() the array but have a way to distinguish 0s. I tried to replace them with NaN:
a = np.array([[3, np.nan, 5, np.nan], [8, 7, 6, 10], [5, 4, np.nan, 10]])
a.argsort()
array([[0, 2, 1, 3],
[2, 1, 0, 3],
[1, 0, 3, 2]])
But the NaNs are still being sorted. Is there any way to make argsort give them a value of -1 or something similar? Or is there another option other than NaN to replace 0s? I tried math.inf with no success as well. Does anybody have any ideas?
The purpose of doing this is that I have a cosine similarity matrix, and I want to exclude those instances where similarities are 0. I am using argsort() to get the highest similarities, which will give me the indices to another table with mappings to labels. If an array's entire similarity is 0 ([0,0,0]), then I want to ignore it. So if I can get argsort() to output it as [-1,-1,-1] after sorting, I can check to see if the entire array is -1 and exclude it.
EDIT:
So output should be:
array([[0, 2, -1, -1],
[2, 1, 0, 3],
[1, 0, 3, -1]])
So, using the last row to refer back to a: the smallest is at index 1, which is 4, followed by index 0, which is 5, then index 3, which is 10, and finally -1, which stands for the 0.
You may want to use numpy.ma.array(), like this:
a = np.array([[3,4,5],[8,7,6],[5,4,0]])
Mask this array with the condition a==0:
a_mask = np.ma.array(a, mask=(a==0))
print(repr(a_mask))
# output
masked_array(
data=[[3, 4, 5],
[8, 7, 6],
[5, 4, --]],
mask=[[False, False, False],
[False, False, False],
[False, False, True]],
fill_value=999999)
print(repr(a_mask.mask))
# output
array([[False, False, False],
[False, False, False],
[False, False, True]])
You can then use the mask attribute of the masked array to identify the elements you want to label and fill in other values.
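As a hedged sketch of how the masked array could be taken all the way to the output requested in the question: MaskedArray.argsort accepts endwith=True, which treats masked entries as the largest values so they sort last, and the trailing positions of each row can then be overwritten with -1. The names order, num_masked, tail and result are introduced here for illustration only.

```python
import numpy as np

a = np.array([[3, 0, 5, 0], [8, 7, 6, 10], [5, 4, 0, 10]])
a_mask = np.ma.array(a, mask=(a == 0))

# endwith=True treats masked entries as the largest values, so they sort last
order = a_mask.argsort(axis=1, endwith=True)

# the last n positions of each sorted row belong to the n masked entries
num_masked = a_mask.mask.sum(axis=1)
tail = np.arange(a.shape[1]) >= (a.shape[1] - num_masked)[:, None]
result = np.where(tail, -1, np.asarray(order))

print(result)
```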
If by "distinguish 0s" you mean treating them as the highest or lowest value, I would suggest trying:
a[a==0]=(a.max()+1)
or:
a[a==0]=(a.min()-1)
One way to achieve the task is to first generate a boolean mask that marks the zero values (since those are what you want to distinguish), then argsort the array, and finally use the boolean mask to set the desired value (e.g., -1):
# your unmodified input array
In [294]: a
Out[294]:
array([[3, 4, 5],
[8, 7, 6],
[5, 4, 0]])
# boolean mask checking for zero
In [295]: zero_bool_mask = a == 0
In [296]: zero_bool_mask
Out[296]:
array([[False, False, False],
[False, False, False],
[False, False, True]])
# usual argsort
In [297]: sorted_idxs = np.argsort(a)
In [298]: sorted_idxs
Out[298]:
array([[0, 1, 2],
[2, 1, 0],
[2, 1, 0]])
# replace the indices of 0 with desired value (e.g., -1)
In [299]: sorted_idxs[zero_bool_mask] = -1
In [300]: sorted_idxs
Out[300]:
array([[ 0, 1, 2],
[ 2, 1, 0],
[ 2, 1, -1]])
Following this, to account for the correct sorting indices after the substitution value (e.g., -1), we have to perform this final step:
In [327]: sorted_idxs - (sorted_idxs == -1).sum(1)[:, None]
Out[327]:
array([[ 0, 1, 2],
[ 2, 1, 0],
[ 1, 0, -2]])
So now the sorted_idxs with negative values are the locations where you had zeros in the original array.
Thus, we can have a custom function like so:
def argsort_excluding_zeros(arr, replacement_value):
    zero_bool_mask = arr == 0
    sorted_idxs = np.argsort(arr)
    sorted_idxs[zero_bool_mask] = replacement_value
    return sorted_idxs - (sorted_idxs == replacement_value).sum(1)[:, None]
# another array
In [339]: a
Out[339]:
array([[0, 4, 5],
[8, 7, 6],
[5, 4, 0]])
# sample run
In [340]: argsort_excluding_zeros(a, replacement_value=-1)
Out[340]:
array([[-2, 0, 1],
[ 2, 1, 0],
[ 1, 0, -2]])
Using @kmario23's and @ScienceSnake's code, I came up with this solution:
a = np.array([[3, 0, 5, 0], [8, 7, 6, 10], [5, 4, 0, 10]])
b = np.where(a == 0, np.inf, a) # Replace 0 -> inf to make them sorted last
s = b.copy() # make a copy of b to sort it
s.sort()
mask = s == np.inf # create a mask to get inf locations after sorting
c = b.argsort()
d = np.where(mask, -1, c) # Replace the sorted positions that hold inf (the original zeros) with -1
Out:
array([[ 0, 2, -1, -1],
[ 2, 1, 0, 3],
[ 1, 0, 3, -1]])
This is not the most efficient solution, because it sorts twice.
There might be a slightly more efficient alternative, but this works in pure numpy and is very transparent.
import numpy as np
a = np.array([[3, 0, 5, 0], [8, 7, 6, 10], [5, 4, 0, 10]])
b = np.where(a == 0, np.inf, a) # Replace 0 -> inf to make them sorted last
c = b.argsort()
d = np.where(a == 0, -1, c) # Replace where the zeros were originally with -1
print(d)
outputs
[[ 0 -1 1 -1]
[ 2 1 0 3]
[ 1 0 -1 2]]
To save memory, some of the in-between assignments can be skipped, but I left it this way for clarity.
*** EDIT ***
The OP has clarified exactly what output they want. This is my new solution which has only one sort.
a = np.array([[3, 0, 5, 0], [8, 7, 6, 10], [5, 4, 0, 10]])
b = np.where(a == 0, np.inf, a).argsort()
def remove_invalid_entries(row, num_valid):
    row[num_valid.pop():] = -1
    return row
num_valid = np.flip(np.count_nonzero(a, 1)).tolist()
b = np.apply_along_axis(remove_invalid_entries, 1, b, num_valid)
print(b)
> [[ 0 2 -1 -1]
[ 2 1 0 3]
[ 1 0 3 -1]]
The start is as before. Then, we go through the argsorted array row by row and replace the last n elements with -1, where n is the number of 0s in the corresponding row of the original array. One way of doing this is with np.apply_along_axis. Here, I counted the nonzero elements in each row of a and turned the counts into a list (in reversed order) so that I can use pop() to get the number of elements to keep in the current row of b being iterated over by np.apply_along_axis.
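np.apply_along_axis calls the Python function once per row, so for a very large array a fully broadcast version may be worth trying. The sketch below is an alternative, not the answer's method: it compares each column position against the per-row count of nonzero entries. It assumes the data contains no genuine inf values; the names order, num_valid, invalid and result are chosen here for illustration.

```python
import numpy as np

a = np.array([[3, 0, 5, 0], [8, 7, 6, 10], [5, 4, 0, 10]])

# Zeros become inf so that they are sorted to the end of each row.
order = np.where(a == 0, np.inf, a).argsort(axis=1)

# Number of entries to keep in each row of the argsorted result.
num_valid = np.count_nonzero(a, axis=1)

# True for every sorted position at or beyond the row's valid count.
invalid = np.arange(a.shape[1]) >= num_valid[:, None]
result = np.where(invalid, -1, order)

print(result)
```

A row that is entirely zero comes out as all -1, which is the check the question wanted to make.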

Array rows whose random-integer elements have different ranges

Consider the following code fragment:
import numpy as np
mask = np.array([True, True, False, True, True, False])
val = np.array([9, 3])
arr = np.random.randint(1, 10, size = (5,len(mask)))  # the upper bound is exclusive
As expected, we get an array of random integers from 1 to 9, with 5 rows and 6 columns, as shown below. The val array has not been used yet.
[[2, 7, 6, 9, 7, 5],
[7, 2, 9, 7, 8, 3],
[9, 1, 3, 5, 7, 3],
[5, 7, 4, 4, 5, 2],
[7, 7, 9, 6, 9, 8]]
Now I'll introduce val = [9, 3].
Where mask = True, I want the row element to be taken randomly from 1 to 9.
Where mask = False, I want the row element to be taken randomly from 1 to 3.
How can this be done efficiently? A sample output is shown below.
[[2, 7, 2, 9, 7, 1],
[7, 2, 1, 7, 8, 3],
[9, 1, 3, 5, 7, 3],
[5, 7, 1, 4, 5, 2],
[7, 7, 2, 6, 9, 1]]
One idea is to sample uniformly between 0 and 1, then multiply by 9 or 3 depending on mask, and finally add 1 to shift the sample.
rand = np.random.rand(5,len(mask))
is3 = (1-mask).astype(int)
# out is random from 0-8 or 0-2 depending on `is3`
out = (rand*val[is3]).astype(int)
# move out by `1`:
out = (out + 1)
Output:
array([[4, 9, 3, 6, 2, 1],
[1, 8, 2, 7, 1, 3],
[8, 2, 1, 2, 3, 2],
[4, 3, 2, 2, 3, 2],
[5, 8, 1, 5, 6, 1]])
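An alternative sketch, assuming a NumPy version that provides np.random.default_rng: Generator.integers accepts array-like bounds that broadcast against size, so the per-column upper limit can be passed directly. The seed and the name high are arbitrary choices made for this example.

```python
import numpy as np

mask = np.array([True, True, False, True, True, False])
val = np.array([9, 3])

rng = np.random.default_rng(0)          # seed chosen arbitrarily for repeatability
high = np.where(mask, val[0], val[1])   # inclusive upper bound for each column
arr = rng.integers(1, high + 1, size=(5, len(mask)))  # upper bound is exclusive

print(arr)
```

This draws integers directly instead of scaling floats, which avoids the separate cast and shift steps.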

Indexing highest value of numpy matrix

I have a numpy array of shape (4, 7) like this:
array([[ 1, 4, 5, 7, 8, 6, 7],
[ 2, 23, 2, 4, 8, 94, 2],
[ 1, 5, 6, 7, 10, 15, 20],
[ 3, 9, 2, 7, 6, 5, 4]])
I would like to get the index of the highest element, i.e. 94, as a row and column pair, here row 1, column 5 (zero-based). Thus the output should be a numpy array ([1,5]) (matlab-style).
You can get the index of the maximum element using arr.argmax(), but to get the actual row and column you must use np.unravel_index, as below:
import numpy as np
arr = np.array([[ 1, 4, 5, 7, 8, 6, 7],
[ 2, 23, 2, 4, 8, 94, 2],
[ 1, 5, 6, 7, 10, 15, 20],
[ 3, 9, 2, 7, 6, 5, 4]])
maximum = np.unravel_index(arr.argmax(), arr.shape)
print(maximum)
# (1, 5)
You have to use np.unravel_index because, by default, np.argmax returns the index into the flattened array (which in your case would be index 12).
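To make the relationship between the flat index and the (row, column) pair concrete, here is a small sketch. For a 2D array in the default row-major layout, the flat index equals row * number_of_columns + column, so divmod recovers the same pair that np.unravel_index returns. Wrapping the result in np.array gives the array form asked for in the question; the names flat and position are illustrative.

```python
import numpy as np

arr = np.array([[ 1,  4,  5,  7,  8,  6,  7],
                [ 2, 23,  2,  4,  8, 94,  2],
                [ 1,  5,  6,  7, 10, 15, 20],
                [ 3,  9,  2,  7,  6,  5,  4]])

flat = arr.argmax()                          # index into the flattened array
row, col = divmod(flat, arr.shape[1])        # manual conversion for a 2D array
position = np.array(np.unravel_index(flat, arr.shape))

print(flat, (row, col), position)
```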

Python 3.x IndexError while using nested For loops

So I've been trying to code a tabletop game that I made a long time ago - I'm working on the graphic section now, and I'm trying to draw the 9x7 tile map using nested For loops:
I'm using the numpy library for my 2d array
gameboard = array( [[8, 8, 8, 7, 7, 7, 8, 8, 8],
[8, 3, 6, 7, 7, 7, 6, 3, 8],
[0, 1, 1, 6, 6, 6, 1, 1, 0],
[0, 5, 4, 0, 0, 0, 4, 5, 0],
[0, 3, 2, 0, 0, 0, 2, 3, 0],
[8, 8, 1, 0, 0, 0, 1, 8, 8],
[8, 8, 8, 6, 6, 6, 8, 8, 8]] )
def mapdraw():
    for x in [0, 1, 2, 3, 4, 5, 6, 7, 8]:
        for y in [0, 1, 2, 3, 4, 5, 6]:
            if gameboard[(x, y)] == 1:
                pass  # insert tile 1 at location
            elif gameboard[(x, y)] == 2:
                pass  # insert tile 2 at location
            elif gameboard[(x, y)] == 3:
                pass  # insert tile 3 at location
            # this continues for all 8 tiles
    # graphics update
When I run this program, I get an error on the line "if gameboard[(x,y)] == 1:"
"IndexError: index (7) out of range (0<=index<7) in dimension 0"
I've looked for hours to find what this error even means, and have tried many different ways to fix it: any help would be appreciated.
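You have to index the array using [y,x] because the first coordinate is the row index (which, for you, is the y index). The array has shape (7, 9), i.e. 7 rows and 9 columns, so x values of 7 and 8 are out of range for the first dimension.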
As an aside, please iterate over a range instead of an explicit list!
for x in range(9):
    for y in range(7):
        if gameboard[y, x] == 1:
            #insert tile 1 at location
        ...
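A minimal sketch of the same loop that takes its bounds from gameboard.shape instead of hard-coded numbers, so the row/column order cannot drift out of sync with the array. The drawing step is not part of the original post; here mapdraw simply collects (x, y, tile) triples as a stand-in for it.

```python
import numpy as np

gameboard = np.array([[8, 8, 8, 7, 7, 7, 8, 8, 8],
                      [8, 3, 6, 7, 7, 7, 6, 3, 8],
                      [0, 1, 1, 6, 6, 6, 1, 1, 0],
                      [0, 5, 4, 0, 0, 0, 4, 5, 0],
                      [0, 3, 2, 0, 0, 0, 2, 3, 0],
                      [8, 8, 1, 0, 0, 0, 1, 8, 8],
                      [8, 8, 8, 6, 6, 6, 8, 8, 8]])

rows, cols = gameboard.shape   # (7, 9): 7 rows (y), 9 columns (x)

def mapdraw():
    """Collect (x, y, tile) triples; a stand-in for the real drawing code."""
    placed = []
    for y in range(rows):
        for x in range(cols):
            tile = int(gameboard[y, x])   # row index first, then column index
            placed.append((x, y, tile))   # hypothetical: draw `tile` at (x, y) here
    return placed

tiles = mapdraw()
print(len(tiles), tiles[:3])
```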
