I've been trying to use Python lists as part of indices into Numpy arrays, and I'm seeing behavior that I don't understand.
The first example:
>>> a=np.zeros((5,5))
>>> a[0,[2,4]]=1
>>> a
array([[0., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
The second example:
>>> a=np.zeros((5,5))
>>> a[[1,3],[2,4]]=1
>>> a
array([[0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 0.]])
In the first example a[0,[2,4]], the first index is a scalar 0, the second a list.
It appears to me that the first index is treated as a specification of row, the second as column, and the first gets broadcast over the second to yield two row/column addresses [0,2] and [0,4].
In the second example a[[1,3],[2,4]], the first index is a list of rows, the second index is a list of columns, and they appear to be combined (broadcast?) to yield two row/column addresses [1,2] and [3,4].
Can someone help me better understand how Numpy array addressing works? I'm not sure what to Google for.
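The behavior described above is what NumPy calls advanced (or "fancy") indexing: the index arrays are broadcast against each other and then paired element by element. A minimal sketch follows; the array `a` and the variable names are made up for illustration, and `np.arange` is used instead of zeros so each element identifies its own position.

```python
import numpy as np

# a[i, j] == 5 * i + j, so every value reveals which element was picked.
a = np.arange(25).reshape(5, 5)

# A scalar and a list are broadcast to a common shape, then paired up.
rows, cols = np.broadcast_arrays(0, [2, 4])
print(rows, cols)                   # rows is [0 0], cols is [2 4]
scalar_and_list = a[0, [2, 4]]      # elements (0, 2) and (0, 4)

# Two lists of equal length are paired element by element.
pairs = a[[1, 3], [2, 4]]           # elements (1, 2) and (3, 4)

# np.ix_ builds an open mesh, giving every row/column combination instead.
block = a[np.ix_([1, 3], [2, 4])]   # 2x2 block: rows 1 and 3, columns 2 and 4

print(scalar_and_list, pairs, block, sep="\n")
```

Useful search terms are "NumPy advanced indexing" and "NumPy broadcasting".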
I want to create an array of size 100k x 100k with numpy.zeros in Google Colab. The session runs out of RAM and crashes whenever I try this. Is there any way to use disk space to create the array instead? I also need to perform operations on it by adding values to it.
import numpy as np
arr = np.zeros((100000,100000))
arr
np.zeros() is trying to allocate 80 GB of RAM (100,000 x 100,000 float64 values at 8 bytes each) to fit the 100k x 100k array, which is not available to your instance. You can try np.empty() instead, which reserves the array without initializing it:
import numpy as np
arr = np.empty((100000,100000))
arr
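which will give you something like the following (np.empty() leaves the contents uninitialized, so the values shown are arbitrary, and memory is still consumed as you write to the array):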
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
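Since the question asks about using disk space, one option worth considering is `np.memmap`, which backs the array with a file instead of RAM. This is only a sketch under assumptions: the file path is a hypothetical temporary location, and a small shape is used so the example runs anywhere. The full 100k x 100k float64 array would need roughly 80 GB of free disk.

```python
import os
import tempfile
import numpy as np

# Hypothetical file location, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "big_array.dat")

# Small stand-in shape; (100000, 100000) needs about 80 GB on disk.
shape = (1000, 1000)

# mode="w+" creates (or overwrites) the file; a new file reads back as zeros.
arr = np.memmap(path, dtype=np.float64, mode="w+", shape=shape)

# The memmap supports the usual in-place NumPy operations.
arr[10, :] += 1.0
arr[:, 20] += 2.0
arr.flush()  # push pending changes to the file

total = float(arr.sum())
print(total, os.path.getsize(path))
```

Working on slices rather than the whole array at once keeps the amount of data held in RAM small.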
I have a black and white image of a triangle (it contains only 0 and 255 pixel values).
I've converted it to a numpy array called myArray.
Currently, I can find the width of the bottom of the triangle (the number of black pixels) by using this code:
width = (max(numpy.where(myArray == 0)[1])) - (min(numpy.where(myArray == 0)[1]))
If the triangle was flipped upside-down, width would then apply to the top of the upside-down triangle.
What I'm trying to do is determine whether the triangle is pointing up or down.
I could do this by finding the first row that contains a black pixel and counting the number of black pixels in that row, calling this firstRow,
and then finding the last row that contains a black pixel and counting the number of black pixels in that row, calling that lastRow.
Then, if firstRow < lastRow, the triangle is pointing up.
What is the best way to calculate firstRow and lastRow?
Given a myArray in which black pixels have the value 255, such as
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 255., 0., 0., 0., 0.],
[ 0., 0., 0., 255., 255., 255., 0., 0., 0.],
[ 0., 0., 255., 255., 255., 255., 255., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
you can get all the rows with at least one black pixel with np.where after using sum on axis=1:
print (np.where(myArray.sum(axis=1)))
(array([3, 4, 5], dtype=int64),)
If you want the row with the maximum number of black pixels, you can use np.argmax, again after sum on axis=1:
print (np.argmax(myArray.sum(axis=1)))
5
To know whether the triangle points up or down, one way is to check whether the argmax (the widest row) equals the np.max of np.where(myArray.sum(axis=1)) (the last non-empty row); if it does, the triangle points up.
myArray_sum = myArray.sum(axis=1)
if np.max(np.where(myArray_sum)) == np.argmax(myArray_sum):
    print ('up')
else:
    print ('down')
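If you want the indices of the first and last rows, here is one way, but it depends on the value of the black pixel and assumes the tip of the triangle is a single black pixel with the widest row at the other end (as in the upward-pointing example above).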
myArray_sum = myArray.sum(axis=1)
firstRow = np.argmax(myArray_sum == 255)
lastRow = np.argmax(myArray_sum)
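A variant that does not rely on the tip being exactly one pixel wide is to count black pixels per row with a boolean mask. This is a sketch, not the answerer's code: the function name `triangle_direction` and its `black` parameter are hypothetical, and it assumes the image contains at least one black pixel.

```python
import numpy as np

def triangle_direction(img, black=255):
    """Return ('up' or 'down', firstRow count, lastRow count).

    `black` is the pixel value treated as black: 255 in the answer's
    example array, 0 in the original question.
    """
    mask = (img == black)
    counts = mask.sum(axis=1)        # black pixels per row
    rows = np.flatnonzero(counts)    # indices of rows with any black pixel
    first_row = int(counts[rows[0]])
    last_row = int(counts[rows[-1]])
    direction = "up" if first_row < last_row else "down"
    return direction, first_row, last_row

# Rebuild the example array from the answer above.
demo = np.zeros((9, 9))
demo[3, 4] = 255
demo[4, 3:6] = 255
demo[5, 2:7] = 255

result = triangle_direction(demo)         # upward-pointing triangle
flipped = triangle_direction(demo[::-1])  # same image flipped vertically
print(result, flipped)
```

Here firstRow and lastRow are pixel counts, as in the question, rather than row indices.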
I'm trying to create an array of numpy arrays, each one with a different dimension.
So far, it seems to be fine. For example, if I run:
np.array([np.zeros((10,3)), np.zeros((11,8))])
the result is:
array([ array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]]),
array([[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.]])], dtype=object)
The dimensions of the two matrices are completely different, and the array is generated without any problem. However, if the first dimension of the two matrices is the same, it no longer works:
np.array([np.zeros((10,3)), np.zeros((10,8))])
Traceback (most recent call last):
File "<ipython-input-123-97301e1424ae>", line 1, in <module>
a=np.array([np.zeros((10,3)), np.zeros((10,8))])
ValueError: could not broadcast input array from shape (10,3) into shape (10)
What is going on?
Thank you!
This has been hashed out before (Why do I get error trying to cast np.array(some_list) ValueError: could not broadcast input array;
Numpy array in array with unequal length). Basically np.array does one of 3 things:
1. make an n-dimensional array of a basic dtype, e.g. float.
2. make an object dtype array.
3. raise an error, saying the first two are not possible.
The second and third alternatives are fallback options, taken only if the first is impossible.
Without digging into the details of how the compiled code works, apparently what happens with
np.array([np.zeros((10,3)), np.zeros((10,8))])
is that it first sees the common first dimension and deduces from that that it can take the first choice. It looks like it initialized a (10,2) array (2 items in your list) and tried to put the first array into the first row, hence the failed attempt to put a (10,3) array into a (10,) slot.
So if you really want an object dtype array, and not hit either the 1st or 3rd cases, you need to do some sort of 'round-about' creation.
PaulP and I have been exploring alternatives in Force numpy to create array of objects
Earlier: How to create a numpy array of lists?
In this question I suggest this iteration:
A=np.empty((3,),dtype=object)
for i,v in enumerate(A): A[i]=[v,i]
or in your case
In [451]: res = np.empty(2, object)
In [452]: alist = [np.zeros((10,3)), np.zeros((10,8))]
In [453]: for i,v in enumerate(alist):
     ...:     res[i] = v
See also: Prevent numpy from creating a multidimensional array
Rather than iterate on alist, it may work to do:
res[:] = alist
It seems to work in most cases that I've tried, but don't be surprised if you get broadcasting errors.
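The element-by-element approach can be wrapped in a small helper. This is a sketch only; the name `make_object_array` is hypothetical and not part of NumPy.

```python
import numpy as np

def make_object_array(arrays):
    """Build a 1-D object array whose elements are the given arrays."""
    out = np.empty(len(arrays), dtype=object)
    # Assigning one element at a time stores each array as a single object
    # and avoids any attempt to broadcast it into the container.
    for i, arr in enumerate(arrays):
        out[i] = arr
    return out

res = make_object_array([np.zeros((10, 3)), np.zeros((10, 8))])
print(res.shape, res.dtype, res[0].shape, res[1].shape)
```

Because the container is created with dtype=object before anything is stored in it, np.array never gets the chance to guess a common shape.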
Consider a small numpy array:
array([[ 0., 1., 0., 1., 0., 0., 0., 0., 0., 1.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 18., 15., 25., 0., 0., 0.],
[ 0., 0., 0., 23., 19., 20., 20., 0., 0., 0.],
[ 0., 0., 20., 22., 26., 23., 18., 0., 0., 0.],
[ 0., 0., 0., 23., 16., 20., 13., 0., 0., 0.],
[ 0., 0., 0., 0., 18., 20., 18., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 1., 0., 0., 0., 0., 0.]])
I would like to plot a section of my numpy array, say from row number 3 to row number 6 (I am coming from a MATLAB background). How could I loop over this? Or how could I plot multiple rows of my numpy array in the same graph?
So far I have tried; I define an arbitrary x:
x = np.arange(0,10)
Then, if I use
plt.plot(x,data[3,:])
to plot row 3, it works fine. The problem arises if I try:
plt.plot(x,data[3:4,:])
I get the error "x and y must have same first dimension", which I understand because it stacks row number 3 and row number 4 together, so that x and y do not have the same dimension. How can I overcome that?
Thank you
As the error implies, the shape of your slice, data[3:4,:].shape = (1,10), is inconsistent with your input x.shape = (10,). To solve this problem you can just transpose your data using .T, i.e.
plt.plot(x, data[3:4,:].T)
Also, keep in mind that data[3:4,:] selects only row 3, the same row as data[3,:] but as a 2-D array of shape (1,10). You will need data[3:5,:] to get rows 3 and 4, for example.
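The shapes involved can be checked directly. In this sketch `data` is a stand-in array with the same 9 x 10 shape as the one in the question, not the original values.

```python
import numpy as np

# Stand-in for the question's array: 9 rows, 10 columns.
data = np.arange(90, dtype=float).reshape(9, 10)
x = np.arange(0, 10)

single = data[3, :]      # integer index drops the axis: shape (10,)
sliced = data[3:4, :]    # slice keeps the axis: shape (1, 10)
block = data[3:7, :].T   # rows 3 to 6, transposed: shape (10, 4)

print(x.shape, single.shape, sliced.shape, block.shape)
```

After the transpose the first dimension matches x, and since matplotlib draws one line per column of a 2-D y, plt.plot(x, block) would plot rows 3 to 6 in a single call.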
Just a better application of psuedocubi's answer.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,10,10)
a = np.array(data)  # data: your own 2-D array, e.g. the one from the question
plt.plot(x,a[3:4].T,'r--',label="x vs y1") # row 3
plt.plot(x,a[4:5].T,'g--',label="x vs y2") # row 4
plt.plot(x,a[5:6].T,'b--',label="x vs y3") # row 5
plt.legend(loc='best')
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Here x has been plotted against your own data!
You can try:
for i in range(3):
    plt.plot( x , data[ i , : ] )
plt.show()
If you want a range of rows, for example from 3 to 6, you can use:
range(3,7,1), where 3 is the starting row, 1 is the step, and 7 is the last row we want to plot (6) plus one.