array in array to array in numpy - arrays

Dear friends on Stack Overflow,
I am having trouble with a calculation using NumPy and SymPy. A is defined by
import numpy as np
import sympy as sym
sym.var('x y')
f = sym.Matrix([0,x,y])
func = sym.lambdify( (x,y), f, "numpy")
X=np.array([1,2,3])
Y=np.array([1,2,3])
A = func(X,Y)
Here, X and Y are just examples. In general, X and Y are one-dimensional numpy arrays of the same length. Then A's output is
array([[0],
[array([1, 2, 3])],
[array([1, 2, 3])]], dtype=object).
But, I'd like to get this as
np.array([[0,0,0],[1,2,3],[1,2,3]]).
If we call this B, how do I convert A to B automatically? B's first row is filled with 0, and it has the same length as X and Y.
Do you have any ideas?

First let's make sure we understand what is happening:
In [52]: x, y = symbols('x y')
In [54]: f = Matrix([0,x,y])
...: func = lambdify( (x,y), f, "numpy")
In [55]: f
Out[55]:
⎡0⎤
⎢ ⎥
⎢x⎥
⎢ ⎥
⎣y⎦
In [56]: print(func.__doc__)
Created with lambdify. Signature:
func(x, y)
Expression:
Matrix([[0], [x], [y]])
Source code:
def _lambdifygenerated(x, y):
    return (array([[0], [x], [y]]))
See how the numpy function looks just like the sympy expression, with sym.Matrix replaced by np.array. lambdify just does a lexical translation; it does not have a deep knowledge of the differences between the languages.
With scalars the func runs as expected:
In [57]: func(1,2)
Out[57]:
array([[0],
[1],
[2]])
With arrays the result is this ragged array (a new enough numpy adds this warning):
In [59]: func(np.array([1,2,3]),np.array([1,2,3]))
<lambdifygenerated-2>:2: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
return (array([[0], [x], [y]]))
Out[59]:
array([[0],
[array([1, 2, 3])],
[array([1, 2, 3])]], dtype=object)
If you don't know numpy, sympy is not a short cut to filling in your knowledge gaps.
The simplest fix is to replace the original 0 with another symbol.
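A minimal sketch of that idea, without sympy: the function below is a hand-written stand-in for what lambdify would presumably generate for Matrix([z, x, y]), modeled on the generated source printed above. The names generated and z are assumptions made for illustration only.

```python
import numpy as np

def generated(z, x, y):
    # Hypothetical stand-in for the lambdify output of Matrix([z, x, y]);
    # it mirrors the array([[0], [x], [y]]) pattern with 0 replaced by z.
    return np.array([[z], [x], [y]])

X = np.array([1, 2, 3])
Y = np.array([1, 2, 3])

# Pass an explicit array of zeros for z, so all three rows have the same length.
A = generated(np.zeros_like(X), X, Y)   # shape (3, 1, 3), no longer ragged
B = A[:, 0, :]                          # drop the middle axis -> shape (3, 3)
print(B)
```

Because every entry now has the same length, numpy builds a regular numeric array instead of an object array, and the extra axis of size 1 can be indexed away.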
Even in sympy, the 0 is not expanded:
In [65]: f.subs({x:Matrix([[1,2,3]]), y:Matrix([[4,5,6]])})
Out[65]:
⎡ 0 ⎤
⎢ ⎥
⎢[1 2 3]⎥
⎢ ⎥
⎣[4 5 6]⎦
In [74]: Matrix([[0,0,0],[1,2,3],[4,5,6]])
Out[74]:
⎡0 0 0⎤
⎢ ⎥
⎢1 2 3⎥
⎢ ⎥
⎣4 5 6⎦
In [75]: Matrix([[0],[1,2,3],[4,5,6]])
...
ValueError: mismatched dimensions
To make the desired array in numpy we have to do something like:
In [71]: arr = np.zeros((3,3), int)
In [72]: arr[1:,:] = [[1,2,3],[4,5,6]]
In [73]: arr
Out[73]:
array([[0, 0, 0],
[1, 2, 3],
[4, 5, 6]])
That is, initialize the array and fill selected rows. There isn't a simple expression that will do the desired 'automatically fill the first row with 0', much less something that can be naively translated from sympy.
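If the rows are already available as separate values on the numpy side, one possible shortcut is to let numpy broadcast the scalar before stacking. This is only a sketch; the helper name stack_rows is made up here, and it bypasses the lambdify function rather than fixing it.

```python
import numpy as np

def stack_rows(*rows):
    # Broadcast scalars and 1-D arrays to a common shape, then stack them as rows.
    return np.stack(np.broadcast_arrays(*rows))

X = np.array([1, 2, 3])
Y = np.array([4, 5, 6])

B = stack_rows(0, X, Y)
print(B)
```

np.broadcast_arrays expands the scalar 0 to the length of X and Y, which is the step the literal translation from sympy never performs.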

Related

Slicing numpy arrays

mean = [0, 0]
cov = [[1, 0], [0, 100]]
gg = np.random.multivariate_normal(mean, cov, size = [5, 12])
I get an array which has 2 columns and 12 rows. I want to take the first column, which includes all 12 rows, and convert it to columns. What is the appropriate method for slicing, and how can one convert the result to columns? To be precise, looking at the screen (the second one), one should take all the column-0 values and lay them out in the normal way, from left to right.
The result should look like this (the first screen).
The problem is that your array gg is not two- but three-dimensional. So, what you need is in fact the first column of each stacked 2D array. Here is an example:
import numpy as np
x = np.random.randint(0, 10, (3, 4, 5))
x[:, :, 0].flatten()
The colon in slicing means "all values in this dimension". So, x[:, :, 0] means "all values in the first dimension and all values in the second dimension, with the third dimension fixed at index 0". This results in a two-dimensional array, which you then have to flatten.
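Applied to an array built the same way as gg in the question, the slicing might look like the sketch below. The seeded generator is an assumption added only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only for repeatability
mean = [0, 0]
cov = [[1, 0], [0, 100]]
gg = rng.multivariate_normal(mean, cov, size=(5, 12))

print(gg.shape)          # three axes: 5 blocks, 12 rows, 2 columns

first = gg[:, :, 0]      # first column of every block
flat = first.flatten()   # all of those values as one 1-D array
print(first.shape, flat.shape)
```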

Selecting numpy array elements

I have the task of selecting p% of elements within a given numpy array. For example,
# Initialize 5 x 3 array-
x = np.random.randint(low = -10, high = 10, size = (5, 3))
x
'''
array([[-4, -8, 3],
[-9, -1, 5],
[ 9, 1, 1],
[-1, -1, -5],
[-1, -4, -1]])
'''
Now, I want to select say p = 30% of the numbers in x, so 30% of numbers in x is 5 (rounded up).
Is there a way to select these 30% of numbers in x? Where p can change and the dimensionality of numpy array x can be 3-D or maybe more.
I am using Python 3.7 and numpy 1.18.1
Thanks
You can use np.random.choice to sample without replacement from a 1d numpy array:
p = 0.3
np.random.choice(x.flatten(), int(x.size * p) , replace=False)
For large arrays, the performance of sampling without replacement can be pretty bad, but there are some workarounds.
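A variant of the same idea that rounds the count up, as the question asks, and works for any number of dimensions could be sketched as follows. The function name sample_fraction is hypothetical, and sampling flat indices instead of values is a choice made here, not part of the answer above.

```python
import math
import numpy as np

def sample_fraction(arr, p, rng):
    # Number of elements to draw, rounded up (15 * 0.3 -> 5).
    k = math.ceil(arr.size * p)
    # Draw k distinct flat indices, then look them up in the flattened array.
    idx = rng.choice(arr.size, size=k, replace=False)
    return arr.ravel()[idx]

rng = np.random.default_rng(0)
x = np.arange(15).reshape(5, 3)
picked = sample_fraction(x, 0.3, rng)
print(picked)
```

Sampling indices rather than values also makes it possible to write back into the original array at the chosen positions.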
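You can make a random choice of 0 and 1 for each element and use np.nonzero for indexing. This picks each element independently with probability 0.3, so the number selected is 30% only on average: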
np.random.seed(1)
x[np.nonzero(np.random.choice([1, 0], size=x.shape, p=[0.3,0.7]))]
Output:
array([ 3, -1, 5, 9, -1, -1])
I found a way of selecting p% of numpy elements:
p = 20
# To select p% of elements-
x_abs[x_abs < np.percentile(x_abs, p)]
# To select p% of elements and set them to a value (in this case, zero)-
x_abs[x_abs < np.percentile(x_abs, p)] = 0
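Note that the percentile approach selects the smallest p% of values, not a random p%. A small sketch, assuming x_abs is meant to be np.abs(x) (the snippet above does not define it):

```python
import numpy as np

x = np.array([-7, 2, -9, 4, 1, -10, 3, 8, -6, 5])
x_abs = np.abs(x)          # assumption: x_abs holds the absolute values of x

p = 20
threshold = np.percentile(x_abs, p)
mask = x_abs < threshold   # True for the smallest p% of magnitudes

selected = x_abs[mask]
print(threshold, selected)
```

With many tied values the count can fall short of p%, because elements equal to the threshold are excluded by the strict comparison.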

How does numpy determine the dimensions of a column vector?

I'm starting out with numpy and was trying to figure out how its arrays work for column vectors. Defining the following:
x1 = np.array([3.0, 2.0, 1.0])
x2 = np.array([-2.0, 1.0, 0.0])
And calling
print("inner product x1/x2: ", np.inner(x1, x2))
Produces inner product x1/x2: -4.0 as expected. This made me think that numpy assumes an array of this form is a column vector and, as part of the inner function, transposes one of them to give a scalar. However, I wrote some code to test this idea and it gave some results that I don't understand.
After doing some googling about how to specify that an array is a column vector using .T I defined the following:
x = np.array([1, 0]).T
xT = np.array([1, 0])
Where I intended for x to be a column vector and xT to be a row vector. However, calling the following:
print(x)
print(x.shape)
print(xT)
print(xT.shape)
Produces this:
[1 0]
(2,)
[1 0]
(2,)
Which suggests the two arrays have the same dimensions, despite one being the transpose of the other. Furthermore, calling both np.inner(x,x) and np.inner(x,xT) produces the same result. Am I misunderstanding the .T function, or perhaps some fundamental feature of numpy/linear algebra? I don't feel like x & xT should be the same vector.
Finally, the reason I initially used .T was because trying to define a column vector as x = np.array([[1], [0]]) and calling print(np.inner(x, x)) produced the following as the inner product:
[[1 0]
[0 0]]
Which is the output you'd expect to see for the outer product. Am I misusing this way of defining a column vector?
Look at the inner docs:
Ordinary inner product of vectors for 1-D arrays
...
np.inner(a, b) = sum(a[:]*b[:])
With your sample arrays:
In [374]: x1 = np.array([3.0, 2.0, 1.0])
...: x2 = np.array([-2.0, 1.0, 0.0])
In [375]: x1*x2
Out[375]: array([-6., 2., 0.])
In [376]: np.sum(x1*x2)
Out[376]: -4.0
In [377]: np.inner(x1,x2)
Out[377]: -4.0
In [378]: np.dot(x1,x2)
Out[378]: -4.0
In [379]: x1@x2
Out[379]: -4.0
From the wiki for dot/scalar/inner product:
https://en.wikipedia.org/wiki/Dot_product
two equal-length sequences of numbers (usually coordinate vectors) and returns a single number
If vectors are identified with row matrices, the dot product can also
be written as a matrix product
Coming from a linear algebra world, it is easy to think of everything in terms of matrices (2d) and vectors, which are 1-row or 1-column matrices. MATLAB/Octave works in that framework. But numpy is more general, with arrays of 0 or more dimensions, not just 2.
np.transpose does not add dimensions, it just permutes the existing ones. Hence x1.T does not change anything.
A column vector can be made with np.array([[1], [0]]) or:
In [381]: x1
Out[381]: array([3., 2., 1.])
In [382]: x1[:,None]
Out[382]:
array([[3.],
[2.],
[1.]])
In [383]: x1.reshape(3,1)
Out[383]:
array([[3.],
[2.],
[1.]])
The np.inner docs describe what happens when the inputs are not 1d, such as your 2d (2,1) shape x. They say it uses np.tensordot, which is a generalization of np.dot, the matrix product.
In [386]: x = np.array([[1],[0]])
In [387]: x
Out[387]:
array([[1],
[0]])
In [388]: np.inner(x,x)
Out[388]:
array([[1, 0],
[0, 0]])
In [389]: np.dot(x,x.T)
Out[389]:
array([[1, 0],
[0, 0]])
In [390]: x*x.T
Out[390]:
array([[1, 0],
[0, 0]])
This is the elementwise product of (2,1) and (1,2) resulting in a (2,2), or outer product.
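To tie the pieces together, here is a short sketch that contrasts the 1-D case with explicit 2-D row and column vectors. The variable names are illustrative only.

```python
import numpy as np

v = np.array([1, 0])
print(v.T.shape)        # transposing a 1-D array leaves the shape unchanged

col = v[:, None]        # explicit column vector, shape (2, 1)
row = col.T             # explicit row vector, shape (1, 2)

inner = row @ col       # (1, 2) @ (2, 1) -> (1, 1)
outer = col @ row       # (2, 1) @ (1, 2) -> (2, 2)
print(inner.shape, outer.shape)
```

With 2-D operands the order of the factors decides whether the matrix product is an inner or an outer product, which is the distinction a 1-D array cannot express.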

what does numpy ndarray shape do?

I have a simple question about the .shape attribute, which confused me a lot.
a = np.array([1, 2, 3]) # Create a rank 1 array
print(type(a)) # Prints "<class 'numpy.ndarray'>"
print(a.shape) # Prints "(3,)"
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
print(b.shape) # Prints "(2, 3)"
What exactly does .shape do? If it counts how many rows and how many columns there are,
then a.shape is supposed to be (1,3), one row and three columns, right?
yourarray.shape, np.shape() or np.ma.shape() returns the shape of your ndarray as a tuple, and you can get the number of dimensions of your array using yourarray.ndim or np.ndim() (i.e. it gives the n of the ndarray, since all arrays in NumPy are just n-dimensional arrays, called ndarrays for short).
For a 1D array, the shape would be (n,) where n is the number of elements in your array.
For a 2D array, the shape would be (n,m) where n is the number of rows and m is the number of columns in your array.
Please note that in the 1D case, the shape is simply (n,) rather than (1, n) or (n, 1), which are the shapes of 2D row and column vectors respectively.
This is to follow the convention that:
For 1D array, return a shape tuple with only 1 element (i.e. (n,))
For 2D array, return a shape tuple with only 2 elements (i.e. (n,m))
For 3D array, return a shape tuple with only 3 elements (i.e. (n,m,k))
For 4D array, return a shape tuple with only 4 elements (i.e. (n,m,k,j))
and so on.
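The convention above can be checked directly. This is a small sketch with arbitrarily chosen dimensions:

```python
import numpy as np

shapes = [(3,), (2, 3), (2, 3, 4), (2, 3, 4, 5)]
results = []
for dims in shapes:
    a = np.zeros(dims)
    # The shape tuple always has exactly ndim entries.
    results.append((a.ndim, len(a.shape), a.size))
    print(a.ndim, a.shape, a.size)
```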
Also, please see the example below to see how np.shape() or np.ma.shape() behaves with 1D arrays and scalars:
# sample array
In [10]: u = np.arange(10)
# get its shape
In [11]: np.shape(u) # u.shape
Out[11]: (10,)
# get array dimension using `np.ndim`
In [12]: np.ndim(u)
Out[12]: 1
In [13]: np.shape(np.mean(u))
Out[13]: () # empty tuple (to indicate that a scalar is a 0D array).
# check using `numpy.ndim`
In [14]: np.ndim(np.mean(u))
Out[14]: 0
P.S.: So, the shape tuple is consistent with our understanding of dimensions of space, at least mathematically.
Unlike its most popular commercial competitor, numpy pretty much from the outset is about "arbitrary-dimensional" arrays; that's why the core class is called ndarray. You can check the dimensionality of a numpy array using the .ndim property. The .shape property is a tuple of length .ndim containing the length of each dimension. At the time of writing, numpy can handle up to 32 dimensions (NumPy 2.0 and later raise this limit to 64):
a = np.ones(32*(1,))
a
# array([[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[ 1.]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]])
a.shape
# (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
a.ndim
# 32
If a numpy array happens to be 2d like your second example, then it's appropriate to think about it in terms of rows and columns. But a 1d array in numpy is truly 1d, no rows or columns.
If you want something like a row or column vector you can achieve this by creating a 2d array with one of its dimensions equal to 1.
a = np.array([[1,2,3]]) # a 'row vector'
b = np.array([[1],[2],[3]]) # a 'column vector'
# or if you don't want to type so many brackets:
b = np.array([[1,2,3]]).T
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
.shape gives the actual shape of your array, i.e. the number of elements along each dimension (number of rows, number of columns, and so on).
The answer you get is in the form of a tuple.
For example:
1D array:
d=np.array([1,2,3,4])
print(d.shape)
Output: (4,)
i.e. the number 4 denotes the number of elements in the 1D array.
2D array:
e=np.array([[1,2,3],[4,5,6]])
print(e.shape)
Output: (2,3), i.e. the number of rows and the number of columns.
The number of entries in the shape tuple equals the number of dimensions of the array, so it grows by one with each added dimension.

Read one specific cell of numpy matrix

I wrote a function to see if a matrix is symmetric or not:
def issymmetric(mat):
    if(mat.shape[0]!=mat.shape[1]):
        return 0
    for i in range(mat.shape[0]):
        for j in range(i):
            if (mat[i][j]!=mat[j][i]):
                return 0
    return 1
It works well with built-in ndarrays e.g. numpy.ones:
import numpy as np
a=np.ones((5,5), int)
print issymmetric(a)
And with numpy arrays:
import numpy as np
a=np.array([[1, 2, 3], [2, 1 , 2], [3, 2, 1]])
print issymmetric(a)
But when it comes to numpy matrices:
import numpy as np
a=np.matrix([[1, 2, 3], [2, 1 , 2], [3, 2, 1]])
print issymmetric(a)
It gives me this error:
File "issymetry.py", line 9, in issymmetric
if (mat[i][j]!=mat[j][i]):
File "/usr/lib/python2.7/dist-packages/numpy/matrixlib/defmatrix.py", line 316, in __getitem__
out = N.ndarray.__getitem__(self, index)
IndexError: index 1 is out of bounds for axis 0 with size 1
shell returned 1
That's because there is no a[0][1].
a[0] is matrix([[1, 2, 3]]), and a[0][0] is matrix([[1, 2, 3]]) too, but there is no a[0][1].
How can I fix this issue, without changing the matrix type, or the function?
In general, what is the proper way to read and update one specific cell of a numpy matrix?
It is best to use [i,j] style indexing in numpy. Often you can get by with [i][j] when using np.array, but not with np.matrix. Remember an np.matrix is always 2d.
In a shell construct a simple 2d array, and try different methods of indexing. Now try it with np.matrix arrays. Pay attention to the shape.
In [2]: A = np.arange(6).reshape(2,3)
In [3]: A[1] # short for A[1,:]
Out[3]: array([3, 4, 5]) # shape (3,)
In [4]: A[1][2] # short for A[1,:][2]
Out[4]: 5
In [5]: M=np.matrix(A)
In [6]: M[1]
Out[6]: matrix([[3, 4, 5]]) # shape (1,3), 2d
In [7]: M[1][2]
...
IndexError: index 2 is out of bounds for axis 0 with size 1
Correct indexing that works with both:
In [9]: A[1,2]
Out[9]: 5
In [10]: M[1,2]
Out[10]: 5
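Applied to the function from the question, the change might look like the sketch below. It is written for Python 3 and renamed is_symmetric to keep it apart from the original; it is one possible rewrite, not the only fix.

```python
import numpy as np

def is_symmetric(mat):
    rows, cols = mat.shape
    if rows != cols:
        return False
    for i in range(rows):
        for j in range(i):
            # [i, j] indexing returns a scalar for both ndarray and np.matrix.
            if mat[i, j] != mat[j, i]:
                return False
    return True

sym_array = np.array([[1, 2, 3], [2, 1, 2], [3, 2, 1]])
sym_matrix = np.matrix(sym_array)

print(is_symmetric(sym_array), is_symmetric(sym_matrix))
```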
A[i][j]=... is also prone to failure when used on the LHS. It only works if the first part, A[i], returns a view. It fails if it produces a copy.
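A small sketch of the view-versus-copy difference, using a list index to force a copy. The values assigned are arbitrary.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# A[1] is a view, so chained assignment reaches the original array.
A[1][2] = 50

# A[rows] uses a list index, which produces a copy; the assignment is lost.
rows = [0, 1]
A[rows][0] = -1
unchanged_first_row = A[0].tolist()

# Combining both indices in one [..., ...] always writes to A itself.
A[rows, 0] = -1
print(A)
```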

Resources