Generating nested NumPy arrays

I'm trying to write a function that will take as input a numpy array in the form:
a = [[0,0], [10,0], [10,10], [5,4]]
and return a numpy array b such that:
b = [[[0,0]], [[10,0]], [[10,10]], [[5,4]]]
For some reason I'm finding this surprisingly difficult.
The reason I'm doing this is that I have some contours generated using skimage, and I'm attempting to use OpenCV (cv2) on them to calculate features (area, perimeter, etc.), but the OpenCV functions will only take arrays in the form of b as input, rather than a.

a has shape (4,2); b has shape (4,1,2). Any of the following works:
a.reshape(4,1,2)
np.expand_dims(a, 1)
a[:,None]
In [503]: B
Out[503]:
array([[[ 0,  0]],
       [[10,  0]],
       [[10, 10]],
       [[ 5,  4]]])
In [504]: B.tolist()
Out[504]: [[[0, 0]], [[10, 0]], [[10, 10]], [[5, 4]]]
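A minimal sketch of the three approaches above, using the a from the question. The float32 cast at the end is an assumption, not something stated in the thread: it is only there because contours coming from another library may not have the dtype the consuming function expects, so adjust or drop it as needed.

```python
import numpy as np

a = np.array([[0, 0], [10, 0], [10, 10], [5, 4]])   # shape (4, 2)

b_reshape = a.reshape(-1, 1, 2)   # -1 lets NumPy infer the number of points
b_expand = np.expand_dims(a, 1)   # insert a new length-1 axis at position 1
b_index = a[:, None]              # None is an alias for np.newaxis

# Assumption: the consumer wants float32 points; change the dtype if it does not.
b_float = b_reshape.astype(np.float32)

print(b_reshape.shape)
print(b_reshape.tolist())
```

All three results are views of a with identical contents, so the choice between them is mostly a matter of taste.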

Related

How to organize list of list of lists to be compatible with scipy.optimize fmin init array

I am very much an amateur when it comes to scipy. I am trying to use scipy's fmin function on a system with several variables. For the sake of simplicity I am using a list of lists of lists. My data has 12 values; np.shape(DATA) returns (3,2,2). I am not even sure scipy can handle that many dimensions; if not, no problem, I can reduce them. The point is that optimize.fmin() doesn't accept nested lists as the x0 initial parameters, so I need help rewriting either the x0 array into a numpy-compatible one or the entire DATA array into a 12-element array or something like that.
Here is a simpler example illustrating the issue:
from scipy import optimize
import numpy as np
def f(x): return(x[0][0]*1.5-x[0][1]*2.0+x[1][0]*2.5-x[1][1]*3.0)
result = optimize.fmin(f,[[0.1,0.1],[0.1,0.1]])
print(result)
It gives an error saying "invalid index to scalar variable", which probably comes from fmin not understanding the [[],[]] list-of-lists structure, so it probably only understands numpy array formats.
So how do I rewrite this to make it work, including for my (3,2,2)-shaped list of lists?
scipy.optimize.fmin needs the initial guess for the function parameters to be a 1D array with a number of elements that suits the function to optimize. In your case, maybe you can use flatten and reshape if you just need the output to be in the same shape as your input parameters. An example based on your illustration code:
from scipy import optimize
import numpy as np
def f(x):
    return x[0]*1.5-x[1]*2.0+x[2]*2.5-x[3]*3.0
guess = np.array([[0.1, 0.1],
                  [0.1, 0.1]]) # guess.shape is (2,2)
out = optimize.fmin(f, guess.flatten()) # flatten upon input
# out.shape is (4,)
# reshape output according to guess
out = out.reshape(guess.shape) # out.shape is (2,2) again
or out = optimize.fmin(f, guess.flatten()).reshape(guess.shape) in one line. Note that this also works for a 3-dimensional array as you propose:
guess = np.arange(12).reshape(3,2,2)
# array([[[ 0,  1],
#         [ 2,  3]],
#
#        [[ 4,  5],
#         [ 6,  7]],
#
#        [[ 8,  9],
#         [10, 11]]])
guess = guess.flatten()
# array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
guess = guess.reshape(3,2,2)
# array([[[ 0,  1],
#         [ 2,  3]],
#
#        [[ 4,  5],
#         [ 6,  7]],
#
#        [[ 8,  9],
#         [10, 11]]])
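If you would rather keep the nested indexing from the question inside f, one option is to reshape inside a small wrapper instead of rewriting f. This is only a sketch: make_flat_objective is a hypothetical helper name, not part of scipy.

```python
import numpy as np

def make_flat_objective(func, shape):
    """Wrap func (which expects an array of `shape`) so it accepts a 1D vector."""
    def flat_func(x_flat):
        return func(np.asarray(x_flat).reshape(shape))
    return flat_func

def f(x):
    # Same nested indexing as in the question, applied to a (2, 2) array.
    return x[0][0]*1.5 - x[0][1]*2.0 + x[1][0]*2.5 - x[1][1]*3.0

guess = np.array([[0.1, 0.1],
                  [0.1, 0.1]])
flat_f = make_flat_objective(f, guess.shape)

# The wrapped function now works on the flattened guess.
value = flat_f(guess.ravel())
print(value)
```

flat_f and guess.ravel() can then be handed to optimize.fmin in place of f and guess.flatten(), and the result reshaped with guess.shape as shown above.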

Axes Swapping in higher degree numpy arrays

So I was embarking on a mission to figure out how the numpy swapaxes function operates and reached a sort of a roadblock when it came to swapping axes in arrays of dimensions > 3.
Say
import numpy as np
array=np.arange(24).reshape(3,2,2,2)
This would create a numpy array of shape (3,2,2,2) with elements 0-23. Can someone explain to me how exactly axes swapping works in this case, where we cannot visualise the four axes separately?
Say I want to swap axes 0 and 2.
array.swapaxes(0,2)
It would be great if someone could actually describe the abstract swapping which is occurring when there are 4 or more axes. Thanks!
How do you 'describe' a 4d array? We don't have intuitions to match; the best we can do is project from 2d experience. rows, cols, planes, ??
This array is small enough to show the actual print:
In [271]: arr = np.arange(24).reshape(3,2,2,2)
In [272]: arr
Out[272]:
array([[[[ 0,  1],
         [ 2,  3]],

        [[ 4,  5],
         [ 6,  7]]],


       [[[ 8,  9],
         [10, 11]],

        [[12, 13],
         [14, 15]]],


       [[[16, 17],
         [18, 19]],

        [[20, 21],
         [22, 23]]]])
The print marks the higher dimensions with extra [] and blank lines.
In [273]: arr.swapaxes(0,2)
Out[273]:
array([[[[ 0,  1],
         [ 8,  9],
         [16, 17]],

        [[ 4,  5],
         [12, 13],
         [20, 21]]],


       [[[ 2,  3],
         [10, 11],
         [18, 19]],

        [[ 6,  7],
         [14, 15],
         [22, 23]]]])
To see what's actually being done, we have to look at the underlying properties of the arrays
In [274]: arr.__array_interface__
Out[274]:
{'data': (188452024, False),
'descr': [('', '<i4')],
'shape': (3, 2, 2, 2),
'strides': None, # arr.strides = (32, 16, 8, 4)
'typestr': '<i4',
'version': 3}
In [275]: arr.swapaxes(0,2).__array_interface__
Out[275]:
{'data': (188452024, False),
'descr': [('', '<i4')],
'shape': (2, 2, 3, 2),
'strides': (8, 16, 32, 4),
'typestr': '<i4',
'version': 3}
The data attributes are the same - the swap is a view, sharing data buffer with the original. So no numbers are moved around.
The shape change is obvious; that's what we told it to swap. Sometimes it helps to make all dimensions different, e.g. (2,3,4).
It has also swapped two of the strides values, though how that affects the display is harder to explain. We have to know something about how shape and strides work together to create a multidimensional array (from a flat data buffer).
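To make the shape/strides relationship concrete, here is a small sketch that looks up an element of the swapped view by hand. The dtype is fixed to int32 so the strides match the 4-byte values shown above (the default integer size varies by platform), and element_via_strides is a hypothetical helper written for this illustration.

```python
import numpy as np

arr = np.arange(24, dtype=np.int32).reshape(3, 2, 2, 2)
swapped = arr.swapaxes(0, 2)

print(arr.strides)      # (32, 16, 8, 4)
print(swapped.strides)  # (8, 16, 32, 4)

def element_via_strides(view, base, index):
    """Look up view[index] by hand: byte offset = sum(index[k] * strides[k])."""
    byte_offset = sum(i * s for i, s in zip(index, view.strides))
    # base is C-contiguous, so ravel() exposes the flat data buffer in order.
    return base.ravel()[byte_offset // base.itemsize]

index = (1, 0, 2, 1)
by_hand = element_via_strides(swapped, arr, index)

# Swapping axes 0 and 2 just swaps the roles of the first and third index.
print(by_hand, swapped[index], arr[2, 0, 1, 1])
```

In other words, swapped[i, j, k, l] reads the same memory location as arr[k, j, i, l]; only the bookkeeping (shape and strides) differs between the two arrays.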

Assign 1 and 0 values to numpy array depending on whether values are in list

I am looking for a way to filter numpy arrays based on a list
input_array = [[0,4,6],[2,1,1],[6,6,9]]
list=[9,4]
...
output_array = [[0,1,0],[0,0,0],[0,0,1]]
I am currently flattening the array, and turning it to a list and back. Looks very unpythonic:
list=[9,4]
shape = input_array.shape
input_array = input_array.flatten()
output_array = np.array([int(i in list) for i in input_array])
output_array = output_array.reshape(shape)
We could use np.in1d to get the mask of matches. Now, np.in1d flattens the input to 1D before processing. So, the output from it is to be reshaped back to 2D and then converted to int for an output with 0s and 1s.
Thus, the implementation would be -
np.in1d(input_array, list).reshape(input_array.shape).astype(int)
Sample run -
In [40]: input_array
Out[40]:
array([[0, 4, 6],
       [2, 1, 1],
       [6, 6, 9]])
In [41]: list=[9,4]
In [42]: np.in1d(input_array, list).reshape(input_array.shape).astype(int)
Out[42]:
array([[0, 1, 0],
       [0, 0, 0],
       [0, 0, 1]])
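As an alternative sketch, np.isin (which the NumPy documentation recommends over np.in1d for new code) keeps the shape of its first argument, so the reshape step is not needed. The name values is used here instead of list only to avoid shadowing the Python built-in.

```python
import numpy as np

input_array = np.array([[0, 4, 6],
                        [2, 1, 1],
                        [6, 6, 9]])
values = [9, 4]   # renamed from `list` to avoid shadowing the built-in

# np.isin returns a boolean mask with the same shape as input_array.
output_array = np.isin(input_array, values).astype(int)
print(output_array)
```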

numpy using multidimensional index array on another multidimensional array

I have two multidimensional arrays, and I'd like to use one as the index to produce a new multidimensional array. For example:
a = array([[4, 3, 2, 5],
           [7, 8, 6, 8],
           [3, 1, 5, 6]])
b = array([[0,2],[1,1],[3,1]])
I want to use the first row of b to pick the elements at those positions from the first row of a, and so on for each row. So I want the output to be:
array([[4,2],[8,8],[6,1]])
This is probably simple but I couldn't find an answer by searching. Thanks.
This is a little tricky, but the following will do it:
>>> a[np.arange(3)[:, np.newaxis], b]
array([[4, 2],
       [8, 8],
       [6, 1]])
You need to index both the rows and the columns of the a array, so to match your b array you would need an array like this:
rows = np.array([[0, 0],
                 [1, 1],
                 [2, 2]])
And then a[rows, b] would clearly return what you are after. You can get the same result relying on broadcasting as above, replacing the rows array with np.arange(3)[:, np.newaxis], which is equivalent to np.arange(3).reshape(3, 1).
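For comparison, here is a short sketch that runs both forms of the indexing described above next to np.take_along_axis, which is not mentioned in the answer but covers the same "pick per-row column indices" case in NumPy 1.15 and later.

```python
import numpy as np

a = np.array([[4, 3, 2, 5],
              [7, 8, 6, 8],
              [3, 1, 5, 6]])
b = np.array([[0, 2], [1, 1], [3, 1]])

# Explicit row indices, one per entry of b.
rows = np.array([[0, 0],
                 [1, 1],
                 [2, 2]])
explicit = a[rows, b]

# Same thing via broadcasting a (3, 1) column of row numbers against b.
broadcast = a[np.arange(a.shape[0])[:, np.newaxis], b]

# Alternative (assumes NumPy >= 1.15): pick along axis 1 using b.
along_axis = np.take_along_axis(a, b, axis=1)

print(along_axis)
```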

How to extract lines in an array, which contain a certain value? (numpy, scipy)

I have a 2D numpy array and I want to get the values in column c for every row r where the entry at (r, c-1) equals a certain value (an int n).
I don't want to iterate over the rows writing something like
result = []
for r in range(array.shape[0]):
    if array[r, c-1] == 1:
        result.append(array[r, c])
because there are 4000 of them and this 2D array is just one of 20 I have to look through.
I found "filter" but don't know how to use it (found no docs).
Is there a function that provides such a search?
I hope I understood your question correctly. Let's say you have an array a
a = np.array(list(range(7))*3).reshape(7, 3)
print(a)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 0, 1],
       [2, 3, 4],
       [5, 6, 0],
       [1, 2, 3],
       [4, 5, 6]])
and you want to extract all lines where the first entry is 2. This can be done like this:
print(a[a[:,0] == 2])
array([[2, 3, 4]])
a[:,0] denotes the first column of the array, == 2 returns a Boolean array marking the entries that match, and then we use advanced indexing to extract the respective rows.
Of course, NumPy needs to iterate over all entries, but this will be much faster than doing it in Python.
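Applied to the question as asked (values of column c where column c-1 equals n), the same boolean mask can be combined with a column index. This is a sketch; the values chosen for c and n are arbitrary and only for illustration.

```python
import numpy as np

a = np.array(list(range(7)) * 3).reshape(7, 3)
c, n = 1, 2   # hypothetical values: test column c-1 against n, return column c

mask = a[:, c - 1] == n             # one boolean per row
matches = a[mask, c]                # entries of column c in the matching rows
row_numbers = np.flatnonzero(mask)  # which rows matched, if that is needed

print(matches, row_numbers)
```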
NumPy arrays are not indexed, in the sense that there is no lookup structure for finding values. If you need to perform this specific operation more efficiently than linear in the array size, then you need to use something other than numpy.

Resources