I have an array which looks like this:
import numpy as np
ar = np.array([[[1,2,3], [4,5,6], [7,8,9]],
[[10,11,12], [13,14,15], [16,17,18]]])
print(ar)
Output:
[[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]]
[[10 11 12]
[13 14 15]
[16 17 18]]]
Is there a simple way to transform it to this:
[[ 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18]]
Edit
With the help of Vinay I was able to do this:
ar = np.reshape(ar, (len(ar), -1))
You can use the reshape method, passing -1 so that NumPy infers the length of the second axis:
>>> import numpy as np
>>> ar = np.array([[[1,2,3], [4,5,6], [7,8,9]],
... [[10,11,12], [13,14,15], [16,17,18]]])
>>> np.reshape(ar, (2,-1))
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18]])
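As a minimal sketch of the same idea for any number of leading blocks, the first axis can be kept as-is and every remaining axis collapsed into one. The helper name `flatten_trailing` is hypothetical and only used for illustration here.

```python
import numpy as np

def flatten_trailing(arr):
    """Collapse every axis after the first into a single axis."""
    # -1 asks NumPy to infer the second dimension from the total size.
    return arr.reshape(arr.shape[0], -1)

# Same values as the array in the question: 1..18 arranged as (2, 3, 3).
ar = np.arange(1, 19).reshape(2, 3, 3)
flat = flatten_trailing(ar)
print(flat.shape)
```

Using `arr.shape[0]` instead of a hard-coded 2 is equivalent to the `len(ar)` version from the question's edit.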
If the data are not already sorted, you can apply np.sort after the reshape that Vinay recommended. By default np.sort works along the last axis, so each row is sorted independently.
import numpy as np
ar = np.array([[[1,2,3], [4,5,6], [7,8,9]],
[[10,11,12], [13,14,15], [16,17,18]]])
ar = np.sort(np.reshape(ar, (2,9)))
print(ar)
Related
I have an array with the shape (10000,6). For example:
a = np.array([[5, 5, 5, 5, 5, 5], [10, 10, 10, 10, 10], [15, 15, 15, 15, 15], ...])
I want to take every 25th array and subtract its element values from the next 25 elements, until a new subtraction array is selected. For example, if the first array is:
[10, 10, 10, 10, 10]
then these values should be subtracted from the array itself and from the next 25 arrays, until a new subtraction array such as this one is selected:
[2, 2, 2, 2, 2]
Then that array's values should be subtracted from the array itself and from the following 25 elements.
This means that after the operation every 25th array will be:
[0, 0, 0, 0, 0]
because it has been subtracted by itself.
Here's what I would do:
import numpy as np
arr = np.random.randint(0, 10, (9, 3))
group_size = 3
# select the vectors you want to subtract and copy each of them group_size times
selected = arr[::group_size].repeat(group_size, axis=0)
# subtract selected vectors from all vectors in the group
sub_arr = arr-selected
output:
arr =
[[9 6 3]
[8 3 3]
[2 0 4]
[0 3 9]
[3 9 9]
[0 8 6]
[4 0 0]
[6 1 9]
[2 6 4]]
selected =
[[9 6 3]
[9 6 3]
[9 6 3]
[0 3 9]
[0 3 9]
[0 3 9]
[4 0 0]
[4 0 0]
[4 0 0]]
sub_arr =
[[ 0 0 0]
[-1 -3 0]
[-7 -6 1]
[ 0 0 0]
[ 3 6 0]
[ 0 5 -3]
[ 0 0 0]
[ 2 1 9]
[-2 6 4]]
You can reshape your array so that each chunk has the right number of rows, and then simply subtract the first row of each chunk:
import numpy as np
a = np.arange(10000)[:, None] * np.ones(6)
a = a.reshape(-1, 25, 6)
a -= a[:, 0, :][:, None, :]
a = a.reshape(-1, 6)
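The two answers above can be checked against each other. The following is a small sketch, assuming the number of rows is an exact multiple of the group size; the function name `subtract_group_leader` is made up for this example.

```python
import numpy as np

def subtract_group_leader(arr, group_size):
    """Subtract the first row of each group from every row in that group."""
    n_rows, n_cols = arr.shape
    if n_rows % group_size != 0:
        # Assumption of this sketch: groups must be complete.
        raise ValueError("number of rows must be a multiple of group_size")
    groups = arr.reshape(-1, group_size, n_cols)
    # groups[:, :1, :] keeps a length-1 middle axis, so it broadcasts
    # across all rows of its own group.
    return (groups - groups[:, :1, :]).reshape(n_rows, n_cols)

rng = np.random.default_rng(0)
arr = rng.integers(0, 10, size=(9, 3))

via_reshape = subtract_group_leader(arr, 3)
via_repeat = arr - arr[::3].repeat(3, axis=0)
print(np.array_equal(via_reshape, via_repeat))
```

Unlike the in-place `a -= ...` version, this sketch returns a new array and leaves the input untouched.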
I have a 2D array. For example:
ary = np.arange(24).reshape(6,4)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
I want to break this into smaller 2D arrays, each 2x2, and compute the square root of the sum of each. I actually want to use arbitrary sized sub-arrays, and compute arbitrary functions of them, but I think this question is easier to ask with concrete operations and concrete array sizes, so in this example starting with a 6x4 array and computing the square root of sums of 2x2 sub-arrays, the final result would be a 3x2 array, as follows:
[[3.16, 4.24] # math.sqrt(0+1+4+5) , math.sqrt(2+3+6+7)
[6.48, 7.07] # math.sqrt(8+9+12+13) , math.sqrt(10+11+14+15)
[8.60, 9.05]] # math.sqrt(16+17+20+21), math.sqrt(18+19+22+23)
How can I slice, or split, or do some operation to perform some computation on 2D sub-arrays?
Here is a working, inefficient example of what I'm trying to do:
import numpy as np
a_height = 6
a_width = 4
a_area = a_height * a_width
a = np.arange(a_area).reshape(a_height, a_width)
window_height = 2
window_width = 2
b_height = a_height // window_height
b_width = a_width // window_width
b_area = b_height * b_width
b = np.zeros(b_area).reshape(b_height, b_width)
for i in range(b_height):
    for j in range(b_width):
        b[i, j] = a[i * window_height:(i + 1) * window_height, j * window_width:(j + 1) * window_width].sum()
b = np.sqrt(b)
print(b)
# [[3.16227766 4.24264069]
# [6.4807407 7.07106781]
# [8.60232527 9.05538514]]
In [2]: ary = np.arange(24).reshape(6,4)
In [3]: ary
Out[3]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
While I recommended moving-windows based on as_strided, we can also divide the array into 'blocks' with reshape and transpose:
In [4]: ary.reshape(3,2,2,2).transpose(0,2,1,3)
Out[4]:
array([[[[ 0, 1],
[ 4, 5]],
[[ 2, 3],
[ 6, 7]]],
[[[ 8, 9],
[12, 13]],
[[10, 11],
[14, 15]]],
[[[16, 17],
[20, 21]],
[[18, 19],
[22, 23]]]])
In [5]: np.sqrt(_.sum(axis=(2,3)))
Out[5]:
array([[3.16227766, 4.24264069],
[6.4807407 , 7.07106781],
[8.60232527, 9.05538514]])
While the transpose makes it easier to visualize the blocks that need to be summed, it isn't necessary:
In [7]: np.sqrt(ary.reshape(3,2,2,2).sum(axis=(1,3)))
Out[7]:
array([[3.16227766, 4.24264069],
[6.4807407 , 7.07106781],
[8.60232527, 9.05538514]])
np.lib.stride_tricks.sliding_window_view doesn't give us as much direct control as I thought, but
np.lib.stride_tricks.sliding_window_view(ary,(2,2))[::2,::2]
gives the same result as Out[4].
In [13]: np.sqrt(np.lib.stride_tricks.sliding_window_view(ary,(2,2))[::2,::2].sum(axis=(2,3)))
Out[13]:
array([[3.16227766, 4.24264069],
[6.4807407 , 7.07106781],
[8.60232527, 9.05538514]])
[7] is faster.
In general, it can be done like this:
a_height = 15
a_width = 16
a_area = a_height * a_width
a = np.arange(a_area).reshape(a_height, a_width)
window_height = 3 # must evenly divide a_height
window_width = 4 # must evenly divide a_width
b_height = a_height // window_height
b_width = a_width // window_width
b = a.reshape(b_height, window_height, b_width, window_width).transpose(0,2,1,3)
# or, assuming you want sum or another function that takes `axis` argument
b = a.reshape(b_height, window_height, b_width, window_width).sum(axis=(1,3))
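Since the question asks for arbitrary functions of arbitrary sub-array sizes, the reshape approach could be wrapped roughly as follows. This is only a sketch: `block_apply` is a hypothetical helper, and it assumes the function accepts a tuple `axis` argument (as `np.sum`, `np.max`, and `np.mean` do) and that the window evenly divides the array.

```python
import numpy as np

def block_apply(a, window_height, window_width, func=np.sum):
    """Apply an axis-aware reduction to each non-overlapping 2D block of a."""
    height, width = a.shape
    if height % window_height or width % window_width:
        raise ValueError("window must evenly divide the array")
    # Axes 1 and 3 index positions inside a block; axes 0 and 2 index the blocks.
    blocks = a.reshape(height // window_height, window_height,
                       width // window_width, window_width)
    return func(blocks, axis=(1, 3))

ary = np.arange(24).reshape(6, 4)
result = np.sqrt(block_apply(ary, 2, 2))          # square root of block sums
block_max = block_apply(ary, 2, 2, func=np.max)   # largest value per block
print(result)
print(block_max)
```

For a function that does not take an `axis` argument, the transposed form shown above (blocks on the last two axes) is the easier starting point.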
How can I partition this array into arrays of length 3, with a padded or unpadded remainder (it doesn't matter which)?
>>> np.array([0,1,2,3,4,5,6,7,8,9,10]).reshape([3,-1])
ValueError: cannot reshape array of size 11 into shape (3,newaxis)
### Two Examples Without Padding
x = np.array([0,1,2,3,4,5,6,7,8,9,10])
desired_length = 3
num_splits = int(np.ceil(x.shape[0] / desired_length))  # number of sections, as an int
print(np.array_split(x, num_splits))
# Prints:
# [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10])]
x = np.arange(13)
desired_length = 3
num_splits = int(np.ceil(x.shape[0] / desired_length))  # number of sections, as an int
print(np.array_split(x, num_splits))
# Prints:
# [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10]), array([11, 12])]
### One Example With Padding
x = np.arange(13)
desired_length = 3
num_splits = int(np.ceil(x.shape[0] / desired_length))
padding = num_splits * desired_length - x.shape[0]
x_pad = np.pad(x, (0,padding), 'constant', constant_values=0)
print(np.split(x_pad, num_splits))
# Prints:
# [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11]), array([12, 0, 0])]
If you want to avoid padding with zeros, the most elegant way to do it might be slicing in a list comprehension:
>>> import numpy as np
>>> x = np.arange(11)
>>> [x[i:i+3] for i in range(0, x.size, 3)]
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10])]
If you want to pad with zeros, ndarray.resize() does this for you, but you have to figure out the size of the expected array yourself:
import numpy as np
x = np.array([0,1,2,3,4,5,6,7,8,9,10])
cols = 3
rows = np.ceil(x.size / cols).astype(int)
x.resize((rows, cols))
print(x)
Which results in:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 0]]
As far as I can tell, this is hundreds of times faster than the list comprehension approach (see my other answer).
Note that if you do anything to x before resizing, you might run into an issue with 'references'. Either work on x.copy() or pass refcheck=False to resize().
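If the in-place behaviour and reference check of `resize()` are a concern, one alternative is to pad explicitly and then reshape. The sketch below assumes zero padding at the end is wanted; `pad_and_reshape` is an illustrative name, not a NumPy function.

```python
import numpy as np

def pad_and_reshape(x, cols, fill_value=0):
    """Return a new (rows, cols) array, padding the tail of x with fill_value."""
    rows = -(-x.size // cols)  # ceiling division using integers only
    padded = np.pad(x, (0, rows * cols - x.size), constant_values=fill_value)
    return padded.reshape(rows, cols)

x = np.arange(11)
grid = pad_and_reshape(x, 3)
print(grid)
```

Here `x` itself is left unchanged, so no copy or `refcheck=False` is needed.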
I have an array of arrays that represent matrices and I need to transpose each matrix, ideally without transposing in a loop. When I use array.T, it transposes everything, not just the axes in each array. Is it possible to just transpose each matrix?
INPUT: np.arange(27).reshape(3, 3, 3).T
OUTPUT:
[[[ 0 9 18]
[ 3 12 21]
[ 6 15 24]]
[[ 1 10 19]
[ 4 13 22]
[ 7 16 25]]
[[ 2 11 20]
[ 5 14 23]
[ 8 17 26]]]
What I want is for the arrays to look like this:
[[[ 0 3 6]
[ 1 4 7]
[ 2 5 8]]
[[ 9 12 15]
[ 10 13 16]
[ 11 14 17]]
[[ 18 21 24]
[ 19 22 25]
[ 20 23 26]]]
In [11]: A = np.arange(27).reshape(3, 3, 3)
In [12]: A
Out[12]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
Transpose only the last 2 dimensions, leaving the first axis in place:
In [13]: A.transpose(0,2,1)
Out[13]:
array([[[ 0, 3, 6],
[ 1, 4, 7],
[ 2, 5, 8]],
[[ 9, 12, 15],
[10, 13, 16],
[11, 14, 17]],
[[18, 21, 24],
[19, 22, 25],
[20, 23, 26]]])
A.swapaxes(1,2) also does it.
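As a quick check, both calls can be compared with an explicit per-matrix loop. This is a small sketch; the loop is included only for comparison and is not needed in practice.

```python
import numpy as np

A = np.arange(27).reshape(3, 3, 3)

t1 = A.transpose(0, 2, 1)          # keep axis 0, swap the last two axes
t2 = A.swapaxes(1, 2)              # same result, expressed as a swap
t3 = np.stack([m.T for m in A])    # explicit loop over the matrices

print(np.array_equal(t1, t2), np.array_equal(t1, t3))
print(np.shares_memory(A, t1))     # whether t1 reuses A's data buffer
```

Both `transpose` and `swapaxes` return views here, so writing into `t1` or `t2` also changes `A`.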
I am working with some documentation and wish to portray an array of this form
>>> a_3d
array([[[4, 6, 4],
[1, 1, 8],
[0, 7, 5],
[5, 3, 3],
[8, 9, 5]],
[[8, 8, 4],
[3, 4, 4],
[0, 0, 9],
[3, 7, 3],
[3, 4, 7]],
[[9, 5, 4],
[7, 7, 3],
[9, 5, 9],
[8, 7, 8],
[5, 8, 8]]], dtype=int32)
as text, in a similar fashion to how I can show it as a graph/map using Matplotlib.
I have managed to simply decompress the original array and provide some additional information into this form.
array...
shape (3, 5, 3) ndim 3 size 45
a[0]...
[[[4 6 4]
[1 1 8]
[0 7 5]
[5 3 3]
[8 9 5]]
a[1]....
[[8 8 4]
[3 4 4]
[0 0 9]
[3 7 3]
[3 4 7]]
a[2]....
[[9 5 4]
[7 7 3]
[9 5 9]
[8 7 8]
[5 8 8]]]
But I have tried every combination of reshaping and transposing to get it into a row representation. I haven't found a solution, short of reconstructing the array from first principles, so that the three 2D blocks appear in one row.
Again, this is for teaching and visualization purposes and not for analysis. If I have overlooked the obvious, I would appreciate any comments.
EDIT
[[[4, 6, 4], [[8, 8, 4], [[9, 5, 4],
[1, 1, 8],
[0, 7, 5], etc etc
[5, 3, 3],
[8, 9, 5]], [3, 4, 7]], [5, 8, 8]]]
or similar... if this helps
The kludge workaround I am using might help explain what I am after. It would be nice to work with the original data and restructure it, rather than having to say "we will flip out to using lists and object arrays for a while":
def to_row(a):
    """Kludge workaround: build a (rows, n) object array of row lists."""
    n, rows, cols = a.shape
    e = np.empty((rows, n), dtype='object')
    for r in range(rows):
        for c in range(n):
            # row r of block c, stored as a plain Python list
            e[r][c] = a[c][r].tolist()
    return e
So you have an array with shape (3,5,3), and the default array function displays it as 3 planes, each a (5,3) 2d array.
Reshaping and transposing does not change this basic display format - it still splits the array on the 1st axis, and formats each block.
The formatting is handled by a builtin numpy function:
In [112]: arr=np.arange(2*3*4).reshape(2,3,4)
In [113]: arr.__format__('')
Out[113]: '[[[ 0 1 2 3]\n [ 4 5 6 7]\n [ 8 9 10 11]]\n\n [[12 13 14 15]\n [16 17 18 19]\n [20 21 22 23]]]'
np.array2string(arr) produces the same string.
Conceivably you could split this string on \n, and rearrange the pieces.
In [116]: np.get_printoptions()
Out[116]:
{'edgeitems': 3,
'formatter': None,
'infstr': 'inf',
'linewidth': 75,
'nanstr': 'nan',
'precision': 8,
'suppress': False,
'threshold': 1000}
The docstring of np.set_printoptions describes these values. You might also look at np.set_string_function.
Here's a first stab at rearranging the lines:
In [137]: astr=np.array2string(arr)
In [138]: lines=astr.splitlines()
In [139]: lines
Out[139]:
['[[[ 0 1 2 3]',
' [ 4 5 6 7]',
' [ 8 9 10 11]]',
'',
' [[12 13 14 15]',
' [16 17 18 19]',
' [20 21 22 23]]]']
In [140]: print('\n'.join([' '.join((lines[i],lines[i+4])) for i in range(3)]))
[[[ 0 1 2 3] [[12 13 14 15]
[ 4 5 6 7] [16 17 18 19]
[ 8 9 10 11]] [20 21 22 23]]]
Brackets need to be cleaned up, but overall the shape looks right.
Another way to get such a set of lines is to format each plane:
In [151]: alist=[np.array2string(i).splitlines() for i in arr]
In [152]: alist
Out[152]:
[['[[ 0 1 2 3]', ' [ 4 5 6 7]', ' [ 8 9 10 11]]'],
['[[12 13 14 15]', ' [16 17 18 19]', ' [20 21 22 23]]']]
In [153]: list(zip(*alist)) # a list form of transpose
Out[153]:
[('[[ 0 1 2 3]', '[[12 13 14 15]'),
(' [ 4 5 6 7]', ' [16 17 18 19]'),
(' [ 8 9 10 11]]', ' [20 21 22 23]]')]
which then can be joined. \t (tab) cleans up the bracket spacing.
In [155]: '\n'.join(['\t'.join(k) for k in zip(*alist)])
Out[155]: '[[ 0 1 2 3]\t[[12 13 14 15]\n [ 4 5 6 7]\t [16 17 18 19]\n [ 8 9 10 11]]\t [20 21 22 23]]'
In [156]: print(_)
[[ 0 1 2 3] [[12 13 14 15]
[ 4 5 6 7] [16 17 18 19]
[ 8 9 10 11]] [20 21 22 23]]
With 3 blocks it still needs work :(
In [157]: arr1=np.arange(2*3*4).reshape(3,4,2)
In [158]: alist=[np.array2string(i).splitlines() for i in arr1]
In [159]: print('\n'.join(['\t'.join(k) for k in zip(*alist)]))
[[0 1] [[ 8 9] [[16 17]
[2 3] [10 11] [18 19]
[4 5] [12 13] [20 21]
[6 7]] [14 15]] [22 23]]
In a sense it's the same problem you have with text when you want to display it in columns. Maybe there's a multi-column print utility.
Even though you are thinking in terms of blocks side by side, the display is still based on lines.
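Putting the per-plane approach together, a Python 3 sketch might look like the following. The helper name `side_by_side` and the choice to pad each column of text with spaces instead of tabs are assumptions made for this example. It formats each plane separately, so the outermost pair of brackets is not shown.

```python
import numpy as np

def side_by_side(arr, sep="  "):
    """Format the 2D planes of a 3D array next to each other, line by line."""
    # One list of text lines per plane along the first axis.
    planes = [np.array2string(plane).splitlines() for plane in arr]
    # Pad each plane's lines to a common width so the columns stay aligned.
    widths = [max(len(line) for line in lines) for lines in planes]
    return "\n".join(
        sep.join(line.ljust(w) for line, w in zip(row, widths)).rstrip()
        for row in zip(*planes)  # zip(*...) transposes planes into text rows
    )

arr1 = np.arange(2 * 3 * 4).reshape(3, 4, 2)
text = side_by_side(arr1)
print(text)
```

Padding with `ljust` avoids depending on how a terminal or document renders tab stops.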