Subtracting every nth row from the following rows of an array

I have an array with the shape (10000,6). For example:
a = np.array([[5, 5, 5, 5, 5, 5], [10, 10, 10, 10, 10], [15, 15, 15, 15, 15], ...])
I want to take every 25th array and subtract its element values from the next 25 elements until a new subtraction array is selected. So, for example, if the first array is:
[10, 10, 10, 10, 10]
then these values should be subtracted from the array itself and from the arrays that follow it, until a new subtraction array such as this one is selected:
[2, 2, 2, 2, 2]
then that array's values should be subtracted from the array itself and from the arrays that follow it.
This means that after the operation every 25th array will be:
[0, 0, 0, 0, 0]
because it has been subtracted by itself.

Here's what I would do:
import numpy as np
arr = np.random.randint(0, 10, (9, 3))
group_size = 3
# select the vectors you want to subtract and copy each of them group_size times
selected = arr[::group_size].repeat(group_size, axis=0)
# subtract selected vectors from all vectors in the group
sub_arr = arr-selected
output:
arr =
[[9 6 3]
[8 3 3]
[2 0 4]
[0 3 9]
[3 9 9]
[0 8 6]
[4 0 0]
[6 1 9]
[2 6 4]]
selected =
[[9 6 3]
[9 6 3]
[9 6 3]
[0 3 9]
[0 3 9]
[0 3 9]
[4 0 0]
[4 0 0]
[4 0 0]]
sub_arr =
[[ 0 0 0]
[-1 -3 0]
[-7 -6 1]
[ 0 0 0]
[ 3 6 0]
[ 0 5 -3]
[ 0 0 0]
[ 2 1 9]
[-2 6 4]]

You can reshape your array so that each chunk has the right number of rows, and then simply subtract the first row of each chunk:
import numpy as np
a = np.arange(10000)[:, None] * np.ones(6)
a = a.reshape(-1, 25, 6)
a -= a[:, 0, :][:, None, :]
a = a.reshape(-1, 6)
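Both answers above rely on the number of rows being an exact multiple of the group size (10000 / 25 works out). As a minimal sketch for the case where it is not, the index of each row's group leader can be computed directly. The function name `subtract_group_leader` is hypothetical and not part of NumPy.

```python
import numpy as np

def subtract_group_leader(a, n):
    """Subtract the first row of each block of n rows from every row in that block."""
    a = np.asarray(a)
    # For each row, the index of the first row of its block: 0,0,...,n,n,...,2n,...
    leader = (np.arange(len(a)) // n) * n
    # Fancy indexing builds the array of leaders; the subtraction is elementwise.
    return a - a[leader]

# 7 rows with a group size of 3: the last group holds a single row.
arr = np.arange(21).reshape(7, 3)
out = subtract_group_leader(arr, 3)
print(out)
```

This returns a new array and leaves the input untouched, at the cost of one temporary array the same size as the input.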


Change first row of numpy array

Sorry for the newbie question. I have the array in the code below:
import numpy as np
p = np.array([[2,3,0,5],[2,3,4,5],[2,3,4,5],[0,0,0,0]])
p[np.where(p[0]==0)]=100
print(p)
I wanted to change the 0 in the first row to 100. However, the output is:
[[ 2 3 0 5]
[ 2 3 4 5]
[100 100 100 100]
[ 0 0 0 0]]
So it changed the third row instead, which perplexes me. Can I use where? What are other suggestions? The output I expected is:
[[2 3 100 5]
[2 3 4 5]
[2 3 4 5]
[0 0 0 0]]
Directly use indexing:
p[0, p[0]==0] = 100
Updated p:
array([[ 2, 3, 100, 5],
[ 2, 3, 4, 5],
[ 2, 3, 4, 5],
[ 0, 0, 0, 0]])
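Why the original attempt touched the third row can be seen by inspecting what np.where returns. The following is a small sketch using the array from the question; the names `cols`, `q` and `r` are introduced here only for illustration.

```python
import numpy as np

p = np.array([[2, 3, 0, 5], [2, 3, 4, 5], [2, 3, 4, 5], [0, 0, 0, 0]])

# np.where on a 1D condition returns a 1-tuple of index arrays.
cols = np.where(p[0] == 0)
print(cols)  # a tuple holding one index array with the single value 2

# Indexing with that tuple alone applies it to axis 0, so row 2 is selected.
q = p.copy()
q[cols] = 100

# Pairing the column indices with an explicit row index targets row 0 instead.
r = p.copy()
r[0, cols[0]] = 100
print(r)
```

So where can still be used, as long as the indices it returns are applied to the column axis of the first row rather than to the row axis of the whole array.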

contingency table for Julia array

Consider an (m x n) matrix of only 0s and 1s, with m potentially large.
julia> rand([0, 1], 5, 3)
5×3 Array{Int64,2}:
0 1 1
0 0 0
0 1 1
1 0 0
1 0 1
Is there an efficient way to count the number of occurrences and track the indices for each unique row?
For example, the first row above occurs twice, at indices 1 and 3. I am trying to build a sort of contingency table.
Thanks
Here is one approach that uses only functionality provided by Julia Base:
julia> x = rand([0, 1], 20, 3)
20×3 Matrix{Int64}:
1 0 1
1 1 1
0 0 0
1 0 0
0 0 1
1 0 1
0 0 0
1 0 1
0 0 0
0 0 1
1 1 0
0 1 1
0 1 1
1 0 0
0 0 0
0 0 0
0 0 1
0 1 0
1 1 0
1 0 0
julia> d = Dict()
Dict{Any, Any}()
julia> for (i, r) in enumerate(eachrow(x))
           push!(get!(d, r, Int[]), i)
       end
julia> d
Dict{Any, Any} with 8 entries:
[1, 1, 1] => [2]
[0, 0, 0] => [3, 7, 9, 15, 16]
[0, 0, 1] => [5, 10, 17]
[1, 1, 0] => [11, 19]
[1, 0, 0] => [4, 14, 20]
[0, 1, 1] => [12, 13]
[1, 0, 1] => [1, 6, 8]
[0, 1, 0] => [18]
and now using the SplitApplyCombine.jl package:
julia> using SplitApplyCombine
julia> group(i -> view(x, i, :), axes(x, 1))
8-element Dictionaries.Dictionary{Any, Vector{Int64}}
[1, 0, 1] │ [1, 6, 8]
[1, 1, 1] │ [2]
[0, 0, 0] │ [3, 7, 9, 15, 16]
[1, 0, 0] │ [4, 14, 20]
[0, 0, 1] │ [5, 10, 17]
[1, 1, 0] │ [11, 19]
[0, 1, 1] │ [12, 13]
[0, 1, 0] │ [18]

How to build a nested list dynamically in Python?

I want to build a list from the values of n. Each distinct x should start a new inner list, and all results for the same x should go into the same inner list.
My code is like this:
import numpy as np
muy = [[1,2,3],[4,5,6],[7,8,9]]
# will = []
for x in muy:
    for y in muy:
        if x != y:
            print(x, " ", y)
            m = np.subtract(x, y)
            n = sum(m)
            print(m)
            print(n)
the result is like this
[1, 2, 3] [4, 5, 6]
[-3 -3 -3]
-9
[1, 2, 3] [7, 8, 9]
[-6 -6 -6]
-18
[4, 5, 6] [1, 2, 3]
[3 3 3]
9
[4, 5, 6] [7, 8, 9]
[-3 -3 -3]
-9
[7, 8, 9] [1, 2, 3]
[6 6 6]
18
[7, 8, 9] [4, 5, 6]
[3 3 3]
9
The result I want is like this:
[[-9, -18], [9, -9], [18, 9]]
What should I do?
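Create a new inner list for each x and append it to the outer list once the inner loop has finished:
import numpy as np
muy = [[1,2,3],[4,5,6],[7,8,9]]
will = []
for x in muy:
    temp = []  # results for the current x
    for y in muy:
        if x != y:
            m = np.subtract(x, y)
            n = int(sum(m))  # plain int, so the list prints without NumPy scalar types
            temp.append(n)
    will.append(temp)  # one inner list per x
print(will)  # [[-9, -18], [9, -9], [18, 9]]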
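Since the sum of an elementwise difference equals the difference of the sums, the same result can be sketched without explicit loops by broadcasting the row sums against each other. This is an alternative under one stated assumption: it drops the diagonal (each row paired with itself), whereas the loop above skips any pair of rows with equal values. The names `row_sums`, `pairwise` and `off_diagonal` are illustrative only.

```python
import numpy as np

muy = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# sum(x - y) == sum(x) - sum(y), so only the row sums are needed.
row_sums = muy.sum(axis=1)

# pairwise[i, j] = sum(muy[i]) - sum(muy[j])
pairwise = row_sums[:, None] - row_sums[None, :]

# Drop the diagonal, then regroup into one inner list per row.
off_diagonal = ~np.eye(len(muy), dtype=bool)
will = pairwise[off_diagonal].reshape(len(muy), -1).tolist()
print(will)
```

For three rows this yields the same nested list as the loop. For inputs that contain duplicate rows, the two versions differ as described above.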

How to have multidimensional array with different length in Julia

I need to build a sequence of arrays with different lengths by reading a dataset. I need to access each of them in a loop, so I probably need some sort of indexing. For example, how can I create the following sequence:
P[1]=[1 2 3 4]
P[2]=[1 4]
P[3]=[8 9 0 0 5 6]
.
.
.
Here it is:
julia> P = Vector{Vector{Int64}}([[1,2,3,4],[1,4],[8,9,0,0,5,6]])
3-element Array{Array{Int64,1},1}:
[1, 2, 3, 4]
[1, 4]
[8, 9, 0, 0, 5, 6]
julia> P[1]
4-element Array{Int64,1}:
1
2
3
4
julia> P[2]
2-element Array{Int64,1}:
1
4
julia> P[3]
6-element Array{Int64,1}:
8
9
0
0
5
6
If you want to add a new element use push!():
julia> push!(P,[7,8,9])
4-element Array{Array{Int64,1},1}:
[1, 2, 3, 4]
[1, 4]
[8, 9, 0, 0, 5, 6]
[7, 8, 9]

numpy array representation and formatting

I am working with some documentation and wish to portray an array of this form
>>> a_3d
array([[[4, 6, 4],
[1, 1, 8],
[0, 7, 5],
[5, 3, 3],
[8, 9, 5]],
[[8, 8, 4],
[3, 4, 4],
[0, 0, 9],
[3, 7, 3],
[3, 4, 7]],
[[9, 5, 4],
[7, 7, 3],
[9, 5, 9],
[8, 7, 8],
[5, 8, 8]]], dtype=int32)
as text in a similar fashion as I can do it using MatPlotLib as a graph/map.
I have managed to simply decompress the original array and provide some additional information into this form.
array...
shape (3, 5, 3) ndim 3 size 45
a[0]...
[[[4 6 4]
[1 1 8]
[0 7 5]
[5 3 3]
[8 9 5]]
a[1]....
[[8 8 4]
[3 4 4]
[0 0 9]
[3 7 3]
[3 4 7]]
a[2]....
[[9 5 4]
[7 7 3]
[9 5 9]
[8 7 8]
[5 8 8]]]
But I have tried every combination of reshaping and transposing to get it into a row representation. I haven't found a solution short of reconstructing the array from first principles so that the three 2D blocks appear in one row.
Again, this is for teaching and visualization purposes and not for analysis. If I have overlooked the obvious, I would appreciate any comments.
EDIT
[[[4, 6, 4], [[8, 8, 4], [[9, 5, 4],
[1, 1, 8],
[0, 7, 5], etc etc
[5, 3, 3],
[8, 9, 5]], [3, 4, 7]], [5, 8, 8]]]
or similar... if this helps
Apparently the kludge workaround I am using might help. It would be nice to work with the original data and restructure it, rather than having to say that we will switch to lists and object arrays for a while:
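import numpy as np

def to_row(a):
    """Kludge workaround: put the 2D blocks of a 3D array side by side."""
    n, rows, cols = a.shape
    # one object cell per (row, block); each cell holds that block's row as a list
    e = np.empty((rows, n), dtype=object)
    for r in range(rows):
        for b in range(n):
            e[r, b] = a[b, r].tolist()
    return e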
So you have an array with shape (3,5,3), and the default array function displays it as 3 planes, each a (5,3) 2d array.
Reshaping and transposing does not change this basic display format - it still splits the array on the 1st axis, and formats each block.
The formatting is handled by a builtin numpy function:
In [112]: arr=np.arange(2*3*4).reshape(2,3,4)
In [113]: arr.__format__('')
Out[113]: '[[[ 0 1 2 3]\n [ 4 5 6 7]\n [ 8 9 10 11]]\n\n [[12 13 14 15]\n [16 17 18 19]\n [20 21 22 23]]]'
np.array2string(arr) produces the same string.
Conceivably you could split this string on \n, and rearrange the pieces.
In [116]: np.get_printoptions()
Out[116]:
{'edgeitems': 3,
'formatter': None,
'infstr': 'inf',
'linewidth': 75,
'nanstr': 'nan',
'precision': 8,
'suppress': False,
'threshold': 1000}
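The documentation for np.set_printoptions describes these values. You might also look at np.set_string_function (available in older NumPy releases; it was removed from the main namespace in NumPy 2.0).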
Here's a first stab at rearranging the lines:
In [137]: astr=np.array2string(arr)
In [138]: lines=astr.splitlines()
In [139]: lines
Out[139]:
['[[[ 0 1 2 3]',
' [ 4 5 6 7]',
' [ 8 9 10 11]]',
'',
' [[12 13 14 15]',
' [16 17 18 19]',
' [20 21 22 23]]]']
In [140]: print('\n'.join([' '.join((lines[i],lines[i+4])) for i in range(3)]))
[[[ 0 1 2 3] [[12 13 14 15]
[ 4 5 6 7] [16 17 18 19]
[ 8 9 10 11]] [20 21 22 23]]]
Brackets need to be cleaned up, but overall the shape looks right.
Another way to get such a set of lines is to format each plane:
In [151]: alist=[np.array2string(i).splitlines() for i in arr]
In [152]: alist
Out[152]:
[['[[ 0 1 2 3]', ' [ 4 5 6 7]', ' [ 8 9 10 11]]'],
['[[12 13 14 15]', ' [16 17 18 19]', ' [20 21 22 23]]']]
In [153]: list(zip(*alist)) # a list form of transpose
Out[153]:
[('[[ 0 1 2 3]', '[[12 13 14 15]'),
(' [ 4 5 6 7]', ' [16 17 18 19]'),
(' [ 8 9 10 11]]', ' [20 21 22 23]]')]
which then can be joined. \t (tab) cleans up the bracket spacing.
In [155]: '\n'.join(['\t'.join(k) for k in zip(*alist)])
Out[155]: '[[ 0 1 2 3]\t[[12 13 14 15]\n [ 4 5 6 7]\t [16 17 18 19]\n [ 8 9 10 11]]\t [20 21 22 23]]'
In [156]: print(_)
[[ 0 1 2 3] [[12 13 14 15]
[ 4 5 6 7] [16 17 18 19]
[ 8 9 10 11]] [20 21 22 23]]
for 3 blocks - it still needs work :(
In [157]: arr1=np.arange(2*3*4).reshape(3,4,2)
In [158]: alist=[np.array2string(i).splitlines() for i in arr1]
In [159]: print('\n'.join(['\t'.join(k) for k in zip(*alist)]))
[[0 1] [[ 8 9] [[16 17]
[2 3] [10 11] [18 19]
[4 5] [12 13] [20 21]
[6 7]] [14 15]] [22 23]]
In a sense it's the same problem you have with text when you want to display it in columns. Maybe there's a multi-column print utility.
Even though you are thinking in terms of blocks side by side, the display is still based on lines.
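The per-plane approach above can be collected into a small Python 3 helper. This is a sketch rather than a finished formatter: it pads each block's lines to a common width instead of relying on tabs, and the name `blocks_side_by_side` and its `gap` parameter are made up for this example.

```python
import numpy as np

def blocks_side_by_side(a, gap="  "):
    """Format each 2D block of a 3D array and join the blocks line by line."""
    # One list of text lines per block along the first axis.
    block_lines = [np.array2string(block).splitlines() for block in a]
    # Pad every line of a block to that block's widest line so columns align.
    padded = []
    for lines in block_lines:
        width = max(len(line) for line in lines)
        padded.append([line.ljust(width) for line in lines])
    # zip(*padded) transposes: the k-th lines of all blocks end up together.
    return "\n".join(gap.join(row).rstrip() for row in zip(*padded))

arr = np.arange(2 * 3 * 4).reshape(3, 4, 2)
text = blocks_side_by_side(arr)
print(text)
```

Because every block along the first axis has the same number of rows, zip does not drop any lines. The outer pair of brackets of the 3D array is not reproduced, so the result is a teaching display rather than a string that can be parsed back.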
