Add a column to a numpy array that counts if rows change - arrays

I have the following array:
[[1 2 1 0 2 0]
[1 2 1 0 2 0]
[1 2 1 0 2 0]
[1 2 1 0 2 0]
[0 1 2 1 0 0]
[0 1 2 1 0 0]
[0 0 1 0 1 0]
[0 0 0 1 1 0]
[0 0 0 0 1 0]
[0 0 0 0 0 1]]
I need to add a column to this array holding a counter that starts at 3 and increases by 1 whenever a row differs from the row before it. So the result would look like this:
[[1 2 1 0 2 0 3]
[1 2 1 0 2 0 3]
[1 2 1 0 2 0 3]
[1 2 1 0 2 0 3]
[0 1 2 1 0 0 4]
[0 1 2 1 0 0 4]
[0 0 1 0 1 0 5]
[0 0 0 1 1 0 6]
[0 0 0 0 1 0 7]
[0 0 0 0 0 1 8]]
Thank you

If a is your array as:
a = np.array([[1, 2, 1, 0, 2, 0], [1, 2, 1, 0, 2, 0], [1, 2, 1, 0, 2, 0], [1, 2, 1, 0, 2, 0],
[0, 1, 2, 1, 0, 0], [0, 1, 2, 1, 0, 0], [0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 1, 0],
[0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1]])
using the following code will get you the results:
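n = 3
a = a.tolist()
for i, j in enumerate(a):
    # the first row, or a row equal to the previous one, keeps the current counter
    # (a[i-1] already has its counter appended, hence the [:-1])
    if i == 0 or j == a[i-1][:-1]:
        j.append(n)
    else:
        # the row changed: increment the counter before appending it
        n += 1
        j.append(n)
a = np.array(a)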
which will give:
[[1 2 1 0 2 0 3]
[1 2 1 0 2 0 3]
[1 2 1 0 2 0 3]
[1 2 1 0 2 0 3]
[0 1 2 1 0 0 4]
[0 1 2 1 0 0 4]
[0 0 1 0 1 0 5]
[0 0 0 1 1 0 6]
[0 0 0 0 1 0 7]
[0 0 0 0 0 1 8]]
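As an alternative to the loop, the same counter can probably be built without converting to lists. This is a minimal sketch, not the answer's method: it compares each row with the one before it and takes a cumulative sum of the "changed" flags. The names `changed`, `counter` and `result` are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 1, 0, 2, 0], [1, 2, 1, 0, 2, 0], [1, 2, 1, 0, 2, 0], [1, 2, 1, 0, 2, 0],
              [0, 1, 2, 1, 0, 0], [0, 1, 2, 1, 0, 0], [0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 1, 0],
              [0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1]])

# True wherever a row differs from the row directly above it
changed = (a[1:] != a[:-1]).any(axis=1)

# the first row has no predecessor, so it contributes 0; the counter starts at 3
counter = 3 + np.concatenate(([0], np.cumsum(changed)))

# append the counter as a new last column
result = np.column_stack((a, counter))
print(result)
```

This avoids the Python-level loop, which may matter if the array has many rows.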

Related

How to calculate cumulative sums of ones with a reset each time a zero is encountered

I have an array made of 0s and 1s. I want to calculate a cumulative sum over each run of consecutive 1s, resetting every time a 0 is met, using numpy as I have thousands of arrays with thousands of rows and columns.
I can do it with loops, but I suspect that will not be efficient.
Is there a smarter and quicker way to run it on the array?
Here is a short example of the input and the expected output:
import numpy as np
arr_in = np.array([[1,1,1,1,1,1], [0,0,0,0,0,0], [1,0,1,0,1,1], [0,1,1,1,0,0]])
print(arr_in)
print("expected result:")
arr_out = np.array([[1,2,3,4,5,6], [0,0,0,0,0,0], [1,0,1,0,1,2], [0,1,2,3,0,0]])
print(arr_out)
When you run it:
[[1 1 1 1 1 1]
[0 0 0 0 0 0]
[1 0 1 0 1 1]
[0 1 1 1 0 0]]
expected result:
[[1 2 3 4 5 6]
[0 0 0 0 0 0]
[1 0 1 0 1 2]
[0 1 2 3 0 0]]
With numba.vectorize you can define a custom numpy ufunc to use for accumulation.
import numba as nb # v0.56.4, no support for numpy >= 1.22.0
import numpy as np # v1.21.6
@nb.vectorize([nb.int64(nb.int64, nb.int64)])
def reset_cumsum(x, y):
    # x is the running total, y is the next element: reset to 0 when y is 0
    return x + y if y else 0
arr_in = np.array([[1,1,1,1,1,1],
[0,0,0,0,0,0],
[1,0,1,0,1,1],
[0,1,1,1,0,0]])
reset_cumsum.accumulate(arr_in, axis=1)
Output
array([[1, 2, 3, 4, 5, 6],
[0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 1, 2],
[0, 1, 2, 3, 0, 0]])
You can compute the cumsum for the 1s, then identify the 0s and forward-fill the cumulated sum to subtract it:
# identify 0s
mask = arr_in==0
# get classical cumsum
cs = arr_in.cumsum(axis=1)
# keep the cumsum only at the 0s, then forward-fill it over the following 1s (running maximum)
# subtract that offset from the cumsum
out = cs-np.maximum.accumulate(np.where(mask, cs, 0), axis=1)
Output:
[[1 2 3 4 5 6]
[0 0 0 0 0 0]
[1 0 1 0 1 2]
[0 1 2 3 0 0]]
Output on second example:
[[1 2 3 4 5 6 0 1]
[0 1 2 0 0 0 1 0]]
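To make the forward-fill trick easier to follow, here is a small sketch that prints each intermediate array for a single row of the first example. The variable names `at_zeros` and `offset` are only illustrative and do not appear in the answer above.

```python
import numpy as np

row = np.array([[1, 0, 1, 0, 1, 1]])

mask = row == 0                     # True at the reset positions
cs = row.cumsum(axis=1)             # plain cumsum:            [[1 1 2 2 3 4]]
at_zeros = np.where(mask, cs, 0)    # cumsum kept only at 0s:  [[0 1 0 2 0 0]]

# the cumsum never decreases, so a running maximum forward-fills
# the value seen at the most recent 0:                        [[0 1 1 2 2 2]]
offset = np.maximum.accumulate(at_zeros, axis=1)

out = cs - offset                   # cumsum with resets:      [[1 0 1 0 1 2]]
print(out)
```

The running maximum works as a forward-fill here only because the input holds 0s and 1s, so the plain cumsum is non-decreasing.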

Add a column to an array with values from a position in another array if rows match

I have two arrays, one looks like this:
[[1 2 1 0 2 0 1]
[1 2 1 0 2 0 1]
[1 2 1 0 2 0 1]
[1 2 1 0 2 0 1]
[0 1 2 1 0 0 2]
[0 1 2 1 0 0 2]
[0 0 1 0 1 0 3]
[0 0 0 1 1 0 4]
[0 0 0 0 1 0 5]
[0 0 0 0 0 1 6]]
The other looks like this:
[[1 2 1 0 2 0]
[1 1 1 0 2 0]
[1 1 1 0 2 0]
[1 2 1 0 2 0]
[0 3 2 2 0 0]
[0 1 2 1 0 0]
[0 2 1 2 1 0]
[0 0 0 1 1 0]
[0 0 0 0 1 0]
[0 0 0 0 0 1]
...
[0 3 2 2 0 0]
[0 1 2 1 0 0]
[0 2 1 2 1 0]
[0 0 0 1 1 0]
[0 0 0 0 1 0]
[0 0 0 0 0 1]]
Whenever a row in the second array matches the first six values of a row in the first array, I need to append the last element of that row of the first array (the 7th element) to the matching row of the second array. When there is no match, append a 0. The result would look like this:
[[1 2 1 0 2 0 1]
[1 1 1 0 2 0 0]
[1 1 1 0 2 0 0]
[1 2 1 0 2 0 1]
[0 3 2 2 0 0 0]
[0 1 2 1 0 0 2]
[0 2 1 2 1 0 0]
[0 0 0 1 1 0 4]
[0 0 0 0 1 0 5]
[0 0 0 0 0 1 6]
...
[0 3 2 2 0 0 0]
[0 1 2 1 0 0 2]
[0 2 1 2 1 0 0]
[0 0 0 1 1 0 4]
[0 0 0 0 1 0 5]
[0 0 0 0 0 1 6]]
You could use:
import numpy as np
m = (B == A[:,None,:6]).all(2)
new_A = np.c_[B, np.where(m.any(0), np.take(A[:,6], m.argmax(0)), 0)]
How it works:
1- use broadcasting to compare every row of B with every row of A (limited to the first 6 columns), and build a boolean mask of shape (rows of A, rows of B)
2- use numpy.where to check the condition: if at least 1 row in A matches, use numpy.argmax to get the index of the first match, and numpy.take to get the value from A's last column. Else, assign 0.
3- concatenate B and the newly built column
output:
array([[1, 2, 1, 0, 2, 0, 1],
[1, 1, 1, 0, 2, 0, 0],
[1, 1, 1, 0, 2, 0, 0],
[1, 2, 1, 0, 2, 0, 1],
[0, 3, 2, 2, 0, 0, 0],
[0, 1, 2, 1, 0, 0, 2],
[0, 2, 1, 2, 1, 0, 0],
[0, 0, 0, 1, 1, 0, 4],
[0, 0, 0, 0, 1, 0, 5],
[0, 0, 0, 0, 0, 1, 6],
[0, 3, 2, 2, 0, 0, 0],
[0, 1, 2, 1, 0, 0, 2],
[0, 2, 1, 2, 1, 0, 0],
[0, 0, 0, 1, 1, 0, 4],
[0, 0, 0, 0, 1, 0, 5],
[0, 0, 0, 0, 0, 1, 6]])
inputs:
A = [[1, 2, 1, 0, 2, 0, 1],
[1, 2, 1, 0, 2, 0, 1],
[1, 2, 1, 0, 2, 0, 1],
[1, 2, 1, 0, 2, 0, 1],
[0, 1, 2, 1, 0, 0, 2],
[0, 1, 2, 1, 0, 0, 2],
[0, 0, 1, 0, 1, 0, 3],
[0, 0, 0, 1, 1, 0, 4],
[0, 0, 0, 0, 1, 0, 5],
[0, 0, 0, 0, 0, 1, 6]]
A = np.array(A)
B = [[1, 2, 1, 0, 2, 0],
[1, 1, 1, 0, 2, 0],
[1, 1, 1, 0, 2, 0],
[1, 2, 1, 0, 2, 0],
[0, 3, 2, 2, 0, 0],
[0, 1, 2, 1, 0, 0],
[0, 2, 1, 2, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1],
[0, 3, 2, 2, 0, 0],
[0, 1, 2, 1, 0, 0],
[0, 2, 1, 2, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1]]
B = np.array(B)
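The three steps can also be written out one at a time, which makes the shapes easier to inspect. This is a minimal sketch on smaller, made-up arrays (`A_small`, `B_small` and the intermediate names are hypothetical, not part of the answer above).

```python
import numpy as np

A_small = np.array([[1, 2, 1, 0, 2, 0, 1],
                    [0, 1, 2, 1, 0, 0, 2]])
B_small = np.array([[0, 1, 2, 1, 0, 0],
                    [1, 1, 1, 0, 2, 0],
                    [1, 2, 1, 0, 2, 0]])

# Step 1: m[i, j] is True when row j of B equals the first 6 values of row i of A
m = (B_small == A_small[:, None, :6]).all(2)    # shape (2, 3)

# Step 2: for each row of B, is there any match, and which row of A matches first?
has_match = m.any(0)
first_match = m.argmax(0)       # 0 when nothing matches, hence the where() below
new_col = np.where(has_match, A_small[first_match, 6], 0)

# Step 3: append the new column to B
result = np.c_[B_small, new_col]
print(result)
```

Note that the intermediate comparison has shape (rows of A, rows of B, 6), so for very large inputs memory use may become the limiting factor.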

contingency table for Julia array

Consider an (m x n) matrix of only 0s and 1s, with m potentially large.
julia> rand([0, 1], 5, 3)
5×3 Array{Int64,2}:
0 1 1
0 0 0
0 1 1
1 0 0
1 0 1
Is there an efficient way to count the number of occurrences and track the indices for each unique row?
For example, the first row above occurs twice, at indices 1 and 3. I am trying to build a sort of contingency table.
Thanks
Here is one approach that uses only functionality provided in Julia Base:
julia> x = rand([0, 1], 20, 3)
20×3 Matrix{Int64}:
1 0 1
1 1 1
0 0 0
1 0 0
0 0 1
1 0 1
0 0 0
1 0 1
0 0 0
0 0 1
1 1 0
0 1 1
0 1 1
1 0 0
0 0 0
0 0 0
0 0 1
0 1 0
1 1 0
1 0 0
julia> d = Dict()
Dict{Any, Any}()
julia> for (i, r) in enumerate(eachrow(x))
           push!(get!(d, r, Int[]), i)
       end
julia> d
Dict{Any, Any} with 8 entries:
[1, 1, 1] => [2]
[0, 0, 0] => [3, 7, 9, 15, 16]
[0, 0, 1] => [5, 10, 17]
[1, 1, 0] => [11, 19]
[1, 0, 0] => [4, 14, 20]
[0, 1, 1] => [12, 13]
[1, 0, 1] => [1, 6, 8]
[0, 1, 0] => [18]
and now using the SplitApplyCombine.jl package:
julia> using SplitApplyCombine
julia> group(i -> view(x, i, :), axes(x, 1))
8-element Dictionaries.Dictionary{Any, Vector{Int64}}
[1, 0, 1] │ [1, 6, 8]
[1, 1, 1] │ [2]
[0, 0, 0] │ [3, 7, 9, 15, 16]
[1, 0, 0] │ [4, 14, 20]
[0, 0, 1] │ [5, 10, 17]
[1, 1, 0] │ [11, 19]
[0, 1, 1] │ [12, 13]
[0, 1, 0] │ [18]

NumPy change negative values to zero AND all values above it in the column

How can I reset all values in a column from a negative number to the top to zero in an array?
import numpy as np
data = np.array([[1, 1, 1, 2], [0, 1, 0, -1], [-1, 0, 1, 0], [1, 1, 1, 1]])
resetneg_data = np.where(data<0, 0, data)
print(resetneg_data)
This gives me:
[[1 1 1 2]
[0 1 0 0]
[0 0 1 0]
[1 1 1 1]]
But what I want is:
[[0 1 1 0]
[0 1 0 0]
[0 0 1 0]
[1 1 1 1]]
That is, zero where negative, and zero everywhere above the negative. But not zero above other zeros. So that if a column drops below zero in a row, all the rows above it reset to zero.
Can I mask the values somehow by finding the specific ranges:
mask_end = np.where(data < 0)
print(mask_end)
gives:
(array([1, 2]), array([3, 0]))
Maybe I could use those indices to set everything from that row up to the top of the column to zero?
# find values that are smaller than 0 from bottom up along with values above negatives
mask = np.minimum.accumulate(data[::-1])[::-1] < 0
# set value at mask positions as 0
data[mask] = 0
data
#[[0 1 1 0]
# [0 1 0 0]
# [0 0 1 0]
# [1 1 1 1]]
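The one-line mask packs three operations together. A sketch that separates them, assuming the same `data` as in the question (the names `flipped`, `running_min` and `suffix_min` are only illustrative), and that uses np.where so the original array is left unmodified:

```python
import numpy as np

data = np.array([[1, 1, 1, 2], [0, 1, 0, -1], [-1, 0, 1, 0], [1, 1, 1, 1]])

flipped = data[::-1]                           # read the rows from bottom to top
running_min = np.minimum.accumulate(flipped)   # column-wise running minimum (axis 0)
suffix_min = running_min[::-1]                 # flip back to the original row order

# suffix_min[i, j] is the smallest value in column j at row i or below,
# so it is negative exactly at a negative entry and at every row above it
mask = suffix_min < 0

result = np.where(mask, 0, data)               # zero out without changing data in place
print(result)
```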

Iterate Clojure vectors

I am implementing a Clojure function (gol [coll]) that receives a vector of equally sized vectors of 1s and 0s, iterates over it checking the neighbouring positions of each index, and returns a new board; something like Conway’s Game of Life.
Input:
`(gol [[0 0 0 0 0]
[0 0 0 0 0]
[0 1 1 1 0]
[0 0 0 0 0]
[0 0 0 0 0]])`
Output:
`[[0 0 0 0 0]
[0 0 1 0 0]
[0 0 1 0 0]
[0 0 1 0 0]
[0 0 0 0 0]]`
How can I iterate the vectors and change the values at the same time?
Use assoc-in (v below stands for your board, the vector of vectors):
(assoc-in v [0 0] 1)
The above returns a new board with the top left value set to 1; v itself is not modified.
To set many at once you can reduce over assoc-in.
(def new-values [[[0 0] 1]
                 [[0 1] 2]
                 [[0 2] 3]])
(reduce
  (fn [acc ele]
    (apply assoc-in acc ele))
  v
  new-values)
;;=> [[1 2 3 0 0] ...]
To go from your input to your output the transform would be:
[[[2 1] 0]
[[2 3] 0]
[[1 2] 1]
[[3 2] 1]]

Resources