ClickHouse array - find a longest chain of repeating number in array

In ClickHouse I have a column with an array of Int16 elements. I'm looking for a way to find the longest chain of the repeating number 1.
For example, in the array [0,1,1,1,5,1,1,1,1,1,2] the longest chain of repeating 1s is 5 elements. Is there any way to do it with existing functions?

Try this query:
SELECT
    /* The source number. */
    data.1 AS number,
    /* The source array. */
    data.2 AS array,
    /* Number the values in each chain. */
    arrayCumSumNonNegative((x, index) -> x = number ? 1 : -index, array, arrayEnumerate(array)) AS partiallySumArray,
    arrayReduce('max', partiallySumArray) AS result
FROM
(
    /* test data set */
    SELECT arrayJoin([
        /**/
        (1, []),
        (1, [0, 2, 2, 2, 5]),
        (1, [0, 1, 1, 1, 5, 1, 1, 1, 1, 1, 2]),
        (1, [1, 1, 1, 2, 3, 4, 5, 1, 1]),
        (1, [-5, 100, 1, 1, 0, 1, 1, 1]),
        (1, [1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0]),
        /**/
        (5, []),
        (5, [0, 2, 2, 2, 55]),
        (5, [5, 5, 10, 300, 5, 77, 5])
    ]) AS data
)
FORMAT Vertical
/* Result:
Row 1:
──────
number: 1
array: []
partiallySumArray: []
result: 0
Row 2:
──────
number: 1
array: [0,2,2,2,5]
partiallySumArray: [0,0,0,0,0]
result: 0
Row 3:
──────
number: 1
array: [0,1,1,1,5,1,1,1,1,1,2]
partiallySumArray: [0,1,2,3,0,1,2,3,4,5,0]
result: 5
Row 4:
──────
number: 1
array: [1,1,1,2,3,4,5,1,1]
partiallySumArray: [1,2,3,0,0,0,0,1,2]
result: 3
Row 5:
──────
number: 1
array: [-5,100,1,1,0,1,1,1]
partiallySumArray: [0,0,1,2,0,1,2,3]
result: 3
Row 6:
──────
number: 1
array: [1,1,0,1,1,1,1,1,1,0,0]
partiallySumArray: [1,2,0,1,2,3,4,5,6,0,0]
result: 6
Row 7:
──────
number: 5
array: []
partiallySumArray: []
result: 0
Row 8:
──────
number: 5
array: [0,2,2,2,55]
partiallySumArray: [0,0,0,0,0]
result: 0
Row 9:
───────
number: 5
array: [5,5,10,300,5,77,5]
partiallySumArray: [1,2,0,0,1,0,1]
result: 2
*/
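The idea behind the query is a running counter that grows by 1 on every match and is reset to 0 on every mismatch; the answer is the largest value the counter reaches. As a rough illustration of that logic outside ClickHouse, here is a minimal Python sketch. The function name longest_run is hypothetical and only mirrors the partiallySumArray / result columns above; it is not a ClickHouse API.

```python
def longest_run(values, target):
    """Return the length of the longest run of consecutive `target` values."""
    best = 0
    current = 0  # plays the role of one element of partiallySumArray
    for v in values:
        # Grow the counter on a match, reset it to zero on a mismatch.
        current = current + 1 if v == target else 0
        best = max(best, current)
    return best


print(longest_run([0, 1, 1, 1, 5, 1, 1, 1, 1, 1, 2], 1))  # 5
print(longest_run([5, 5, 10, 300, 5, 77, 5], 5))           # 2
```

In the SQL version the reset is achieved by adding -index, which is always large enough to push the non-negative cumulative sum back to 0.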

Related

on restoring the original order of row elements

Consider the numpy array p shown below. Each row uses the unique values 0 to 9. The distinguishing characteristic is that every row is composed of 5 (in this case) values PAIRED with 5 other values. Within a row, pairs are formed when p[p[k]] = k for k = 0 to 9.
p = np.array([[1, 0, 3, 2, 5, 4, 7, 6, 9, 8],
...
[6, 5, 3, 2, 9, 1, 0, 8, 7, 4],
...
[9, 8, 5, 7, 6, 2, 4, 3, 1, 0]])
Examine, for example, the row:
[6, 5, 3, 2, 9, 1, 0, 8, 7, 4]
This row pairs values 6 and 0 because p[6] = 0 and p[0] = 6. Other pairs are values (5, 1), (3, 2), (9, 4), (8, 7). Different rows may have different arrangements of pairs.
Now, we are interested here in the 1st value of each pair (i.e. 6, 5, 3, 9, 8) and the 2nd value of each pair (i.e. 0, 1, 2, 4, 7).
I'm not sure this is the best way to proceed, but I've separated the 1st pair values from the 2nd pair values this way:
import numpy as np
p = np.array([6, 5, 3, 2, 9, 1, 0, 8, 7, 4])
p1 = np.where(p[p] < p) # indices of 1st member of pairs
p2 = (p[p1]) # indices of 2nd member of pairs
qi = np.hstack((p1, p2.reshape(1,5)))
qv = p[qi]
#output: qi = [0, 1, 2, 4, 7, 6, 5, 3, 9, 8] #indices of 1st of pair values, followed by 2nd of pair values
# qv = [6, 5, 3, 9, 8, 0, 1, 2, 4, 7] #corresponding values
Finally consider another 1D array: c = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1].
I find c*qv, giving:
out1 = [6, 5, 3, 9, 8, 0, -1, -2, -4, -7]
QUESTION: out1 holds the correct values, but I need them to be in the original order (as found in p). How can this be achieved?
I need to get:
out2 = [6, 5, 3, -2, 9, -1, 0, 8, -7, -4]

You can reuse p1 and p2, which hold the original position information.
out2 = np.zeros_like(out1)
out2[p1] = out1[:5]
out2[p2] = out1[5:]
print(out2)
# [ 6 5 3 -2 9 -1 0 8 -7 -4]
You can also use qi to the same effect, which is even neater:
out2 = np.zeros_like(out1)
out2[qi] = out1
Or using np.put in case you don't want to create out2:
np.put(out1, qi, out1)
print(out1)
# [ 6 5 3 -2 9 -1 0 8 -7 -4]
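For reference, here is a self-contained sketch of the 1D case that keeps every array one-dimensional. It swaps np.where and np.hstack for np.flatnonzero and np.concatenate (a substitution made here for illustration, not part of the original code), and it relies on the assumption that p is a pairing, so that p[p[k]] = k.

```python
import numpy as np

p = np.array([6, 5, 3, 2, 9, 1, 0, 8, 7, 4])
c = np.array([1, 1, 1, 1, 1, -1, -1, -1, -1, -1])

# Positions holding the first member of each pair: index smaller than value.
p1 = np.flatnonzero(np.arange(p.size) < p)
# The values found there are the positions of the second members.
p2 = p[p1]

qi = np.concatenate([p1, p2])  # positions: first members, then second members
out1 = c * p[qi]               # signed values in "grouped" order

# Scatter the grouped values back to their original positions.
out2 = np.empty_like(out1)
out2[qi] = out1
print(out2)  # [ 6  5  3 -2  9 -1  0  8 -7 -4]
```

Because qi is a permutation of 0..9, every slot of out2 is written exactly once, so np.empty_like is safe here.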
2D Case
For the 2D version of the problem, we use a similar idea, with a few indexing tricks.
import numpy as np
p = np.array([[1, 0, 3, 2, 5, 4, 7, 6, 9, 8],
[6, 5, 3, 2, 9, 1, 0, 8, 7, 4],
[9, 8, 5, 7, 6, 2, 4, 3, 1, 0]])
c = np.array([1, 1, 1, 1, 1, -1, -1, -1, -1, -1])
p0 = np.arange(10) # this is equivalent to p[p] in 1D
p1_r, p1_c = np.where(p0 < p) # save both row and column indices
p2 = p[p1_r, p1_c]
# We will maintain row and column indices, not just qi
qi_r = np.hstack([p1_r.reshape(-1, 5), p1_r.reshape(-1, 5)]).ravel()
qi_c = np.hstack([p1_c.reshape(-1, 5), p2.reshape(-1, 5)]).ravel()
qv = p[qi_r, qi_c].reshape(-1, 10)
out1 = qv * c
# Use qi_r and qi_c to restore the position
out2 = np.zeros_like(out1)
out2[qi_r, qi_c] = out1.ravel()
print(out2)
# [[ 1 0 3 -2 5 -4 7 -6 9 -8]
# [ 6 5 3 -2 9 -1 0 8 -7 -4]
# [ 9 8 5 7 6 -2 -4 -3 -1 0]]
Feel free to print out each intermediate variable; it will help you understand what's going on.

Finding indices of unique rows and columns in a 2D array and the minimum sum of the elements in those positions

I have stumbled upon a problem where I am given a 5x5 matrix in the form of a 2D array. I am supposed to find the minimum sum of 5 elements, where each element must be in a unique row and column, and print the indices of those elements along with the minimum sum.
The problem gave 3 test cases as examples.
Test case 1:
{
{5, 4, 4, 1, 6},
{1, 3, 2, 4, 6},
{3, 2, 3, 2, 6},
{0, 4, 5, 4, 6},
{6, 6, 6, 6, 6}
};
Output: (3,0) (2,1) (1,2) (0,3) (4,4)
Minimum sum: 11
Test case 2:
{
{0, 0, 0, 0, 0},
{0, 0, 0, 0, 0},
{0, 0, 0, 0, 0},
{0, 0, 0, 0, 0},
{0, 0, 0, 0, 0}
};
Output: (0,0) (1,1) (2,2) (3,3) (4,4)
Minimum sum: 0
Test case 3:
{
{1, 2, 3, 4, 5},
{5, 4, 3, 2, 1},
{1, 2, 7, 4, 5},
{5, 4, 3, 2, 1},
{1, 2, 3, 4, 5},
};
Output: (0,0) (2,1) (4,2) (1,3) (3,4)
Minimum sum: 9
I would like to understand what is meant by "unique row and column". All I can see from the test cases is that the column indices start from 0 and increase by one for each pair of indices. I would like to know how to approach this problem.

Unique row and column means that no two elements share a row or a column.
Here, I've highlighted the selected numbers (shown in brackets).
You can see that when a number is selected, no other number in that same column is selected, and no other number in that same row is selected.
 5   4   4  [1]  6
 1   3  [2]  4   6
 3  [2]  3   2   6
[0]  4   5   4   6
 6   6   6   6  [6]
1 + 2 + 2 + 0 + 6 = 11
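One straightforward way to approach it, sketched below under the assumption that the matrix stays as small as 5x5, is to try every assignment of columns to rows (5! = 120 permutations) and keep the cheapest one. The function name min_unique_sum is hypothetical. When several selections share the minimum sum, the indices it returns may differ from the ones in the test cases, although the sum will be the same.

```python
from itertools import permutations


def min_unique_sum(matrix):
    """Brute force: pick one element per row, all in different columns."""
    n = len(matrix)
    best_sum = None
    best_cols = None
    # cols[r] is the column chosen for row r; a permutation guarantees
    # that no two rows share a column.
    for cols in permutations(range(n)):
        total = sum(matrix[r][cols[r]] for r in range(n))
        if best_sum is None or total < best_sum:
            best_sum, best_cols = total, cols
    return best_sum, [(r, c) for r, c in enumerate(best_cols)]


case1 = [
    [5, 4, 4, 1, 6],
    [1, 3, 2, 4, 6],
    [3, 2, 3, 2, 6],
    [0, 4, 5, 4, 6],
    [6, 6, 6, 6, 6],
]
total, cells = min_unique_sum(case1)
print(total)  # 11
print(cells)  # (row, column) pairs of one optimal selection
```

For larger matrices the factorial growth makes this impractical; this is the classic assignment problem, for which polynomial-time methods such as the Hungarian algorithm exist.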

Counting inversions in a changing array

You have an array A[] of size N (1 ≤ N ≤ 10^5). For each i = 0, 1, 2, ..., N - 1, we want to determine the number of inversions in the array if all entries greater than i are decreased to i.
An inversion is defined as two entries A[i] and A[j] where A[i] > A[j] and i < j.
Example:
A[] = {3, 2, 1, 5, 2, 0, 5}
i = 0: {0, 0, 0, 0, 0, 0, 0} Inversions: 0
i = 1: {1, 1, 1, 1, 1, 0, 1} Inversions: 5
i = 2: {2, 2, 1, 2, 2, 0, 2} Inversions: 7
i = 3: {3, 2, 1, 3, 2, 0, 3} Inversions: 10
i = 4: {3, 2, 1, 4, 2, 0, 4} Inversions: 10
i = 5: {3, 2, 1, 5, 2, 0, 5} Inversions: 10
i = 6: {3, 2, 1, 5, 2, 0, 5} Inversions: 10
So your output would be:
0
5
7
10
10
10
10
I know how to find the number of inversions in an array through MergeSort in O(NlogN). However, if I was to explicitly generate every array for each value of i, it would be an O(N^2logN) algorithm which wouldn't pass in time.
One observation I made was that the inversions increase as i increases. This makes sense because when all entries are 0, there will be no inversions (as it is sorted), but as you keep increasing the maximum entry value, the entry can become larger than entries that previously were of the same value.
So you could start with an A[] with only 0s, and keep increasing i. You can use your answer for previous values of i to determine the answer for larger values of i. Still, if you scanned through each array you would still get an O(N^2) algorithm.
How can I solve this problem?

I'll take a stab at this. We're going to consider queries in descending order, so from i = N-1, ..., down to 0. First of all, notice that when we're shrinking all A[j] > i to i, any A[j] = i will no longer cause an inversion with elements larger than it of smaller index.
For example, say we have A = [1, 2, 5, 4] and we shrink A[2] to 4. Then we have A = [1, 2, 4, 4] and our single inversion disappears. Thus, for each j, we can count the number of elements in A with smaller index and larger value, and denote this V[j], the "number of inversions it contributes". We find the total number of inversions in the original array, and then for each i = N-1,...,0 we remove V[j] from the total number of inversions for all j such that A[j] = i.
Let's apply this to the example given.
A = [3, 2, 1, 5, 2, 0, 5]
V = [0, 1, 2, 0, 2, 5, 0]
Then, going through i = 6, 5, 4, 3, 2, 1, 0:
i = 6: A = [3, 2, 1, 5, 2, 0, 5], res = 10 (original calculation using merge sort)
i = 5: A = [3, 2, 1, 5, 2, 0, 5], res = 10 (subtract nothing because V[3] = V[6] = 0)
i = 4: A = [3, 2, 1, 4, 2, 0, 4], res = 10 (subtract nothing because no occurrences of 4)
i = 3: A = [3, 2, 1, 3, 2, 0, 3], res = 10 (10 - V[0] = 10)
i = 2: A = [2, 2, 1, 2, 2, 0, 2], res = 7 (10 - V[1] - V[4] = 10 - 1 - 2 = 7)
i = 1: A = [1, 1, 1, 1, 1, 0, 1], res = 5 (7 - V[2] = 7 - 2 = 5)
i = 0: A = [0, 0, 0, 0, 0, 0, 0], res = 0 (5 - V[5] = 5 - 5 = 0)
And we get our desired outputs. Implementation details can vary; you can find the number of elements greater than A[j] with lower index using a Fenwick Tree or something similar. This algorithm runs in O(NlogN) time.
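A minimal sketch of this approach with a Fenwick tree follows. It assumes the entries of A are non-negative integers, and the function name inversions_per_cap is hypothetical. Instead of subtracting from the total in descending order, it accumulates in ascending order, which is equivalent: the answer for i is the sum of V[j] over all j with A[j] < i.

```python
def inversions_per_cap(a):
    """For each i in 0..N-1, count inversions after capping entries at i."""
    n = len(a)
    size = max(a, default=0) + 1   # number of distinct possible values
    tree = [0] * (size + 1)        # 1-indexed Fenwick tree of value counts

    def add(value):
        idx = value + 1
        while idx <= size:
            tree[idx] += 1
            idx += idx & -idx

    def count_le(value):
        # How many already-inserted entries are <= value.
        idx, total = value + 1, 0
        while idx > 0:
            total += tree[idx]
            idx -= idx & -idx
        return total

    # contrib[v] = sum of V[j] over all j with A[j] == v
    contrib = [0] * size
    for seen, value in enumerate(a):
        contrib[value] += seen - count_le(value)  # earlier entries > value
        add(value)

    result, running = [], 0
    for i in range(n):
        result.append(running)     # sum of V[j] for A[j] < i
        if i < size:
            running += contrib[i]
    return result


print(inversions_per_cap([3, 2, 1, 5, 2, 0, 5]))
# [0, 5, 7, 10, 10, 10, 10]
```

Each insertion and query costs O(log M), where M is the largest value, so the whole computation is O(N log M).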

For each element in an array, if the element is less than its previous element, increase it till the previous element with one

Suppose I have an array: list1 = [8, 5, 3, 1, 1, 10, 15, 9]
Now, if an element is less than its previous element, it should be increased to the previous element plus one.
Here:
5 < 8 so 5 should become: 5 + 3 + 1 = 9 i.e (8+1)
3 < 5 so 3 should become: 3 + 2 + 1 = 6 i.e (5+1)
1 < 3 so 1 should become: 1 + 2 + 1 = 4 i.e (3+1)
Now I am able to get the difference between elements if its less than its previous element.
But, how to use it in a final list to get an output like this:
finallist = [8, 9, 6, 4, 1, 10, 15, 16]
Also how can I get a final list value of 'k' list in my code? Right now it shows:
[2]
[2, 4]
[2, 4, 3]
[2, 4, 3, 3]
[2, 4, 3, 3, 7]
Source code:
list1 = [8, 5, 3, 1, 1, 10, 15, 9]
k = []
def comput(x):
    if i[x] < i[x-1]:
        num = (i[x-1] - i[x]) + 1
        k.append(num)
        print(k)
    return
for i in [list1]:
    for j in range(len(list1)):
        comput(j)

You can use a list comprehension for this. The following code compares each element with the one before it in the original list. If the element is smaller, it is replaced by the previous element plus one; otherwise it is kept as is.
list1 = [8, 5, 3, 1, 1, 10, 15, 9]
k = [list1[0]] + [i if j<=i else j+1 for i,j in zip(list1[1:],list1[:-1])]
cost = [j-i for i,j in zip(list1,k)]
print(k)
print(cost)
Output:
[8, 9, 6, 4, 1, 10, 15, 16]
[0, 4, 3, 3, 0, 0, 0, 7]

The following code creates a new list with the required output, along with the cost (difference) list:
l1 = [8, 5, 3, 1, 1, 10, 15, 9]
l = [l1[0]]
c = [0]  # cost / difference list
for i in range(len(l1)-1):
    if l1[i+1] < l1[i]:
        l.append(l1[i]+1)
        c.append(l1[i]+1-l1[i+1])
    else:
        l.append(l1[i+1])
        c.append(0)
print(l)
print(c)
Output:
[8, 9, 6, 4, 1, 10, 15, 16]
[0, 4, 3, 3, 0, 0, 0, 7]

How do I sort data from a list and append the number of occurrences in the list

I am having trouble dissecting this data. I would like to find out how many 1s are in each list. After finding that number, I would like to append it to another list for later.
I seem to be getting the output:
--> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
When I really want:
--> [2, 2, 2, 2, 0, 2, 2, 1, 11, 0]
This is the code:
d = []
count = 0
b = [[1,3,6,2,7,3,9,2,7,1,7],
[1,5,8,3,0,3,6,2,7,2,1],
[1,5,2,6,8,6,2,5,1,8,9],
[5,2,5,2,1,8,1,5,2,4,6],
[5,7,2,7,3,7,3,7,3,9,2],
[1,5,8,3,0,3,6,2,7,2,1],
[5,2,5,2,1,8,1,5,2,4,6],
[3,6,1,5,7,8,4,3,6,3,3],
[1,1,1,1,1,1,1,1,1,1,1],
[3,4,5,6,8,5,7,5,7,3,7]]
for i in b:
    for x in b:
        if x == 1:
            count =+ 1
    d.append(count)
    count = 0
print(d)

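You are iterating over the wrong object in your second for loop. I believe you meant:
for x in i:
This is why you are getting 0s: each x is a whole sublist, which is never equal to 1.
Note also that count =+ 1 assigns +1 to count rather than incrementing it; it should be count += 1.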
There is a Counter class in the standard collections module, so you can simplify this:
>>> from collections import Counter
>>> [Counter(i)[1] for i in b]
[2, 2, 2, 2, 0, 2, 2, 1, 11, 0]
You can also do this without the Counter class a bit more verbosely:
>>> [sum(1 for x in i if x == 1) for i in b]
[2, 2, 2, 2, 0, 2, 2, 1, 11, 0]
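For comparison, here is a sketch of the explicit loop with the inner loop iterating over each sublist and the increment written as count += 1. The function name count_ones is hypothetical, and only a few rows of b are used to keep the example short.

```python
def count_ones(rows):
    """Return, for each sublist, how many of its elements equal 1."""
    counts = []
    for row in rows:
        count = 0
        for x in row:        # iterate over the sublist, not the outer list
            if x == 1:
                count += 1   # increment (not `count =+ 1`, which assigns +1)
        counts.append(count)
    return counts


sample = [
    [1, 3, 6, 2, 7, 3, 9, 2, 7, 1, 7],
    [5, 7, 2, 7, 3, 7, 3, 7, 3, 9, 2],
    [3, 6, 1, 5, 7, 8, 4, 3, 6, 3, 3],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
print(count_ones(sample))  # [2, 0, 1, 11]
```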
