Regarding positions of elements within a row's 'islands of positive values' - arrays

Consider a specified row in a numpy integer array. My task has 3 parts:
a) Identify the locations of any positive islands (i.e., runs of consecutive positive values) in the specified row.
b) Identify the lengths of each of these positive islands.
c) Determine (True or False) if the island element that is closest in value to the row index is in the FIRST or LAST island position.
The following code, I believe, answers parts a) and b).
import numpy as np
arr = np.array([[-1, -4, -2, -8, 8, -3, -5, -6, 7],
[-4, -9, -1, 3, 8, -7, -6, 2, -5],
[ 4, 6, 9, 3, -1, -2, 5, 4, 8],
[ 5, -1, 2, 5, 6, 7, -3, -4, 1]])
row_idx = 2
arr_row = arr[row_idx]
mask = arr_row > 0
changes = np.concatenate(([mask[0]], mask[:-1] != mask[1:], [mask[-1]]))
isl_idx = np.where(changes)[0] # 1st index of islands
pos_idx = isl_idx[::2] # 1st index of POSITIVE islands
print('pos_idx = ', pos_idx)
pos_len = np.diff(isl_idx)[::2] # length of POSITIVE islands
print('pos_len = ', pos_len)
print()
When row_idx = 2, for example, we have output:
pos_idx = [0, 6] # first indices of the two positive islands
pos_len = [4, 3] # lengths of the two positive islands
My problem is that I can't find a good way to tackle part c). The desired output, for the example above, would look like:
firstLast = [True, False]
Explanation: We are in row_idx = 2, so:
The value in the 1st positive island closest to 2 is 3, and this 3 is indeed in the FIRST or LAST position of its island. (True)
The value in the 2nd positive island closest to 2 is 4, which is not in the FIRST or LAST position of its island. (False)
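One possible way to approach part c), reusing the boundary indices already computed for parts a) and b), is sketched below. The function name `first_last_flags` is made up for illustration, and the tie-breaking rule (the leftmost of several equally close elements wins) is an assumption, since the question does not say what should happen on a tie.

```python
import numpy as np

arr = np.array([[-1, -4, -2, -8,  8, -3, -5, -6,  7],
                [-4, -9, -1,  3,  8, -7, -6,  2, -5],
                [ 4,  6,  9,  3, -1, -2,  5,  4,  8],
                [ 5, -1,  2,  5,  6,  7, -3, -4,  1]])

def first_last_flags(arr, row_idx):
    """For each positive island in the row, report whether the element
    closest in value to row_idx sits at either end of that island."""
    row = arr[row_idx]
    mask = row > 0
    # Same boundary trick as in the question: the indices come in
    # (start, stop) pairs, with stop being exclusive.
    changes = np.concatenate(([mask[0]], mask[:-1] != mask[1:], [mask[-1]]))
    bounds = np.where(changes)[0].reshape(-1, 2)
    flags = []
    for start, stop in bounds:
        island = row[start:stop]
        # Assumption: on a tie, argmin picks the leftmost closest element.
        k = int(np.argmin(np.abs(island - row_idx)))
        flags.append(k == 0 or k == len(island) - 1)
    return flags

print(first_last_flags(arr, 2))  # [True, False]
```

The loop runs once per island rather than once per element, so it should stay cheap unless a row contains a very large number of islands.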

Related

Eliminating array rows based on a property of consecutive pairs of elements

We are given an array sample a, shown below, and a constant c.
import numpy as np
a = np.array([[1, 3, 1, 11, 9, 14],
[2, 12, 1, 10, 7, 6],
[6, 7, 2, 14, 2, 15],
[14, 8, 1, 3, -7, 2],
[0, -3, 0, 3, -3, 0],
[2, 2, 3, 3, 12, 13],
[3, 14, 4, 12, 1, 4],
[0, 13, 13, 4, 0, 3]])
c = 2
It is convenient, in this problem, to think of each array row as being composed of three pairs, so the 1st row is [1,3, 1,11, 9,14].
DEFINITION: d_min is the minimum difference between the elements of two consecutive pairs.
The PROBLEM: I want to retain rows of array a, where all consecutive pairs have d_min <= c. Otherwise, the rows should be eliminated.
In the 1st array row, the 1st pair (1,3) and the 2nd pair (1,11) have d_min = 1-1=0.
The 2nd pair (1,11) and the 3rd pair(9,14) have d_min = 11-9=2. (in both cases, d_min<=c, so we keep this row in a)
In the 2nd array row, the 1st pair (2,12) and the 2nd pair (1,10) have d_min = 2-1=1.
But, the 2nd pair (1,10) and the 3rd pair(7,6) have d_min = 10-7=3. (3 > c, so this row should be eliminated from array a)
Current efforts: I currently handle this problem with nested for-loops (2 deep).
The outer loop runs through the rows of array a, determining d_min between the first two pairs using:
for r in a:
    d_min = np.amin(np.abs(np.subtract.outer(r[:2], r[2:4])))
The inner loop uses the same method to determine the d_min between the last two pairs.
Further processing is done only when d_min <= c for both sets of consecutive pairs.
I'm really hoping there is a way to avoid the for-loops. I eventually need to deal with 8-column arrays, and my current approach would involve 3-deep looping.
In the example, there are 4 row eliminations. The final result should look like:
a = np.array([[1, 3, 1, 11, 9, 14],
[0, -3, 0, 3, -3, 0],
[3, 14, 4, 12, 1, 4],
[0, 13, 13, 4, 0, 3]])
Assume the number of elements in each row is always even:
import numpy as np
a = np.array([[1, 3, 1, 11, 9, 14],
[2, 12, 1, 10, 7, 6],
[6, 7, 2, 14, 2, 15],
[14, 8, 1, 3, -7, 2],
[0, -3, 0, 3, -3, 0],
[2, 2, 3, 3, 12, 13],
[3, 14, 4, 12, 1, 4],
[0, 13, 13, 4, 0, 3]])
c = 2
# separate the array as previous pairs and next pairs
sx, sy = a.shape
prev_shape = sx, (sy - 2) // 2, 1, 2
next_shape = sx, (sy - 2) // 2, 2, 1
prev_pairs = a[:, :-2].reshape(prev_shape)
next_pairs = a[:, 2:].reshape(next_shape)
# subtract which will effectively work as outer subtraction due to numpy broadcasting, and
# calculate the minimum difference for each pair
pair_diff_min = np.abs(prev_pairs - next_pairs).min(axis=(2, 3))
# calculate the filter condition as boolean array
to_keep = pair_diff_min.max(axis=1) <= c
print(a[to_keep])
#[[ 1 3 1 11 9 14]
# [ 0 -3 0 3 -3 0]
# [ 3 14 4 12 1 4]
# [ 0 13 13 4 0 3]]
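As a sanity check for a broadcasting-based filter like this one, it can help to compare it against a plain-loop reference that follows the d_min definition literally. The sketch below is only such a reference, not a replacement; the name `keep_rows_loop` is hypothetical, and it assumes every row has an even number of elements.

```python
import numpy as np

a = np.array([[1, 3, 1, 11, 9, 14],
              [2, 12, 1, 10, 7, 6],
              [6, 7, 2, 14, 2, 15],
              [14, 8, 1, 3, -7, 2],
              [0, -3, 0, 3, -3, 0],
              [2, 2, 3, 3, 12, 13],
              [3, 14, 4, 12, 1, 4],
              [0, 13, 13, 4, 0, 3]])
c = 2

def keep_rows_loop(a, c):
    """Boolean mask of rows where every pair of consecutive pairs has d_min <= c."""
    keep = []
    for r in a:
        pairs = r.reshape(-1, 2)  # e.g. [[1, 3], [1, 11], [9, 14]]
        ok = all(
            np.abs(np.subtract.outer(p, q)).min() <= c
            for p, q in zip(pairs[:-1], pairs[1:])
        )
        keep.append(ok)
    return np.array(keep)

print(np.flatnonzero(keep_rows_loop(a, c)))  # [0 4 6 7]
```

Because the pairs come from `reshape(-1, 2)`, the same reference works unchanged for 8-column arrays.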

Permutations with predicate Scala

I am trying to solve a combinations task in Scala. I have an array with repeated elements and I have to count the number of combinations which satisfy the condition a+b+c = 0. Numbers should not be repeated; if the same numbers appear in different places, it doesn't count as a distinct combination.
So I turned my array into a Set, so that the elements would not repeat. I have also found out about the combinations method for sequences, but I am not really sure how to use it in this case. I also do not know where to put this condition.
Here is what I have for now:
var arr = Array(-1, -1, -2, -2, 1, -5, 1, 0, 1, 14, -8, 4, 5, -11, 13, 5, 7, -10, -4, 3, -6, 8, 6, 2, -9, -1, -4, 0)
val arrSet = Set(arr)
arrSet.toSeq.combinations(n)
I am new to Scala, so I would be really grateful for any advice!
Here's what you need:
arr.distinct.combinations(3).filter(_.sum == 0).size
where:
distinct removes the duplicates
combinations(n) produces combinations of n elements
filter filters them by keeping only those whose sum is 0
size returns the total number of such combinations
P.S.: arr doesn't need to be a var. You should strive to never use var in Scala and stick to val as long as it's possible.

Python Array append vectors and then sum the elements of the array positionwise (not elementwise)

First of all, let me explain what I would like to do. I have a function which gives me some lists. These lists have the same number of elements and they contain numbers, which represent positions on the x-axis. For example one of them is [-11, -6, -5, -4, -1, 1, 3, 4, 6, 7], another one is [-11, -6, -5, -3, -1, 1, 2, 4, 5, 7]. The entries will always be integers and in ascending order.
I want to run this function many times and at the end "sum up" all these vectors in a particular way. Imagine that each vector shows the position of a person on the x-axis. I want to know, at the end of say q experiments, how many people there are in each position. However, they do not all start from -11 or end at 7.
For example [-13, -8, -3, -1, 0, 1, 2, 4, 5, 7] or [-12, -7, -2, -1, 0, 1, 3, 4, 5, 6] are two other valid outputs from the function.
How can I do that?
My idea was to create a loop, compute the function, and store these lists into an array and then use some weird matrix operation. However I am absolutely stuck, this is my attempt, where rep_assign_time2(n,p,m) is the function that gives me the lists:
def many_experiments(n,p,m,q):
    jj = 0
    vector_min = []
    vector_max = []
    a = np.array([])
    while jj < q:
        s = rep_assign_time2(n,p,m)
        a = np.concatenate((a,s), axis = 0) # I add s as an element of a
    for k in range(a.shape):
        ma = max(a[k])
        mi = min(a[k])
        vector_min.append(mi)
        vector_max.append(ma)
    minimum = min(vector_min)
    maximum = max(vector_max)
And then I have NO IDEA on how to create an operation that does what I want. I've been thinking for an hour and still no clue. Do you have any idea?
You are in luck with NumPy: np.unique is a built-in for exactly this. It returns both the unique labels (x-axis positions in this case) and the count for each label. So, if the lists are stored as a list of lists A, you could simply do -
unq,counts = np.unique(A,return_counts=True)
Sample run -
In [33]: A = [[-11, -6, -5, -4, -1, 1, 3, 4, 6, 7], \
...: [-11, -6, -5, -3, -1, 1, 2, 4, 5, 7],\
...: [-13, -8, -3, -1, 0, 1, 2, 4, 5, 7],\
...: [-12, -7, -2, -1, 0, 1, 3, 4, 5, 6]]
In [34]: unq,counts = np.unique(A,return_counts=True)
In [35]: unq
Out[35]:
array([-13, -12, -11, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1,
2, 3, 4, 5, 6, 7])
In [36]: counts
Out[36]: array([1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 4, 2, 4, 2, 2, 4, 3, 2, 3])
In [40]: import matplotlib.pyplot as plt
In [41]: # Plot the results
...: plt.bar(unq, counts, align='center')
...: plt.grid()
...: plt.show()
...:
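If the lists are produced one at a time inside a loop, a running tally is another option that avoids collecting them into one big array first. This is only a sketch using the standard library's `collections.Counter`; the names `tally_positions` and `runs` are made up here, and in practice each list would come from a call to rep_assign_time2(n, p, m).

```python
from collections import Counter

def tally_positions(runs):
    """Count how often each x-axis position occurs across all runs."""
    counts = Counter()
    for positions in runs:
        counts.update(positions)  # adds 1 for every position in this run
    # Sort by position so the result reads left to right along the x-axis.
    return dict(sorted(counts.items()))

# The four sample lists from the question stand in for q = 4 experiments.
runs = [
    [-11, -6, -5, -4, -1, 1, 3, 4, 6, 7],
    [-11, -6, -5, -3, -1, 1, 2, 4, 5, 7],
    [-13, -8, -3, -1, 0, 1, 2, 4, 5, 7],
    [-12, -7, -2, -1, 0, 1, 3, 4, 5, 6],
]
tally = tally_positions(runs)
print(tally[-1], tally[7])  # 4 3
```

Unlike the np.unique call, this does not require all lists to have the same length.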

Find All Numbers in Array which Sum upto Zero

Given an array, output the consecutive elements whose total sum is 0.
Eg:
For input [2, 3, -3, 4, -4, 5, 6, -6, -5, 10],
Output is [3, -3, 4, -4, 5, 6, -6, -5]
I just can't find an optimal solution.
Clarification 1: For any element in the output subarray, there should be a subset in the subarray which adds with the element to zero.
Eg: For -5, either one of subsets {[-2, -3], [-1, -4], [-5], ....} should be present in output subarray.
Clarification 2: Output subarray should be all consecutive elements.
Here is a python solution that runs in O(n³):
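import numpy

def conSumZero(input):
    take = [False] * len(input)
    for i in range(len(input)):
        # j is an exclusive end index, so it must be allowed to reach len(input)
        for j in range(i + 1, len(input) + 1):
            if sum(input[i:j]) == 0:
                for k in range(i, j):
                    take[k] = True
    # keep only the elements that belong to some zero-sum run
    return numpy.asarray(input)[numpy.asarray(take, dtype=bool)]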
EDIT: Now more efficient! (Not sure if it's quite O(n²); will update once I finish calculating the complexity.)
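import numpy

def conSumZero(input):
    take = [False] * len(input)
    # cs[k] holds sum(input[:k]), so cs[0] == 0
    cs = [0] + list(numpy.cumsum(input))
    for i in range(len(input)):
        # j is an exclusive end index, so it must be allowed to reach len(input)
        for j in range(i + 1, len(input) + 1):
            if cs[j] - cs[i] == 0:
                for k in range(i, j):
                    take[k] = True
    # keep only the elements that belong to some zero-sum run
    return numpy.asarray(input)[numpy.asarray(take, dtype=bool)]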
The difference here is that I precompute the partial sums of the sequence, and use them to calculate subsequence sums - since sum(a[i:j]) = sum(a[0:j]) - sum(a[0:i]) - rather than iterating each time.
Why not just hash the incremental sum totals and update their indexes as you traverse the array? The winner is the total with the largest index range. This takes O(n) time (assuming average hash table complexity).
[2, 3, -3, 4, -4, 5, 6, -6, -5, 10]
sum 0 2 5 2 6 2 7 13 7 2 12
The winner is 2, indexed 1 to 8!
To also guarantee an exact counterpart contiguous-subarray for each number in the output array, I don't yet see a way around checking/hashing all the sum subsequences in the candidate subarrays, which would raise the time complexity to O(n^2).
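A minimal sketch of the hashing idea, under the assumption that the goal is just the longest contiguous run summing to zero (it does not enforce the per-element condition from Clarification 1). The function name `longest_zero_sum_run` is hypothetical.

```python
def longest_zero_sum_run(values):
    """Return the longest contiguous slice of values whose sum is 0."""
    first_seen = {0: 0}          # running total -> earliest prefix length with that total
    best_start, best_stop = 0, 0
    total = 0
    for stop, v in enumerate(values, start=1):
        total += v
        if total in first_seen:
            # Same total seen before: everything in between sums to zero.
            start = first_seen[total]
            if stop - start > best_stop - best_start:
                best_start, best_stop = start, stop
        else:
            first_seen[total] = stop
    return values[best_start:best_stop]

print(longest_zero_sum_run([2, 3, -3, 4, -4, 5, 6, -6, -5, 10]))
# [3, -3, 4, -4, 5, 6, -6, -5]
```

Only the first occurrence of each running total is stored, because pairing a later occurrence with the earliest one always gives the widest range.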
Based on the example, I assumed that you wanted to find only the pairs of values that add up to 0. If you also want to include groups of more than two values that add up to 0 (like 5 + -2 + -3), then you would need to clarify your parameters a bit more.
The implementation is different based on language, but here is a javascript example that shows the algorithm, which you can implement in any language:
var inputArray = [2, 3, -3, 4, -4, 5, 6, -6, -5, 10];
var outputArray = [];
for (var i = 0; i < inputArray.length; i++) {
    var num1 = inputArray[i];
    for (var x = 0; x < inputArray.length; x++) {
        var num2 = inputArray[x];
        var sumVal = num1 + num2;
        if (sumVal == 0) {
            outputArray.push(num1);
            outputArray.push(num2);
        }
    }
}
Is this the problem you are trying to solve?
Given a sequence , find maximizing such that
If so, here is the algorithm for solving it:
let $U$ be a set of contiguous integers
for each contiguous $S = [i,j) \subseteq \Bbb Z^+_{\le n}$
    for each $T \in \wp\left([i,j)\right)$
        if $\sum_{k\in T}a_k = 0$
            if $\left|U\right| < \left|S\right|$
                $U \gets S$
return $U$
(Will update with full latex once I get the chance.)

Given an unsorted array, Find the maximum subtraction between two elements in the array

I got this question in an interview at Microsoft: Given an unsorted array, find the maximum subtraction between two elements in the array in a way that:
(Index1, Index2) = arr[Index2] - arr[Index1]. Index1<Index2.
Example:
given the array: [1, 5, 3, 2, 7, 9, 4, 3] -> Output: (1,9)=8.
given the array: [4, 9, 2, 3, 6, 3, 8, 1] -> Output: (2,8)=6.
The naive solution works in O(n^2) time: subtract the element at the first index from every later element and save the max value, then continue to the next index, and so on.
Is there any way to optimize this?
Fairly simple when you write it down. Rephrasing the problem, you want to find the largest element to the right of each element. Now given the first example, this is:
[1, 5, 3, 2, 7, 9, 4, 3]
=>
[9, 9, 9, 9, 9, 4, 3]
Now, notice the maximums array is just the cumulative maximums from the right. Given this property it is easy to construct an O(n) time algorithm.
Implementation in python:
def find_max(xs):
    ys = []
    cur_max = float('-inf')
    for x in reversed(xs):
        cur_max = max(x, cur_max)
        ys.append(cur_max)
    ys = ys[::-1][1:]
    return max(y - x for x, y in zip(xs, ys))
We can also construct the maximums array lazily, doing so gives us O(1) memory, which is the best possible:
def find_max(xs):
    cur_max = float('-inf')
    cum_max = xs[-1]
    for i in range(len(xs) - 2, -1, -1):
        cur_max = max(cur_max, cum_max - xs[i])
        cum_max = max(cum_max, xs[i])
    return cur_max
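The question's expected output also names the two elements, e.g. (1,9)=8. A small variation that scans from the left and tracks the smallest element seen so far can report them as well. This is a sketch rather than part of the answer above; the name `max_difference_pair` and the (difference, smaller, larger) return shape are assumptions made for illustration.

```python
def max_difference_pair(xs):
    """Return (difference, earlier_value, later_value) maximizing later - earlier."""
    if len(xs) < 2:
        raise ValueError("need at least two elements")
    lowest = xs[0]                          # smallest value seen so far
    best = (xs[1] - xs[0], xs[0], xs[1])
    for x in xs[1:]:
        if x - lowest > best[0]:
            best = (x - lowest, lowest, x)
        lowest = min(lowest, x)             # update only after using x as the later value
    return best

print(max_difference_pair([1, 5, 3, 2, 7, 9, 4, 3]))  # (8, 1, 9)
print(max_difference_pair([4, 9, 2, 3, 6, 3, 8, 1]))  # (6, 2, 8)
```

Like the lazy version above, this is a single pass with O(1) extra memory.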
I think this is correct and O(n log n): split in the middle, then select the max from the right part and the min from the left part. Also split at the other two quarter points; if one of those splits gives a bigger result, continue on that branch recursively.
Second example:
4, 9, 2, 3| 6, 3, 8, 1 -> 2 and 8
4, 9| 2, 3, 6, 3, 8, 1 -> 4 and 8
4, 9, 2, 3, 6, 3| 8, 1 -> 2 and 8
So working on the right split:
4, 9, 2, 3, 6, 3, 8| 1 -> 2 and 1
Selecting the 2 and 8 option. It also works for the first example.

Resources