Count subarrays with similarity number more than K - arrays

Similarity number for two arrays X and Y, each with size N, is defined as the number of pairs of indices (i,j) such that X[i]=Y[j], for 1<=i,j<=N.
Now we are given two arrays, of size N and M. We need to find the number of pairs of subarrays of equal size, one from each array, such that the similarity number of the pair is greater than or equal to a given number K.
Example, say we have N=3, M=3, K=1 and arrays be [1,3,4] and [1,5,3] then here answer is 6
Explanation :
({1},{1})
({3},{3})
({1,3},{1,5})
({1,3},{5,3})
({3,4},{5,3})
({1,3,4},{1,5,3})
so ans = 6
How to solve it for given arrays of size N,M and given integer K.
Number of elements can't be more than 2000. K is also less than N*M
Approach :
Form all subarrays from array 1 of size N, those will be N*(N+1)/2 And same for array 2 of size M. Then try to find similarity number between each subarray pair. But this is very unoptimised way of doing it.
What can be better way to solve this problem ? I think Dynamic programming can be used to solve this. Any suggestions ?
For {1,1,2} and {1,1,3} and K=1
{[1(1)],[1(1)]}
{[1(1)],[1(2)]}
{[1(2)],[1(1)]}
{[1(2)],[1(2)]}
{[1(1),1(2)],[1(1)]}
{[1(1),1(2)],[1(2)]}
{[1(1)],[1(1),1(2)]}
{[1(2)],[1(1),1(2)]}
{[1(1),1(2)],[1(1),1(2)]}
{[1(2),2],[1(2),3]}
{[1(1),1(2),2],[1(1),1(2),3]}

Since the contest is now over, just for the sake of completeness, here's my understanding of the editorial answer there (from which I learned a lot). Let's say we had an O(1) time method to calculate the similarity of two contiguous subarrays, one from each array, of length l. Then, for each pair of indexes, (i, j), we could binary search the smallest l (extending, say, to their left) that reaches similarity K. The binary search is valid because the similarity never decreases as l grows. (Once we have the smallest l, we know that any greater such l also has enough similarity and we can add those counts in O(1) time.) The total time in this case would be O(M * N * log(max(M, N))).
Well, it turns out there is a way to calculate the similarity of two contiguous subarrays in O(1): matrix prefix-sums. In a matrix, A, where each entry, A(i,j), is 1 if the first array's ith element equals the second array's jth element and 0 otherwise, the sum of the elements of A in the l-by-l square from A(i-l+1, j-l+1) to A(i,j) (top-left to bottom-right) is exactly that. And we can calculate that sum in O(1) time with matrix prefix-sums, given O(M*N) preprocessing.
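A minimal Python sketch of this idea follows. The function name count_similar_pairs and its helper are my own, not from the editorial, and pure Python will be slow near N = M = 2000, so treat it as an illustration of the technique rather than a contest-ready solution. It assumes K >= 1.

```python
def count_similar_pairs(x, y, k):
    """Count equal-length subarray pairs (one from x, one from y) with similarity >= k."""
    n, m = len(x), len(y)
    # prefix[i][j] = number of (p, q) with p < i, q < j and x[p] == y[q]
    prefix = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            prefix[i][j] = (prefix[i - 1][j] + prefix[i][j - 1]
                            - prefix[i - 1][j - 1] + int(x[i - 1] == y[j - 1]))

    def similarity(i, j, length):
        # Similarity of the two subarrays of the given length ending at x[i-1] and y[j-1]
        return (prefix[i][j] - prefix[i - length][j]
                - prefix[i][j - length] + prefix[i - length][j - length])

    total = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            longest = min(i, j)
            if similarity(i, j, longest) < k:
                continue  # even the longest pair ending here is not similar enough
            lo, hi = 1, longest
            while lo < hi:  # binary search the smallest sufficient length
                mid = (lo + hi) // 2
                if similarity(i, j, mid) >= k:
                    hi = mid
                else:
                    lo = mid + 1
            total += longest - lo + 1  # every length from lo to longest qualifies
    return total


print(count_similar_pairs([1, 3, 4], [1, 5, 3], 1))  # 6, as in the question's first example
```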

Min Increment/Decrement operation to make all subarray sum of length k equal

I am solving a problem where I have been given an array A of length N and an integer 0<K<N. We need to make sum of all subarrays(including circular) of length K equal in min operations. In one operation, we can either increment or decrement an element of array by 1.
I am unable to think of an algorithm to do this. For K=1, I can calculate the mean and then calculate the sum of absolute difference between mean and the array elements. But for larger K, can anybody give me a hint?
Hint: all window sums are equal exactly when a[i] = a[(i+K) mod N] for every i, because consecutive windows differ only by the element that leaves and the one that enters. When K divides N, the final array is therefore whole repetitions of its first K elements, like [1,2,3,1,2,3,1,2,3], due to the circular constraint.
If N and K have no common factor, then all elements should be equal, and they should all be changed to the median of the array. If N is even, taking the N/2 or N/2+1 smallest element gives the same cost.
In general, with g = gcd(N, K), you need to make a[0], a[g], ... equal, a[1], a[g+1], ... equal and so on (g = K when K divides N). Solve them independently by changing each group to its median.
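Here is a short Python sketch of the grouping-by-median idea. The name min_operations is hypothetical. It groups indices by their remainder modulo gcd(N, K), which is my generalization of the hint: it gives groups of stride K when K divides N, and a single group when N and K are coprime.

```python
from math import gcd


def min_operations(a, k):
    """Minimum +1/-1 steps so that every circular window of length k has the same sum."""
    n = len(a)
    g = gcd(n, k)  # elements whose indices agree modulo g must end up equal
    total = 0
    for r in range(g):
        group = sorted(a[r::g])
        median = group[len(group) // 2]  # either middle element works for even sizes
        total += sum(abs(v - median) for v in group)
    return total


print(min_operations([1, 5, 2, 6], 2))  # 2: groups [1, 2] and [5, 6] cost 1 each
```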

Find k missing elements in array of size n-k [duplicate]

Given an array of n unique integers in [1, n], take k random elements away from the array, then shuffle it, leaving an array of n-k integers.
I want to find those k integers in the best complexity.
If k==1, we can sum all the elements in the array, and the missing element would be the difference between n(n+1)/2 (sum of all numbers from 1 to n) and the sum of the array elements.
I would like to extend this algorithm to k missing elements by k equations, but don't know how to build the equations. Can it be done?
Let's assume a[1], a[2], a[3], ..., a[n] are the original unique integers and b[1], b[2], ..., b[n-k] are the integers left after k integers are removed.
Sort the arrays a and b.
For each adjacent pair (i, i+1) in b, do a binary search for b[i] and b[i+1] in array a and get their indices, let's say p and q.
If q != p + 1, then all the integers in array a strictly between positions p and q are among the k integers taken away.
The complexity should be O(n log n).
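A possible Python sketch of the sorting approach, under the question's assumption that the original array is exactly 1..n. In that case the gaps between consecutive sorted values give the missing numbers directly, so no binary search is needed. The function name and the sentinels 0 and n+1 are my additions; the sentinels catch numbers missing before the first and after the last remaining element.

```python
def missing_by_sorting(b, n):
    """Return the numbers of 1..n that are absent from b, in increasing order."""
    missing = []
    previous = 0  # sentinel just below the smallest possible value
    for value in sorted(b) + [n + 1]:  # n + 1 is a sentinel just above the largest
        missing.extend(range(previous + 1, value))  # everything skipped in this gap
        previous = value
    return missing


print(missing_by_sorting([5, 1, 3], 5))  # [2, 4]
```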
Yes it can. It is not pretty though.
You need to realize that in a [1, n] range
the sum of squares is n(n + 1)(2n + 1) / 6
the sum of cubes is n**2(n + 1)**2 / 4
etc
The devil is in the "etc". The general formula for summing kth powers is
Sum(i: [1..n]) i**k = 1/(k+1) Sum(j: [0..k]) (-1)**j binom(k+1, j) Bj n**(k-j+1)
where Bj are the Bernoulli numbers (with B1 = -1/2). It is a polynomial in n of degree k+1.
The Bernoulli numbers are notoriously hard to compute, and the resulting system of equations is not too pleasant to deal with.
Assuming that you overcame all the computational problems, the complexity will be O(nk).
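For a feel of how the equations work, here is a small Python sketch for the case of exactly two missing numbers, using only the sum and the sum of squares. The function name two_missing is hypothetical, and the sketch does not generalize to larger k on its own. With x + y = s and x**2 + y**2 = q, we get (x - y)**2 = 2q - s**2, which pins down both values.

```python
from math import isqrt


def two_missing(b, n):
    """Find the two numbers of 1..n absent from b using power sums."""
    s = n * (n + 1) // 2 - sum(b)                             # x + y
    q = n * (n + 1) * (2 * n + 1) // 6 - sum(v * v for v in b)  # x**2 + y**2
    diff = isqrt(2 * q - s * s)                               # |x - y|
    x = (s - diff) // 2
    return x, s - x


print(two_missing([1, 3, 5], 5))  # (2, 4)
```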

Max number in array algorithm, when array's size is power of 2

I've found this algorithm and I have been asked what it does.
Given an array A of integers whose size n is a power of 2,
what does the algorithm return?
1. k <- n
2. while k>1 do:
2.1. k <- k/2
2.2. for i <- 1 to k do:
2.2.1. if A[i] < A[i+k]
2.2.1.1 swap A[i] and A[i+k]
3. return A[1]
I'm almost sure this algorithm returns the largest number in the array.
My questions are:
How long does the algorithm take? I think O(n), but I'm not sure.
How can I prove it returns the largest number?
Thanks a lot!
How long does the algorithm take? I think O(n), but I'm not sure.
O(n) is correct. Note that on each iteration you are reducing the size of the array to be checked by half, so there will be exactly log(n) steps, since n is a power of two.
So the total number of comparisons is n/2 + n/4 + n/8 + ... + 1, which has exactly log(n) terms. This is a geometric series whose sum is n-1, which is O(n).
How can I prove it returns the largest number?
By induction on the iterations: after iteration i, the largest element of the array is among the first n/2^i elements, because each element in the discarded half has been compared with (and, if larger, swapped into) its partner in the kept half. So after log(n) steps the first element of the array is the largest number of the array.
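To check both claims on a concrete input, here is a Python transcription of the pseudocode (0-indexed; the name tournament_max and the comparison counter are my additions). It works on a copy so the caller's list is not reordered.

```python
def tournament_max(values):
    """Run the halving algorithm; return (largest element, number of comparisons)."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    comparisons = 0
    k = n
    while k > 1:
        k //= 2
        for i in range(k):
            comparisons += 1
            if a[i] < a[i + k]:
                a[i], a[i + k] = a[i + k], a[i]  # keep the larger one in the first half
    return a[0], comparisons


print(tournament_max([3, 1, 4, 1, 5, 9, 2, 6]))  # (9, 7): n - 1 comparisons for n = 8
```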

Maximal subset sum smaller than a given value

You are given an array of integers and a number k. The question is to find a subset such that the sum is maximal and smaller than given number k.
I feel like there is a dynamic programming approach to solve this but I am not sure how to solve this problem efficiently.
The simple dynamic programs for this class of problems all use the same basic trick: for each successive prefix of the array, compute the set of its subset sums, depending on the fact that, when the input elements are bounded, this set has many fewer than 2^n elements (for the problem in question, less than 10,000,000 instead of 2^1000). This particular problem, in Python:
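def maxsubsetsumlt(array, k):
    # sums holds every subset sum below k reachable from the elements seen so far
    sums = {0}
    for elem in array:
        # the set comprehension is fully built before update() mutates sums
        sums.update({sum_ + elem for sum_ in sums if sum_ + elem < k})
    return max(sums)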
This could be done using the Subset Sum algorithm with a small adaptation. Instead of returning the boolean value from the last (bottom right) cell, you scan the last row from the largest sum below k downward and stop at the first cell with a value of true, which indicates that there is a combination of elements that sums up to this particular k_i < k. This k_i is your maximal sum smaller than k. This algorithm has a worst case time and space complexity of O(nk).
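A compact Python sketch of this adapted Subset Sum table, assuming non-negative integers and k > 0. The name max_subset_sum_below is hypothetical. It keeps only one row of the table, which reduces the space to O(k) while the time stays O(nk).

```python
def max_subset_sum_below(values, k):
    """Largest subset sum strictly below k (0 for the empty subset); None if k <= 0."""
    if k <= 0:
        return None
    reachable = [False] * k  # reachable[s]: some subset seen so far sums to s
    reachable[0] = True
    for v in values:
        if v < 0:
            raise ValueError("this sketch assumes non-negative values")
        # Go downward so each element is used at most once
        for s in range(k - 1, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    for s in range(k - 1, -1, -1):  # first true cell from the top is the answer
        if reachable[s]:
            return s


print(max_subset_sum_below([3, 34, 4, 12, 5, 2], 10))  # 9 (3 + 4 + 2)
```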

Algorithm - Find the center index

Given an array of ints with size t, one needs to find the center index. The center index x is the index where the sum of ints (0 to x-1) is equal to sum (x+1 to t-1).
The best algorithm I could come up with is O(n).
I would have a temp array with the sums of all ints before (not including the one at index x) : so at index 1 it would be 1, at 2 it would be a sum of 2 and 1 and so on.
Another int would be the sum of all ints.
I would loop twice through the array, the first make the temp array, and the other to find if both parts are equal.
Is there a better algorithm O(logn)?
Since you have to calculate the sums of both halves of the array, this can't be solved in less than O(n): you have to inspect each element at least once to calculate the sums. An algorithm can be O(log n) only if it can skip inspecting certain elements of the array based on some condition, which is not possible here.
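The O(n) bound can still be met without the temporary array. Below is a minimal Python sketch (the name center_index and the -1 return value for "no such index" are my own choices) that keeps a running left sum and derives the right sum from the total.

```python
def center_index(values):
    """Return the first index whose left and right sums are equal, or -1 if none."""
    total = sum(values)
    left = 0  # sum of values[0 .. x-1]
    for x, v in enumerate(values):
        if left == total - left - v:  # right sum is total minus left minus values[x]
            return x
        left += v
    return -1


print(center_index([1, 2, 3, 4, 6]))  # 3, since 1 + 2 + 3 == 6
```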
