How to find Longest non-decreasing Subsequence containing duplicates in O(n) or O(nlogn)?

We know about an algorithm that will find the Longest Increasing subsequence in O(nlogn). I was wondering whether we can find the Longest non-decreasing subsequence with similar time complexity?
For example, consider an array : (4,10,4,8,9).
The longest increasing subsequence is (4,8,9).
And a longest non-decreasing subsequence would be (4,4,8,9).

First, here’s a “black box” approach that will let you find the longest nondecreasing subsequence using an off-the-shelf solver for longest increasing subsequences. Let’s take your sample array:
4, 10, 4, 8, 9
Now, imagine we transformed this array as follows by adding a tiny fraction to each number:
4.0, 10.1, 4.2, 8.3, 9.4
Changing the numbers this way will not change the result of any comparison between two different integers, since distinct integers differ by at least 1 while the added fractions are all smaller than 1. However, if you compare the two 4s now, the later 4 compares bigger than the earlier one. If you now find the longest increasing subsequence, you get back [4.0, 4.2, 8.3, 9.4], which you can then map back to [4, 4, 8, 9].
More generally, if you’re working with an array of n integer values, you can add i / n to each of the numbers, where i is its 0-based index, and you’ll be left with a sequence of distinct numbers. From there, running a regular LIS algorithm will do the trick.
If you can’t work with fractions this way, you could alternatively multiply each number by n and then add in i, which also works.
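A minimal sketch of the integer version of this trick, assuming the input is a list of Python integers. The names lis_length and longest_nondecreasing_length are made up for illustration, and lis_length stands in for whatever off-the-shelf strict LIS solver you already have:

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of the longest strictly increasing subsequence."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k + 1
    for x in seq:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)

def longest_nondecreasing_length(values):
    """Make every key distinct with value * n + index, then run the strict solver."""
    n = len(values)
    keys = [v * n + i for i, v in enumerate(values)]
    return lis_length(keys)

print(longest_nondecreasing_length([4, 10, 4, 8, 9]))
```

Because i is always smaller than n, two different integers keep their relative order, and equal integers are ordered by position.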
On the other hand, suppose you have the code for a solver for LIS and want to convert it to one that solves the longest nondecreasing subsequence problem. The reasoning above shows that if you treat later copies of numbers as being “larger” than earlier copies, then you can just use a regular LIS. Given that, just read over the code for LIS and find spots where comparisons are made. When a comparison is made between two equal values, break the tie by considering the later appearance to be bigger than the earlier one.
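As a sketch of that idea: in the common binary-search formulation of LIS, the tie-break amounts to searching with bisect_right instead of bisect_left, so that an equal value extends a subsequence instead of replacing its tail. The function name and the prev/tails_idx bookkeeping below are assumptions for illustration, not part of any particular solver:

```python
from bisect import bisect_right

def longest_nondecreasing_subsequence(values):
    """Return one longest non-decreasing subsequence of values."""
    tails = []      # tails[k] = smallest tail value of a subsequence of length k + 1
    tails_idx = []  # position in values of that tail
    prev = [-1] * len(values)  # predecessor links used to rebuild the answer
    for i, x in enumerate(values):
        # bisect_right treats an equal earlier value as "smaller", so ties extend.
        pos = bisect_right(tails, x)
        if pos > 0:
            prev[i] = tails_idx[pos - 1]
        if pos == len(tails):
            tails.append(x)
            tails_idx.append(i)
        else:
            tails[pos] = x
            tails_idx[pos] = i
    result = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        result.append(values[i])
        i = prev[i]
    return result[::-1]

print(longest_nondecreasing_subsequence([4, 10, 4, 8, 9]))
```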

I think the following will work in O(nlogn):
Scan the array from right to left, and for each element solve a subproblem of finding a longest subsequence starting from the given element of the array. E.g. if your array has indices from 0 to 4, then you start with the subarray [4,4] and check what's the longest sequence starting from 4, then you check subarray [3,4] and what's the longest subsequence starting from 3, next [2,4], and so on, until [0,4]. Finally, you choose the longest subsequence established in either of the steps.
For the last element (so subarray [4,4]) the longest sequence is always of length 1.
When in the next iteration you consider another element to the left (e.g., in the second step you consider the subarray [3,4], so the new element is the one with index 3 in the original array), you check whether that element is not greater than some of the elements to its right. If so, you take the best result among those elements and add one.
For instance:
[4,4] -> longest sequence of length 1 (9)
[3,4] -> longest sequence of length 2 (8,9) 1+1 (you take the longest sequence from above which starts with 9 and add one to its length)
[2,4] -> longest sequence of length 3 (4,8,9) 2+1 (you take the longest sequence from above, i.e. (8,9), and add one to its length)
[1,4] -> longest sequence of length 1 (10) nothing to add to (10 is greater than all the elements to its right)
[0,4] -> longest sequence of length 4 (4,4,8,9) 3+1 (you take the longest sequence above, i.e. (4,8,9), and add one to its length)
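The main issue is how to browse all the candidates to the right in logarithmic time. For that you keep a sorted map (a balanced binary tree). The keys are the already visited elements of the array. The values are the longest sequence lengths obtainable from that element. There is no need to store duplicates: among duplicate keys, store the entry with the largest value. The lookup needed for each new element is the largest stored length among keys greater than or equal to it, so the tree has to answer that range-maximum query in logarithmic time.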
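Here is one possible sketch of the right-to-left scan. Note the substitution: instead of a balanced binary tree it uses a Fenwick tree over rank-compressed values to answer "best length among keys >= x", which is an assumption of this example and not necessarily what the answer had in mind. The function name is made up:

```python
def longest_nondecreasing_from_right(values):
    """Length of the longest non-decreasing subsequence, scanning right to left."""
    # Rank 1 is the largest value, so "all keys >= x" is the prefix 1..ranks[x].
    ranks = {v: r for r, v in enumerate(sorted(set(values), reverse=True), start=1)}
    tree = [0] * (len(ranks) + 1)  # Fenwick tree storing prefix maxima

    def query(r):
        best = 0
        while r > 0:
            best = max(best, tree[r])
            r -= r & -r
        return best

    def update(r, length):
        while r < len(tree):
            tree[r] = max(tree[r], length)
            r += r & -r

    best_overall = 0
    for x in reversed(values):
        # Longest sequence starting at x = 1 + best among elements to the right that are >= x.
        length = query(ranks[x]) + 1
        update(ranks[x], length)
        best_overall = max(best_overall, length)
    return best_overall

print(longest_nondecreasing_from_right([4, 10, 4, 8, 9]))
```

Each element costs one query and one update, both O(log n), so the whole scan is O(n log n) after sorting the distinct values.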

Related

Given a sorted array of integers find subarrays such that the largest elements of the subarrays are within some distance of the smallest

For example, given an array
a = [1, 2, 3, 7, 8, 9]
and an integer
i = 2. Find maximal subarrays where the distance between the largest and the smallest elements is at most i. The output for the example above would be:
[1,2,3] [7,8,9]
The subarrays are maximal in the following sense: given two subarrays A and B, there exists no element b in B such that A + b satisfies the given condition. Is there an efficient (polynomial-time) algorithm for this problem?
This problem can be solved in linear time using the two-pointer method and two deques storing indices: the first deque tracks the minimum and the other tracks the maximum of the sliding window.
Deque for minimum (the deque for maximum is analogous):
    current_minimum = a[minq.front]
    Adding the i-th element of the array:   // at the right index
        while (!minq.empty and a[minq.back] > a[i]):
            // the last element has no chance to become a minimum because the newer one is better
            minq.pop_back
        minq.push_back(i)
    Extracting the j-th element:   // at the left index
        if (!minq.empty and minq.front == j)
            minq.pop_front
So the min-deque always holds indices whose values form a non-decreasing sequence.
Now set the left and right indices to 0, insert index 0 into the deques, and start moving right. At every step, add the new index to both deques and check that the left..right range is still good. When the range becomes too wide (the max-min distance is exceeded), stop moving the right index, check the length of the last good interval, and compare it with the best length.
Now move the left index, removing elements from the deques. When max-min becomes good again, stop moving left and continue with right. Repeat until the end of the array.
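The procedure above could be sketched as follows. This version returns only the longest good window, as the description does, and assumes limit >= 0; the function name and return convention are made up for illustration:

```python
from collections import deque

def longest_window_within(a, limit):
    """Return (start, end), inclusive, of the longest window with max - min <= limit."""
    minq, maxq = deque(), deque()  # indices; fronts hold the window minimum / maximum
    best = (0, -1)                 # empty window until something is found
    left = 0
    for right, x in enumerate(a):
        while minq and a[minq[-1]] > x:
            minq.pop()             # can never be the minimum again
        minq.append(right)
        while maxq and a[maxq[-1]] < x:
            maxq.pop()             # can never be the maximum again
        maxq.append(right)
        # Shrink from the left until the window is good again.
        while a[maxq[0]] - a[minq[0]] > limit:
            if minq[0] == left:
                minq.popleft()
            if maxq[0] == left:
                maxq.popleft()
            left += 1
        if right - left > best[1] - best[0]:
            best = (left, right)
    return best

print(longest_window_within([1, 2, 3, 7, 8, 9], 2))
```

Every index enters and leaves each deque at most once, which is where the linear running time comes from.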

Minimum element in a pair from 2 lists/arrays

I have 2 sorted integer lists or arrays a and b, both having same number of elements. I want to pair an element in a with an element in b such that when I take smaller element in all pairs, their sum is minimum.
For example,
a=[1,7,14,18]
b=[8,9,10,12]
I would be pairing [(1,12),(7,10),(14,9),(18,8)] and then taking smaller element in each pair, namely, [1,7,9,8], I will get minimum sum. This is just one possibility I took. I want to know if this method of pairing elements of first list from the first element and moving forward with elements of the second list starting from end and going backwards will give me the minimum sum.
Yes, that method of pairing the largest with the smallest will work:
1. If the largest element in the second array is smaller than the smallest in the first array, any pairing method will work, so pair the remaining elements using your method.
2. If not, pairing the first pair with your method will ensure that the smallest from the first (which should be counted) will be counted and the largest from the second (which should not be counted) will not be counted.
3. Repeat steps 1 and 2 with the remaining elements from both arrays until you run out of elements.
As you can see, the smallest remaining elements from the arrays will always be counted at each step along the way, and so the sum of the smallest from the resulting pairs will be minimized as desired.
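A small sketch to check this on the example, assuming both lists are sorted in ascending order and have the same length. The brute-force helper is only there for comparison on tiny inputs; both function names are made up:

```python
from itertools import permutations

def min_sum_of_pair_minimums(a, b):
    """Pair a in ascending order with b in descending order and sum the smaller elements."""
    return sum(min(x, y) for x, y in zip(a, reversed(b)))

def brute_force_min_sum(a, b):
    """Try every pairing; factorial time, so only usable for very small lists."""
    return min(
        sum(min(x, y) for x, y in zip(a, perm))
        for perm in permutations(b)
    )

a = [1, 7, 14, 18]
b = [8, 9, 10, 12]
print(min_sum_of_pair_minimums(a, b), brute_force_min_sum(a, b))
```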

Find way to separate array so each subarrays sum is less or equal to a number

I have a mathematical/algorithmic problem here.
Given an array of numbers, find a way to separate it to 5 subarrays, so that sum of each subarrays is less than or equal to a given number. All numbers from the initial array, must go to one of the subarrays, and be part of one sum.
So the input to the algorithm would be:
d - representing the number that each subarrays sum has to be less or equal
A - representing the array of numbers that will be separated to different subarrays, and will be part of one sum
Algorithm complexity must be polynomial.
Thank you.
If by "subarray" you mean "subset" as opposed to "contiguous slice", it is impossible to find a polynomial time algorithm for this problem (unless P = NP). The Partition Problem is to partition a list of numbers into two sets such that the sums of both sets are equal. It is known to be NP-complete. The partition problem can be reduced to your problem as follows:
Suppose that x_1, ..., x_n are positive numbers that you want to partition into 2 sets such that their sums are equal. Let d be this common sum (which would be the sum of the x_i divided by 2). Extend the x_i to an array, A, of size n+3 by adding three copies of d. The elements of A sum to 5d, so the only way to partition A into 5 subarrays so that the sum of each is less than or equal to d is if the sum of each actually equals d. This would in turn require 3 of the subarrays to have length 1, each consisting of the number d. The remaining 2 subarrays would be exactly a partition of the original n numbers.
On the other hand, if there are additional constraints on what the numbers are and/or what the subarrays need to be, there might be a polynomial solution. But, if so, you should clearly spell out what these constraints are.
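To make the reduction concrete, here is a sketch that builds the 5-way instance from a partition instance and checks it with an exponential-time backtracking search. It assumes positive integers; both function names are hypothetical, and the checker is only meant for tiny inputs:

```python
def partition_to_five_way_instance(xs):
    """Build (d, A) from a partition instance xs, or None if the total is odd."""
    total = sum(xs)
    if total % 2:
        return None  # no equal-sum partition of integers can exist
    d = total // 2
    return d, list(xs) + [d, d, d]

def can_split(A, d, groups=5):
    """Brute force: can A be split into `groups` subsets, each with sum <= d?"""
    sums = [0] * groups
    items = sorted(A, reverse=True)

    def place(k):
        if k == len(items):
            return True
        tried = set()  # skip groups whose current sum was already tried at this level
        for g in range(groups):
            if sums[g] + items[k] <= d and sums[g] not in tried:
                tried.add(sums[g])
                sums[g] += items[k]
                if place(k + 1):
                    return True
                sums[g] -= items[k]
        return False

    return place(0)

d, A = partition_to_five_way_instance([3, 1, 1, 2, 2, 1])
print(d, A, can_split(A, d))
```

The 5-way instance is solvable exactly when the original numbers can be partitioned into two equal-sum sets.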
Set up of the problem:
d : the upper bound for the subarray
A : the initial array
Assuming A is not sorted.
(Heuristic)
Algorithm:
1. Sort A in ascending order using a standard sorting algorithm -> O(n log n)
2. Check whether the largest element of A is greater than d -> constant
   if yes, no solution
   if no, continue
3. Sum all the elements of A and call the result S. Check whether S/5 > d -> O(n)
   if yes, no solution
   if no, continue
4. Using a greedy approach, create a new subarray Asi and keep adding the next biggest element aj of the sorted A to Asi as long as the sum of Asi does not exceed d. Remove aj from the sorted A -> O(n)
   Repeat step 4 until either of these conditions is satisfied:
   I. When creating subarray Asi, there are only 5-i elements left.
      In this case, put each remaining element into its own subarray; done.
   II. i = 5: 5 subarrays have been created.
The algorithm described above is bounded by O(n log n) and therefore runs in polynomial time. Being a heuristic, it is not guaranteed to find a valid split whenever one exists.
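One possible reading of this heuristic, sketched below, fills one group at a time with the largest remaining elements and opens a new group when the next element no longer fits. It is an interpretation under stated assumptions: the function name is made up, it allows fewer than 5 groups instead of applying condition I, and it returns None when the greedy pass fails.

```python
def greedy_split(A, d, groups=5):
    """Greedy pass over A in descending order; returns a list of groups or None."""
    items = sorted(A, reverse=True)
    if items and items[0] > d:
        return None                 # step 2: a single element already exceeds d
    if sum(items) > groups * d:
        return None                 # step 3: same test as S / 5 > d
    result = [[]]
    current_sum = 0
    for x in items:
        if current_sum + x > d:     # x does not fit: open the next group
            if len(result) == groups:
                return None         # out of groups; the heuristic gives up
            result.append([])
            current_sum = 0
        result[-1].append(x)
        current_sum += x
    return result

print(greedy_split([1, 2, 3, 4, 5], 5))
print(greedy_split([6] * 5 + [4] * 5, 10))
```

The second call shows the limitation: the greedy pass returns None even though pairing each 6 with a 4 gives five groups of sum 10.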

Count of subarray

The problem is a variant of subarray counting. Given an array of numbers, let's say, 1,2,2,3,2,1,2,2,2,2 I look for subarrays and count the frequency of each. I start with looking from some K length subarrays (example K = 3).
Count of subarray 1,2,2 is C1:2.
Count of subarray 2,2,3 is 1.
Count of subarray 2,3,2 is 1.
and so on.
Now, I look for subarrays of length 2.
Count of subarray 1,2 is C2: 2. But (1,2) is a subset of the subarray 1,2,2. So, I calculate its count by subtracting C1 from C2 which gives count of 1,2 as 0. Similarly, count of 2,2 is 1.
My problem is in handling cases where more than one parent subset exists. I don't consider the sub-arrays in my result set whose frequency comes out to be 1. Example:
1,2,3,1,2,3,1,2,2,3
Here, Count of 1,2,3 is 2.
Count of 2,3,1 is 2.
Now, when I look for count of 2,3, it should be 1 as all the greater length parents have covered the occurrences. How shall I handle these cases?
The approach I thought was to mark all the pattern occurrences of the parent. In the above case, mark all the occurrences of 1,2,3 and 2,3,1. Array looks like this:
1,2,3,1,2,3,1,2,2,3
X,X,X,X,X,X,X,2,2,3
where X denotes the marked position. Now, frequency of 2,3 we see is 1 as per the positions which are unmarked. So, basically, I mark all the pattern occurrences I find at the current step. For the next step, I start looking for patterns from the unmarked locations only to get the correct count.
I am dealing with large data, for which this marking approach seems inefficient. I'm also not sure whether it is correct. Are there any other approaches or ideas that could help?
Build a suffix array for the given array.
To count all repeating subarrays of a given length, walk through this suffix array and compare neighboring suffixes on their prefixes of that length.
For your first example
source array
1,2,2,3,2,1,2,2,2,2
suffix array is
5,0,9,4,8,7,6,1,2,3:
1,2,2,2,2 (5)
1,2,2,3,2,1,2,2,2,2 (0)
2 (9)
2,1,2,2,2,2 (4)
2,2 (8)
2,2,2 (7)
2,2,2,2 (6)
2,2,3,2,1,2,2,2,2 (1)
2,3,2,1,2,2,2,2 (2)
3,2,1,2,2,2,2 (3)
With length 2 we can count two subarrays 1,2 and four subarrays 2,2.
If you want to count any given subarray, for example all suffixes beginning with (1,2), just use binary search to get the first and the last indexes (like the std::lower_bound and std::upper_bound operations in the C++ STL).
For the same example, the indexes of the first and last occurrences of (1,2) in the suffix array are 0 and 1, so the count is last-first+1=2.
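A short sketch of this counting scheme. The suffix array is built by naive sorting, which is far slower than a real suffix array construction but enough to reproduce the example; it assumes the input is a Python list, and the function names are made up:

```python
from bisect import bisect_left, bisect_right

def build_suffix_array(a):
    """Naive construction: sort suffix start positions by the suffix contents."""
    return sorted(range(len(a)), key=lambda i: a[i:])

def count_occurrences(a, suffix_array, pattern):
    """Count occurrences of pattern via binary search over truncated suffixes."""
    k = len(pattern)
    # Truncating sorted suffixes to length k keeps them in sorted order.
    prefixes = [tuple(a[i:i + k]) for i in suffix_array]
    target = tuple(pattern)
    return bisect_right(prefixes, target) - bisect_left(prefixes, target)

a = [1, 2, 2, 3, 2, 1, 2, 2, 2, 2]
sa = build_suffix_array(a)
print(sa)
print(count_occurrences(a, sa, [1, 2]), count_occurrences(a, sa, [2, 2]))
```

Materializing the prefixes list costs O(n k) per query here; a production version would compare against the suffixes in place during the binary search.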

For given array of pairs of two numbers, count pairs in which first element is bigger and second one is smaller

I'm trying to implement this in my program: we are given an array of K pairs, where each pair has the form (i,j) with i<=N, j<=M, N,M<=1000, and K<=N*M.
Now for each pair we want to count pairs in which the first element is strictly bigger and the second one is strictly less, for example if our pair is: (2,3), we want to count the pair (4,1), but not the pair (1,2) because 1 is less than 2.
Is this possible to do in O(N*M) time complexity?
