Perfect hashing for permutations - permutation

Consider the following list of permutations of {0,1,2,3,4,5,6,*,*,*} as generated with ordinary backtracking:
Index Permutation
1. 0123456***
2. 012345*6**
3. 012345**6*
4. 012345***6
5. 0123465***
...
60480. ***6543210
Is it there an O(1) function which, for any permutation gives back the permutation's index?

Related

How to find Longest non-decreasing Subsequence containing duplicates in O(n) or O(nlogn)?

We know about an algorithm that will find the Longest Increasing subsequence in O(nlogn). I was wondering whether we can find the Longest non-decreasing subsequence with similar time complexity?
For example, consider an array : (4,10,4,8,9).
The longest increasing subsequence is (4,8,9).
And a longest non-decreasing subsequence would be (4,4,8,9).
First, here’s a “black box” approach that will let you find the longest nondecreasing subsequence using an off-the-shelf solver for longest increasing subsequences. Let’s take your sample array:
4, 10, 4, 8, 9
Now, imagine we transformed this array as follows by adding a tiny fraction to each number:
4.0, 10.1, 4.2, 8.3, 9.4
Changing the numbers this way will not change the results of any comparisons between two different integers, since the integer components have a larger magnitude difference than the values after the decimal point. However, if you compare the two 4s now, the latter 4 compares bigger than the previous one. If you now find the longest nondecreasing subsequence, you get back [4.0, 4.2, 8.3, 9.4], which you can then map back to [4, 4, 8, 9].
More generally, if you’re working with an array of n integer values, you can add i / n to each of the numbers, where i is its index, and you’ll be left with a sequence of distinct numbers. From there running a regular LIS algorithm will do the trick.
If you can’t work with fractions this way, you could alternatively multiply each number by n and then add in i, which also works.
On the other hand, suppose you have the code for a solver for LIS and want to convert it to one that solves the longest nondecreasing subsequence problem. The reasoning above shows that if you treat later copies of numbers as being “larger” than earlier copies, then you can just use a regular LIS. Given that, just read over the code for LIS and find spots where comparisons are made. When a comparison is made between two equal values, break the tie by considering the later appearance to be bigger than the earlier one.
I think the following will work in O(nlogn):
Scan the array from right to left, and for each element solve a subproblem of finding a longest subsequence starting from the given element of the array. E.g. if your array has indices from 0 to 4, then you start with the subarray [4,4] and check what's the longest sequence starting from 4, then you check subarray [3,4] and what's the longest subsequence starting from 3, next [2,4], and so on, until [0,4]. Finally, you choose the longest subsequence established in either of the steps.
For the last element (so subarray [4,4]) the longest sequence is always of length 1.
When in the next iteration you consider another element to the left (e.g., in the second step you consider the subarray [3,4], so the new element is element with the index 3 in the original array) you check if that element is not greater than some of the elements to its right. If so, you can take the result for some element from the right and add one.
For instance:
[4,4] -> longest sequence of length 1 (9)
[3,4] -> longest sequence of length 2 (8,9) 1+1 (you take the longest sequence from above which starts with 9 and add one to its length)
[2,4] -> longest sequence of length 3 (4,8,9) 2+1 (you take the longest sequence from above, i.e. (8,9), and add one to its length)
[1,4] -> longest sequence of length 1 (10) nothing to add to (10 is greater than all the elements to its right)
[0,4] -> longest sequence of length 4 (4,4,8,9) 3+1 (you take the longest sequence above, i.e. (4,8,9), and add one to its length)
The main issue is how to browse all the candidates to the right in logarithmic time. For that you keep a sorted map (a balanced binary tree). The keys are the already visited elements of the array. The values are the longest sequence lengths obtainable from that element. No need to store duplicates - among duplicate keys store the entry with largest value.

Interview question: reverse pairs in an array

I got this for my interview: a pair of arrays [a,b],[b,a] are counted as a reverse pair. for example, the input array is [[a,b],[b,a],[a,c],[c,a],[a,b]], and the output is 2 because there are two reversed pairs.
Now I can get the time complexity to be O(n) using a hashmap, is there any way to get better than O(n)?
The algorithm is simple:
you iterate over an array, and count in hashmap the occurence of the pairs, getting the result like [ [a,b] : 2, [a,c] : 1, [b,a] : 1, ... ]
you iterate over the hashmap, computing the minimum of the occurence of the inversed pairs, for example [a,b] : 2, [b,a] : 1, so you have 1.
you add the results from step 2, which gives you the final result.
And you cannot do it faster than O(N), because you have to check every element at least once.

Find way to separate array so each subarrays sum is less or equal to a number

I have a mathematical/algorithmic problem here.
Given an array of numbers, find a way to separate it to 5 subarrays, so that sum of each subarrays is less than or equal to a given number. All numbers from the initial array, must go to one of the subarrays, and be part of one sum.
So the input to the algorithm would be:
d - representing the number that each subarrays sum has to be less or equal
A - representing the array of numbers that will be separated to different subarrays, and will be part of one sum
Algorithm complexity must be polynomial.
Thank you.
If by "subarray" you mean "subset" as opposed to "contiguous slice", it is impossible to find a polynomial time algorithm for this problem (unless P = NP). The Partition Problem is to partition a list of numbers into to sets such that the sum of both sets are equal. It is known to be NP-complete. The partition problem can be reduced to your problem as follows:
Suppose that x1, ..., x_n are positive numbers that you want to partition into 2 sets such that their sums are equal. Let d be this common sum (which would be the sum of the xi divided by 2). extend x_i to an array, A, of size n+3 by adding three copies of d. Clearly the only way to partition A into 5 subarrays so that the sum of each is less than or equal to d is if the sum of each actually equals d. This would in turn require 3 of the subarrays to have length 1, each consisting of the number d. The remaining 2 subarrays would be exactly a partition of the original n numbers.
On the other hand, if there are additional constraints on what the numbers are and/or the subarrays need to be, there might be a polynomial solution. But, if so, you should clearly spell out what there constraints are.
Set up of the problem:
d : the upper bound for the subarray
A : the initial array
Assuming A is not sorted.
(Heuristic)
Algorithm:
1.Sort A in ascending order using standard sorting algorithm->O(nlogn)
2.Check if the largest element of A is greater than d ->(constant)
if yes, no solution
if no, continue
3.Sum up all the element in A, denote S. Check if S/5 > d ->O(n)
if yes, no solution
if no, continue
4.Using greedy approach, create a new subarray Asi, add next biggest element aj in the sorted A to Asi so that the sum of Asi does not exceed d. Remove aj from sorted A ->O(n)
repeat step4 until either of the condition satisfied:
I.At creating subarray Asi, there are only 5-i element left
In this case, split the remaining element to individual subarray, done
II. i = 5. There are 5 subarray created.
The algorithm described above is bounded by O(nlogn) therefore in polynomial time.

Count of subarray

The problem is a variant of subarray counting. Given an array of numbers, let's say, 1,2,2,3,2,1,2,2,2,2 I look for subarrays and count the frequency of each. I start with looking from some K length subarrays (example K = 3).
Count of subarray 1,2,2 is C1:2.
Count of subarray 2,2,3 is 1.
Count of subarray 2,3,2 is 1.
and so on.
Now, I look for subarrays of length 2.
Count of subarray 1,2 is C2: 2. But (1,2) is a subset of the subarray 1,2,2. So, I calculate its count by subtracting C1 from C2 which gives count of 1,2 as 0. Similarly, count of 2,2 is 1.
My problem is in handling cases where more than one parent subset exists. I don't consider the sub-arrays in my result set whose frequency comes out to be 1. Example:
1,2,3,1,2,3,1,2,2,3
Here, Count of 1,2,3 is 2.
Count of 2,3,1 is 2.
Now, when I look for count of 2,3, it should be 1 as all the greater length parents have covered the occurrences. How shall I handle these cases?
The approach I thought was to mark all the pattern occurrences of the parent. In the above case, mark all the occurrences of 1,2,3 and 2,3,1. Array looks like this:
1,2,3,1,2,3,1,2,2,3
X,X,X,X,X,X,X,2,2,3
where X denotes the marked position. Now, frequency of 2,3 we see is 1 as per the positions which are unmarked. So, basically, I mark all the pattern occurrences I find at the current step. For the next step, I start looking for patterns from the unmarked locations only to get the correct count.
I am dealing with the large data on which this seems a bit not-so-good thing to do. Also, I'm not sure if it's correct or not. Any other approaches or ideas can be of big help?
Build suffix array for given array.
To count all repeating subarrays with given length - walk through this suffix array, comparing neighbor suffixes by needed prefix length.
For your first example
source array
1,2,2,3,2,1,2,2,2,2
suffix array is
5,0,9,4,8,7,6,1,2,3:
1,2,2,2,2 (5)
1,2,2,3,2,1,2,2,2,2 (0)
2 (9)
2,1,2,2,2,2 (4)
2,2 (8)
2,2,2 (7)
2,2,2,2 (6)
2,2,3,2,1,2,2,2,2 (1)
2,3,2,1,2,2,2,2 (2)
3,2,1,2,2,2,2 (3)
With length 2 we can count two subarrays 1,2 and four subarrays 2,2
If you want to count any given subarray - for example, all suffixes beginning with (1,2), just use binary search to get the first and the last indexes (like std:upperbound and std:lowerbound operations in C++ STL).
For the same example indexes of the first and last occurrences of (1,2) in suffix array are 0 and 1, so count is last-first+1=2

Find repeated number in an array of unique numbers

There is an array where except one number (say magic-number) all are unique. The magic-number
repeats itself more than half times the size of the array. e.g. 2, 10, 10, 10, 3. Find the magic-number without
using any extra-space and without sorting it. Now is there any method to do it in O(n).
Check each element against its neighbors, if any are equal then you have found the number. O(N)
If the first test did not find the number, then you in the following situation:
10,2,10,3,10.
In this case, the first number in the array is the magic number. O(1)

Resources