Closest pairs of elements in two sorted arrays - arrays

We are looking for an efficient algorithm to solve the following problem:
Given two arrays sorted in increasing order, find the closest corresponding elements in each array whose difference is below a user-given threshold. Only the closest of the possible candidates (in the range array1[i] +/- threshold) should be returned.
The second closest could be matched to another element, but matches to more than one element are not allowed. If two elements in array1 have the same distance to array2[j], the first (leftmost) match should be reported.
The arrays can contain duplicated values; there, too, the first (leftmost) match should be reported (and all the others ignored/not matched).
Examples:
x: 1, 3, 5, 6, 8
y: 3, 4, 5, 7
threshold: 1
output: NA, 1, 3, 4, NA
(index of y that matches best to x)
x: 1, 1.5, 2, 2.1, 5, 6.1, 7.2
y: 4.6, 4.7, 4.8, 4.9, 5, 6, 7, 8
threshold: 3
output: NA, NA, NA, 1, 5, 6, 7
(index of y that matches best to x)
x: 1, 1, 1, 2, 2, 2
y: 1, 2
threshold: 0
output: 1, NA, NA, 2, NA, NA
(index of y that matches best to x; for duplicates choose the first one)
x: 1, 2
y: 1, 1, 1, 2, 2, 2
threshold: 0
output: 1, 4
(index of y that matches best to x; for duplicates choose the first one)
We use this to find the closest matching values between two m/z-values
(mass-to-charge ratios) while comparing mass spectra.
Currently we iterate through both arrays, look ahead at the differences for the next two elements, and correct the previous element if a closer one was found.
But this fails for more than two duplicate elements in a row (second example):
Our current implementation (C code as part of an R package):
https://github.com/rformassspectrometry/MsCoreUtils/blob/master/src/closest.c#L73-L129
A commented version below:
SEXP C_closest_dup_closest(SEXP x, SEXP table, SEXP tolerance, SEXP nomatch) {
    /* x is the first array of doubles */
    double *px = REAL(x);
    const unsigned int nx = LENGTH(x);

    /* table is the second array of doubles where x should be matched against */
    double *ptable = REAL(table);
    const unsigned int ntable = LENGTH(table);

    /* user given tolerance threshold */
    double *ptolerance = REAL(tolerance);

    /* integer array to store the results */
    SEXP out = PROTECT(allocVector(INTSXP, nx));
    int *pout = INTEGER(out);

    /* integer that should be returned if no valid match or a closer one was found */
    const unsigned int inomatch = asInteger(nomatch);

    /* indices */
    unsigned int ix = 0, ixlastused = 1;
    unsigned int itbl = 0, itbllastused = 1;

    /* differences: current, difference to next element of x and table, respectively */
    double diff = R_PosInf, diffnxtx = R_PosInf, diffnxttbl = R_PosInf;

    while (ix < nx) {
        if (itbl < ntable) {
            /* difference for current pair */
            diff = fabs(px[ix] - ptable[itbl]);
            /* difference for next pairs */
            diffnxtx =
                ix + 1 < nx ? fabs(px[ix + 1] - ptable[itbl]) : R_PosInf;
            diffnxttbl =
                itbl + 1 < ntable ? fabs(px[ix] - ptable[itbl + 1]) : R_PosInf;

            if (diff <= ptolerance[ix]) {
                /* valid match, add + 1 to convert between R/C index */
                pout[ix] = itbl + 1;
                if (itbl == itbllastused &&
                    (diffnxtx < diffnxttbl || diff < diffnxttbl))
                    pout[ixlastused] = inomatch;
                ixlastused = ix;
                itbllastused = itbl;
            } else
                pout[ix] = inomatch;

            if (diffnxtx < diff || diffnxttbl < diff) {
                /* increment the index with the smaller distance */
                if (diffnxtx < diffnxttbl)
                    ++ix;
                else
                    ++itbl;
            } else {
                /* neither next x nor next table item offers a better match */
                ++ix;
                ++itbl;
            }
        } else
            pout[ix++] = inomatch;
    }

    /* R-provided macro to release the PROTECT above */
    UNPROTECT(1);
    return out;
}
Could anybody give us a hint for a better algorithm?

Related

Smallest subarray with sum equal to k

I want to find the length of the smallest subarray whose sum is equal to k.
Input: arr[] = {2, 4, 6, 10, 2, 1}, K = 12
Output: 2
Explanation:
All possible subarrays with sum 12 are {2, 4, 6} and {10, 2}.
Input: arr[] = { 1, 2, 4, 3, 2, 4, 1 }, K = 7
Output: 2
Here's a solution using JavaScript.
It could be made more efficient, for sure, but I've coded it to work.
function lengthOfShortestSubArrayOfSumK(array, k) {
    var combos = [];
    for (var i = 0; i < Math.pow(2, array.length); i++) {
        var bin = ("0".repeat(array.length) + i.toString(2)).slice(-array.length).split("");
        var ones = bin.reduce((count, digit) => { count += digit == "1"; return count; }, 0);
        var sum = bin.reduce((sum, digit, index) => { sum += digit == "1" ? array[index] : 0; return sum; }, 0);
        combos.push([bin, ones, sum]);
    }
    return combos.filter(combo => combo[2] == k).sort((a, b) => a[1] - b[1])[0][1];
}

var arraysAndKs = [
    { array: [2, 4, 6, 10, 2, 1], k: 12 },
    { array: [1, 2, 4, 3, 2, 4, 1], k: 7 }
];
for (arrayAndK of arraysAndKs)
    console.log("Length of shortest sub array of [" + arrayAndK.array.join(", ") + "] with sum " + arrayAndK.k + " is : " + lengthOfShortestSubArrayOfSumK(arrayAndK.array, arrayAndK.k));
Each binary number between 0 and 2^array.length gives us a representation of which array items are included in the sum.
We count how many "one"s are in that binary number.
We sum the array items selected by those "one"s.
We save into the combos array an array of the binary number, the "one"s count, and the sum.
We filter combos for sum k, sort by count of "one"s, and return the first entry's "one"s count.
I'm sure this can be translated to any programming language.
You can use an algorithm that finds a subset with sum K, and keep another variable that stores the number of members that make up such a subarray.
The algorithm for finding a subset with sum K is:
Initialize an array of size K; each place (idx) indicates whether there is a subset that sums to idx (I used a dictionary).
Go over every number (i) in the array; for any sum (j) we could reach in the previous iteration, we can now also reach j + i.
If the K place is marked TRUE, then there is a subset that sums to K.
Here's the solution in Python
def foo(arr, k):
    dynamic = {0: 0}
    for i in arr:
        temp = {}
        for j, l in dynamic.items():
            if i + j <= k:  # if not, it is not interesting to us
                # choose the smallest subarray
                temp[i + j] = min(l + 1, dynamic.get(i + j, len(arr)))
        dynamic.update(temp)
    return dynamic.get(k, -1)
the complexity is O(N*K).
I assumed that "subarray" here refers to any possible combination (subset) of the original array.
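For example (running the foo above on the question's inputs):

print(foo([2, 4, 6, 10, 2, 1], 12))   # -> 2 (e.g. the subset {10, 2})
print(foo([1, 2, 4, 3, 2, 4, 1], 7))  # -> 2 (e.g. the subset {4, 3})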
Here is Python code that solves the problem in O(N) complexity, under the condition that the subarray must be contiguous:
def shortest_contiguous_subarray(arr, k):
    if k in arr:
        return 1
    n = len(arr)
    sub_length = float('inf')
    sub = arr[(i := 0)]
    j = 1
    while j < n:
        while sub < k and j < n:
            sub += arr[j]
            j += 1
        while sub > k:
            sub -= arr[i]
            i += 1
        if sub == k:
            # print(arr[i:j], j - i)
            sub_length = min(sub_length, j - i)
            sub -= arr[i]
            i += 1
    return sub_length if sub_length <= n else -1
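For example (running the function above on the question's inputs):

print(shortest_contiguous_subarray([2, 4, 6, 10, 2, 1], 12))   # -> 2 (the window [10, 2])
print(shortest_contiguous_subarray([1, 2, 4, 3, 2, 4, 1], 7))  # -> 2 (the window [4, 3])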
This answer works for any array of positive numbers, and can be modified to work with arrays that have zero or negative elements if an O(n) pre-processing pass is performed (1. find the minimum element m, m <= 0, 2. make the whole array positive by adding -m+1 to all elements, 3. solve for sum + n*(1-m))
function search(input, goal) {
    let queue = [ { avail: input.slice(), used: [], sum: 0 } ]; // initial state
    for (let qi = 0; qi < queue.length; qi++) {
        let s = queue[qi]; // like a pop, but without using O(n) shift
        for (let i = 0; i < s.avail.length; i++) {
            let e = s.avail[i];
            if (s.sum + e > goal) continue;               // dead end
            if (s.sum + e == goal) return [...s.used, e]; // eureka!!
            queue.push({                                  // keep digging
                avail: [...s.avail.slice(0, i), ...s.avail.slice(i + 1)],
                used: [...s.used, e],
                sum: s.sum + e
            });
        }
    }
    return undefined; // no subset of input adds up to goal
}

console.log(search([2, 4, 6, 10, 2, 1], 12))
This is a classic breadth-first-search that does a little bit of pruning when it detects that we are already over the target sum. It can be further optimized to avoid exploring the same branch several times (for example, [4,2] is equivalent to [2,4]) - but this would require extra memory to keep a set of "visited" states. Additionally, you could add heuristics to explore more promising branches first.
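As an illustration of that de-duplication idea (a sketch, not the code above): one simple way to avoid revisiting equivalent branches, without storing a visited set, is to only extend a state with elements to the right of the last index used, so every subset is generated exactly once. A minimal Python version of the same breadth-first search with this restriction:

from collections import deque

def shortest_subset_sum(numbers, goal):
    # BFS over states (last index used, running sum, chosen values).
    # Each level of the search adds one element, so the first state that
    # reaches the goal uses the fewest elements.
    queue = deque([(-1, 0, [])])
    while queue:
        last, total, used = queue.popleft()
        for i in range(last + 1, len(numbers)):  # only extend to the right: [2,4] and [4,2] are explored once
            s = total + numbers[i]
            if s > goal:          # dead end (assumes positive numbers, as above)
                continue
            if s == goal:
                return used + [numbers[i]]
            queue.append((i, s, used + [numbers[i]]))
    return None  # no subset adds up to goal

print(shortest_subset_sum([2, 4, 6, 10, 2, 1], 12))  # -> [2, 10]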
I have done this by using unordered_map in C++. Hope this helps.
/* smallest subarray of sum k */
#include <bits/stdc++.h>
using namespace std;

int main()
{
    vector<int> v = {2, 4, 6, 10, 2, 12};
    int k = 12;
    unordered_map<int, int> m;
    int start = 0, end = -1;
    int len = 0, mini = INT_MAX;
    int currsum = 0;
    for (int i = 0; i < v.size(); i++) {
        currsum += v[i];
        if (currsum == k) {
            start = 0, end = i;
            len = end - start + 1;
            mini = min(mini, len);
        }
        if (v[i] == k) {
            mini = min(mini, 1);
        }
        if (m.find(currsum - k) != m.end()) {
            end = i;
            start = m[(currsum - k)] + 1;
            len = end - start + 1;
            mini = min(mini, len);
        }
        m[currsum] = i;
    }
    cout << mini;
    return 0;
}
class Solution {
    static int findSubArraySum(int arr[], int N, int k) {
        // I use a prefix sum and hashmap approach
        HashMap<Integer, Integer> map = new HashMap<>();
        map.put(0, 1);
        // this is so that a prefix that itself equals k is counted as valid
        int count = 0;
        int sum = 0;
        for (int i = 0; i < N; i++) {
            sum += arr[i];
            // prefix sum
            if (map.containsKey(sum - k)) {
                count += map.get(sum - k);
            }
            map.put(sum, map.getOrDefault(sum, 0) + 1);
        }
        return count;
    }
}
// this approach works even for -ve numbers
// I came to this solution via the prefix sum approach
This version finds the entire optimal sub-array, not only its length. It is based on recursion: it tests each number of the array against the optimal sub-array of the rest.
const bestSum = (targetSum, numbers) => {
    var shortestCombination = null
    for (var i = 0; i < numbers.length; i++) {
        var current = numbers[i];
        if (current == 0) {
            continue
        }
        if (current == targetSum) {
            return [current]
        }
        if (current > targetSum) {
            continue;
        }
        // "remove" current from array
        numbers[i] = 0;
        // now the recursion:
        var rest = bestSum(targetSum - current, numbers)
        if (rest && (!shortestCombination || rest.length + 1 < shortestCombination.length)) {
            shortestCombination = [current].concat(rest);
        }
        // restore current to array
        numbers[i] = current
    }
    return shortestCombination
}

console.log(bestSum(7, [5, 3, 4, 7])) // should return [7], not [3, 4]
This is my code in Python 3. I used the same idea as finding the longest subarray with a sum equal to K, but in the code below, for every prefix sum I store the most recent index.
from collections import defaultdict

def smallestSubArraySumLength(a, n, k):
    d = defaultdict(lambda: -1)
    d[0] = -1
    psum = 0
    maxl = float('inf')
    for i in range(n):
        psum += a[i]
        if psum - k in d:
            maxl = min(maxl, i - d[psum - k])
        d[psum] = i
    return maxl
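For example (assuming the function above):

print(smallestSubArraySumLength([2, 4, 6, 10, 2, 1], 6, 12))   # -> 2 (the window [10, 2])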

Given an array A of size N, find all combinations of four elements in the array whose sum is equal to a given value K

Given an array A of size N, find all combinations of four elements in the array whose sum is equal to a given value K. For example, if the given array is {10, 2, 3, 4, 5, 9, 7, 8} and K = 23, one of the quadruples is "3 5 7 8" (3 + 5 + 7 + 8 = 23).
The output should contain only unique quadruples. For example, if the input array is {1, 1, 1, 1, 1, 1} and K = 4, then the output should be only one quadruple: {1, 1, 1, 1}.
My approach: I tried to solve this problem by storing all the distinct pairs formed from the given array into a hash table (std::unordered_multimap), with their sum as key. Then for each pair sum, I looked for the (K - sum) key in the hash table. The problem with this approach is that I get too many duplicates: (i, j, l, m) and (i, l, j, m) are the same, plus there are duplicates due to repeated items in the array. I am not sure what the optimal way to address that is.
The code for the above-mentioned approach is:
#include <iostream>
#include <unordered_map>
#include <tuple>
#include <vector>

int main() {
    size_t tc = 0;
    std::cin >> tc; // number of test cases
    while (tc--) {
        size_t n = 0, k = 0;
        std::cin >> n >> k;
        std::vector<size_t> vec(n);
        for (size_t i = 0; i < n; ++i)
            std::cin >> vec[i];
        std::unordered_multimap<size_t, std::tuple<size_t, size_t>> m;
        for (size_t i = 0; i < n - 1; ++i)
            for (size_t j = i + 1; j < n; ++j) {
                const auto sum = vec[i] + vec[j];
                m.emplace(sum, std::make_tuple(i, j));
            }
        for (size_t i = 0; i < n - 1; ++i)
            for (size_t j = i + 1; j < n; ++j) {
                const auto sum = vec[i] + vec[j];
                auto r = m.equal_range(k - sum);
                for (auto it = r.first; it != r.second; ++it) {
                    if ((i == std::get<0>(it->second))
                        || (i == std::get<1>(it->second))
                        || (j == std::get<0>(it->second))
                        || (j == std::get<1>(it->second)))
                        continue;
                    std::cout << vec[i] << ' ' << vec[j] << ' '
                              << vec[std::get<0>(it->second)] << ' '
                              << vec[std::get<1>(it->second)] << '$';
                }
                r = m.equal_range(sum);
                for (auto it = r.first; it != r.second; ++it) {
                    if ((i == std::get<0>(it->second))
                        && (j == std::get<1>(it->second))) {
                        m.erase(it);
                        break;
                    }
                }
            }
        std::cout << '\n';
    }
    return 0;
}
The above code will run as-is in the link mentioned below in the Note.
Note: This problem is taken from https://practice.geeksforgeeks.org/problems/find-all-four-sum-numbers/0
To handle duplicate values in the array, consider [2, 2, 2, 3, 3] with goal 10.
The only solution is the 4-tuple <2,2,3,3>. The main point is to avoid choosing two of the three 2s in more than one way.
Let's consider the k-class: the set of tuples in which every tuple contains only k.
e.g: in our array we have the 2-class and 3-class.
The 2-class contains:
<2>
<2,2>
<2,2,2>
while the 3-class contains:
<3>
<3,3>
An idea is to reduce the array of elems (each elem being an integer value) to an array of k-classes, i.e.
[[<2>, <2,2>, <2,2,2>], [<3>, <3,3>]]
We can think of taking the cartesian product between the 2-class set and the 3-class set, and check which result lead to solution.
More concretely, let's take some tuple T, whose last (==rightmost) value is k. (in <2,3,4> rightmost value would be 4).
We can pick any l-class from our array (* l > k) and join the tuples from that l-class to T.
e.g
consider array [2, 9, 9, 3, 3, 4, 6] and tuple <2, 3, 3>
The rightmost value is 3.
The candidate k-class are 4-class, 6-class, 9-class
we can join:
<4>
<6>
<9>
<9,9>
so the next candidates will be:
<2, 3, 3, 4>
<2, 3, 3, 6>
<2, 3, 3, 9>
<2, 3, 3, 9, 9> //that one has too many elem, left for illustration
(* The purpose of l > k is to prevent permutations: if <1,2> is a solution you don't want <2,1>, since addition is commutative.)
Algorithm
For each tuple, try to create new ones by right-joining tuples from a "greater" k-class.
Discard the resulting ones which have too many elements or whose sum is already too big.
At some point we won't have new candidates, so the algorithm will stop.
example of cuts:
given array [2,3,7,8,10,11], and tuple <2,3> and S == 13
<2,3,7> is candidate (2+3+7 = 12 < 13)
<2,3,10> is not a candidate (2+3+10 = 15 > 13)
<2,3,11> even less so
<2,3,8> is not a candidate either, since the next rightjoin (to reach a 4-tuple) will overflow S
given array [2,3,4,4,4] given tuple <2,3> and candidate <4,4,4>
resulting tuple would be <2,3,4,4,4> which has too many elems, discard it!
Obviously the initialization is
some empty tuple
whose sum is 0
and whose rightmost element is less than any k from the array (you can rightjoin anything to it)
I believe it should not be too hard to translate to C++
class TupleClass {
    sum = 0
    rightIdx = -1
    values = [] // an array of integers (hopefully summing to solution)

    // idx is the position of the k-class found in array
    add (val, idx) {
        const t = new TupleClass()
        t.values = this.values.concat(val)
        t.sum = this.sum + val
        t.rightIdx = idx
        return t;
    }

    toString () {
        return `<${this.values.join(',')}>`
    }

    addTuple (tuple, idx) {
        const t = new TupleClass()
        t.values = this.values.concat(tuple.values)
        t.sum = this.sum + tuple.sum
        t.rightIdx = idx
        return t;
    }

    get size () {
        return this.values.length
    }
}

function nodupes (v, S) {
    v = v.reduce((acc, klass) => {
        acc[klass] = (acc[klass] || { duplicity: 0, klass })
        acc[klass].duplicity++
        return acc
    }, {})
    v = Object.values(v).sort((a, b) => a.klass - b.klass).map(({ klass, duplicity }, i) => {
        return Array(duplicity).fill(0).reduce((acc, _) => {
            const t = acc[acc.length - 1].add(klass, i)
            acc.push(t)
            return acc
        }, [new TupleClass()]).slice(1)
    })
    // v is sorted by k-class asc
    // each k-class is an array of tuples with increasing length
    // [[<2>, <2,2>, <2,2,2>], [<3>,<3,3>]]
    let tuples = [new TupleClass()]
    const N = v.length
    let nextTuples = []
    const solutions = []
    while (tuples.length) {
        tuples.forEach(tuple => {
            // foreach kclass after our rightmost value
            for (let j = tuple.rightIdx + 1; j <= N - 1; ++j) {
                // foreach tuple of that kclass
                for (let tclass of v[j]) {
                    const nextTuple = tuple.addTuple(tclass, j)
                    if (nextTuple.sum > S || nextTuple.size > 4) {
                        break
                    }
                    // candidate to solution
                    if (nextTuple.size == 4) {
                        if (nextTuple.sum === S) {
                            solutions.push(nextTuple)
                        }
                        // invalid sum so adding more elems won't help, do not push
                    } else {
                        nextTuples.push(nextTuple)
                    }
                }
            }
        })
        tuples = nextTuples
        nextTuples = []
    }
    return solutions;
}

const v = [1,1,1,1,1,2,2,2,3,3,3,4,0,0]
const S = 7
console.log('v:', v.join(','), 'and S:', S)
console.log(nodupes(v, 7).map(t => t.toString()).join('\n'))

Finding an algorithm for indexing all combinations [duplicate]

Given an array of N elements representing the permutation atoms, is there an algorithm like that:
function getNthPermutation( $atoms, $permutation_index, $size )
where $atoms is the array of elements, $permutation_index is the index of the permutation and $size is the size of the permutation.
For instance:
$atoms = array( 'A', 'B', 'C' );
// getting third permutation of 2 elements
$perm = getNthPermutation( $atoms, 3, 2 );
echo implode( ', ', $perm )."\n";
Would print:
B, A
Without computing every permutation up to $permutation_index?
I heard something about factoradic permutations, but every implementation I've found gives as a result a permutation of the same size as V, which is not my case.
Thanks.
As stated by RickyBobby, when considering the lexicographical order of permutations, you should use the factorial decomposition to your advantage.
From a practical point of view, this is how I see it:
Perform a sort of Euclidean division, except you do it with factorial numbers, starting with (n-1)!, (n-2)!, and so on.
Keep the quotients in an array. The i-th quotient should be a number between 0 and n-i-1 inclusive, where i goes from 0 to n-1.
This array is your permutation. The problem is that each quotient does not care for previous values, so you need to adjust them. More explicitly, you need to increment every value as many times as there are previous values that are lower or equal.
The following C code should give you an idea of how this works (n is the number of entries, and i is the index of the permutation):
/**
 * @param n The number of entries
 * @param i The index of the permutation
 */
void ithPermutation(const int n, int i)
{
    int j, k = 0;
    int *fact = (int *)calloc(n, sizeof(int));
    int *perm = (int *)calloc(n, sizeof(int));

    // compute factorial numbers
    fact[k] = 1;
    while (++k < n)
        fact[k] = fact[k - 1] * k;

    // compute factorial code
    for (k = 0; k < n; ++k)
    {
        perm[k] = i / fact[n - 1 - k];
        i = i % fact[n - 1 - k];
    }

    // readjust values to obtain the permutation
    // start from the end and check if preceding values are lower
    for (k = n - 1; k > 0; --k)
        for (j = k - 1; j >= 0; --j)
            if (perm[j] <= perm[k])
                perm[k]++;

    // print permutation
    for (k = 0; k < n; ++k)
        printf("%d ", perm[k]);
    printf("\n");

    free(fact);
    free(perm);
}
For example, ithPermutation(10, 3628799) prints, as expected, the last permutation of ten elements:
9 8 7 6 5 4 3 2 1 0
Here's a solution that allows you to select the size of the permutation. For example, apart from being able to generate all permutations of 10 elements, it can generate permutations of pairs among 10 elements. Also, it permutes lists of arbitrary objects, not just integers.
function nth_permutation($atoms, $index, $size) {
    $result = [];
    for ($i = 0; $i < $size; $i++) {
        $item = $index % count($atoms);
        $index = floor($index / count($atoms));
        $result[] = $atoms[$item];
        array_splice($atoms, $item, 1);
    }
    return $result;
}
Usage example:
for ($i = 0; $i < 6; $i++) {
    print_r(nth_permutation(['A', 'B', 'C'], $i, 2));
}
// => AB, BA, CA, AC, BC, CB
How does it work?
There's a very interesting idea behind it. Let's take the list A, B, C, D. We can construct a permutation by drawing elements from it like from a deck of cards. Initially we can draw one of the four elements. Then one of the three remaining elements, and so on, until finally we have nothing left.
Here is one possible sequence of choices. Starting from the top we're taking the third path, then the first, then the second, and finally the first. And that's our permutation #13.
Think about how, given this sequence of choices, you would get to the number thirteen algorithmically. Then reverse your algorithm, and that's how you can reconstruct the sequence from an integer.
Let's try to find a general scheme for packing a sequence of choices into an integer without redundancy, and unpacking it back.
One interesting scheme is called decimal number system. "27" can be thought of as choosing path #2 out of 10, and then choosing path #7 out of 10.
But each digit can only encode choices from 10 alternatives. Other systems that have a fixed radix, like binary and hexadecimal, also can only encode sequences of choices from a fixed number of alternatives. We want a system with a variable radix, kind of like time units, "14:05:29" is hour 14 from 24, minute 5 from 60, second 29 from 60.
What if we take generic number-to-string and string-to-number functions, and fool them into using mixed radixes? Instead of taking a single radix, like parseInt('beef', 16) and (48879).toString(16), they will take one radix per each digit.
function pack(digits, radixes) {
    var n = 0;
    for (var i = 0; i < digits.length; i++) {
        n = n * radixes[i] + digits[i];
    }
    return n;
}

function unpack(n, radixes) {
    var digits = [];
    for (var i = radixes.length - 1; i >= 0; i--) {
        digits.unshift(n % radixes[i]);
        n = Math.floor(n / radixes[i]);
    }
    return digits;
}
Does that even work?
// Decimal system
pack([4, 2], [10, 10]); // => 42
// Binary system
pack([1, 0, 1, 0, 1, 0], [2, 2, 2, 2, 2, 2]); // => 42
// Factorial system
pack([1, 3, 0, 0, 0], [5, 4, 3, 2, 1]); // => 42
And now backwards:
unpack(42, [10, 10]); // => [4, 2]
unpack(42, [5, 4, 3, 2, 1]); // => [1, 3, 0, 0, 0]
This is so beautiful. Now let's apply this parametric number system to the problem of permutations. We'll consider length 2 permutations of A, B, C, D. What's the total number of them? Let's see: first we draw one of the 4 items, then one of the remaining 3, that's 4 * 3 = 12 ways to draw 2 items. These 12 ways can be packed into integers [0..11]. So, let's pretend we've packed them already, and try unpacking:
for (var i = 0; i < 12; i++) {
    console.log(unpack(i, [4, 3]));
}
// [0, 0], [0, 1], [0, 2],
// [1, 0], [1, 1], [1, 2],
// [2, 0], [2, 1], [2, 2],
// [3, 0], [3, 1], [3, 2]
These numbers represent choices, not indexes in the original array. [0, 0] doesn't mean taking A, A, it means taking item #0 from A, B, C, D (that's A) and then item #0 from the remaining list B, C, D (that's B). And the resulting permutation is A, B.
Another example: [3, 2] means taking item #3 from A, B, C, D (that's D) and then item #2 from the remaining list A, B, C (that's C). And the resulting permutation is D, C.
This mapping is called Lehmer code. Let's map all these Lehmer codes to permutations:
AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC
That's exactly what we need. But if you look at the unpack function you'll notice that it produces digits from right to left (to reverse the actions of pack). The choice from 3 gets unpacked before the choice from 4. That's unfortunate, because we want to choose from 4 elements before choosing from 3. Without being able to do so we have to compute the Lehmer code first, accumulate it into a temporary array, and then apply it to the array of items to compute the actual permutation.
But if we don't care about the lexicographic order, we can pretend that we want to choose from 3 elements before choosing from 4. Then the choice from 4 will come out of unpack first. In other words, we'll use unpack(n, [3, 4]) instead of unpack(n, [4, 3]). This trick allows us to compute the next digit of the Lehmer code and immediately apply it to the list. And that's exactly how nth_permutation() works.
One last thing I want to mention is that unpack(i, [4, 3]) is closely related to the factorial number system. Look at that first tree again: if we want permutations of length 2 without duplicates, we can just skip every second permutation index. That'll give us 12 permutations of length 4, which can be trimmed to length 2.
for (var i = 0; i < 12; i++) {
    var lehmer = unpack(i * 2, [4, 3, 2, 1]); // Factorial number system
    console.log(lehmer.slice(0, 2));
}
It depends on the way you "sort" your permutations (lexicographic order for example).
One way to do it is the factorial number system: it gives you a bijection between [0, n!) and all the permutations.
Then for any number i in [0, n!) you can compute the i-th permutation without computing the others.
This factorial writing is based on the fact that any number in [0, n!) can be written as:
SUM(a_i * i!, for i in range [0, n-1]) where 0 <= a_i <= i
(it's pretty similar to base decomposition)
for more information on this decomposition, have a look at this thread : https://math.stackexchange.com/questions/53262/factorial-decomposition-of-integers
hope it helps
As stated in this Wikipedia article, this approach is equivalent to computing the Lehmer code:
An obvious way to generate permutations of n is to generate values for
the Lehmer code (possibly using the factorial number system
representation of integers up to n!), and convert those into the
corresponding permutations. However the latter step, while
straightforward, is hard to implement efficiently, because it requires
n operations each of selection from a sequence and deletion from it,
at an arbitrary position; of the obvious representations of the
sequence as an array or a linked list, both require (for different
reasons) about n²/4 operations to perform the conversion. With n
likely to be rather small (especially if generation of all
permutations is needed) that is not too much of a problem, but it
turns out that both for random and for systematic generation there are
simple alternatives that do considerably better. For this reason it
does not seem useful, although certainly possible, to employ a special
data structure that would allow performing the conversion from Lehmer
code to permutation in O(n log n) time.
So the best you can do for a set of n elements is O(n ln(n)) with an adapted data structure.
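For illustration, here is a small Python sketch of that factorial decomposition, using the straightforward select-and-delete conversion the quoted paragraph describes (so this is the simple O(n^2) variant, not the optimized one):

from math import factorial

def ith_permutation(elements, index):
    # index-th (0-based) permutation of `elements` in lexicographic order
    elements = sorted(elements)   # work on a sorted copy
    result = []
    for pos in range(len(elements), 0, -1):
        # factorial-base digit for this position: 0 <= digit <= pos - 1
        digit, index = divmod(index, factorial(pos - 1))
        result.append(elements.pop(digit))  # select and delete
    return result

print(ith_permutation([0, 1, 2, 3], 23))  # -> [3, 2, 1, 0], the last of the 4! permutations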
Here's an algorithm to convert between permutations and ranks in linear time. However, the ranking it uses is not lexicographic. It's weird, but consistent. I'm going to give two functions, one that converts from a rank to a permutation, and one that does the inverse.
First, to unrank (go from rank to permutation)
Initialize:
  n = length(permutation)
  r = desired rank
  p = identity permutation of n elements [0, 1, ..., n-1]

unrank(n, r, p)
  if n > 0 then
    swap(p[n-1], p[r mod n])
    unrank(n-1, floor(r/n), p)
  fi
end

Next, to rank:

Initialize:
  p = input permutation
  q = inverse of the input permutation (computed in linear time: q[p[i]] = i for 0 <= i < n)
  n = length(p)

rank(n, p, q)
  if n = 1 then return 0 fi
  s = p[n-1]
  swap(p[n-1], p[q[n-1]])
  swap(q[s], q[n-1])
  return s + n * rank(n-1, p, q)
end
The running time of both of these is O(n).
There's a nice, readable paper explaining why this works: Ranking & Unranking Permutations in Linear Time, by Myrvold & Ruskey, Information Processing Letters Volume 79, Issue 6, 30 September 2001, Pages 281–284.
http://webhome.cs.uvic.ca/~ruskey/Publications/RankPerm/MyrvoldRuskey.pdf
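For reference, a direct Python transcription of the pseudocode above (a sketch; 0-based, with the arrays modified in place as in the paper):

def unrank(n, r, p):
    # p starts as the identity permutation [0, 1, ..., n-1]
    if n > 0:
        p[n - 1], p[r % n] = p[r % n], p[n - 1]
        unrank(n - 1, r // n, p)

def rank(n, p, q):
    # q is the inverse of p: q[p[i]] == i; both are modified in place
    if n == 1:
        return 0
    s = p[n - 1]
    p[n - 1], p[q[n - 1]] = p[q[n - 1]], p[n - 1]
    q[s], q[n - 1] = q[n - 1], q[s]
    return s + n * rank(n - 1, p, q)

p = list(range(4))
unrank(4, 5, p)                       # build the permutation with rank 5 (in this ordering)
q = [0] * len(p)
for i, v in enumerate(p):
    q[v] = i                          # inverse permutation
print(p, rank(4, list(p), list(q)))   # -> [2, 0, 3, 1] 5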
Here is a short and very fast (linear in the number of elements) solution in Python, working for any list of elements (the first 13 letters in the example below):
from math import factorial

def ithPerm(n, elems):  # with n from 0
    if len(elems) == 1:
        return elems[0]
    sizeGroup = factorial(len(elems) - 1)
    q, r = divmod(n, sizeGroup)
    v = elems[q]
    elems.remove(v)
    return v + ", " + ithPerm(r, elems)
Examples :
letters = ['a','b','c','d','e','f','g','h','i','j','k','l','m']
ithPerm(0,letters[:]) #--> a, b, c, d, e, f, g, h, i, j, k, l, m
ithPerm(4,letters[:]) #--> a, b, c, d, e, f, g, h, i, j, m, k, l
ithPerm(3587542868,letters[:]) #--> h, f, l, i, c, k, a, e, g, m, d, b, j
Note: I give letters[:] (a copy of letters) and not letters because the function modifies its parameter elems (removes chosen element)
The following code computes the kth permutation for a given n.
E.g., for n = 3, the various permutations are:
123
132
213
231
312
321
If k=5, return 312.
In other words, it gives the kth lexicographical permutation.
public static String getPermutation(int n, int k) {
    char temp[] = IntStream.range(1, n + 1).mapToObj(i -> "" + i).collect(Collectors.joining()).toCharArray();
    return getPermutationUTIL(temp, k, 0);
}

private static String getPermutationUTIL(char temp[], int k, int start) {
    if (k == 1)
        return new String(temp);
    int p = factorial(temp.length - start - 1);
    int q = (int) Math.floor(k / p);
    if (k % p == 0)
        q = q - 1;
    if (p <= k) {
        char a = temp[start + q];
        for (int j = start + q; j > start; j--)
            temp[j] = temp[j - 1];
        temp[start] = a;
    }
    return k - p >= 0 ? getPermutationUTIL(temp, k - (q * p), start + 1) : getPermutationUTIL(temp, k, start + 1);
}

private static void swap(char[] arr, int j, int i) {
    char temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

private static int factorial(int n) {
    return n == 0 ? 1 : (n * factorial(n - 1));
}
It is calculable. This is C# code that does it for you.
using System;
using System.Collections.Generic;

namespace WpfPermutations
{
    public class PermutationOuelletLexico3<T>
    {
        // ************************************************************************
        private T[] _sortedValues;
        private bool[] _valueUsed;

        public readonly long MaxIndex; // long to support 20! or less

        // ************************************************************************
        public PermutationOuelletLexico3(T[] sortedValues)
        {
            if (sortedValues.Length <= 0)
            {
                throw new ArgumentException("sortedValues.Length should be greater than 0");
            }

            _sortedValues = sortedValues;
            Result = new T[_sortedValues.Length];
            _valueUsed = new bool[_sortedValues.Length];

            MaxIndex = Factorial.GetFactorial(_sortedValues.Length);
        }

        // ************************************************************************
        public T[] Result { get; private set; }

        // ************************************************************************
        /// <summary>
        /// Return the permutation relative to the index received, according to
        /// _sortedValues.
        /// Sort index is 0 based and should be less than MaxIndex. Otherwise you get an exception.
        /// </summary>
        /// <param name="sortIndex"></param>
        /// <param name="result">Value is not used as input, only as output. Re-use buffer in order to save memory</param>
        /// <returns></returns>
        public void GetValuesForIndex(long sortIndex)
        {
            int size = _sortedValues.Length;

            if (sortIndex < 0)
            {
                throw new ArgumentException("sortIndex should be greater or equal to 0.");
            }

            if (sortIndex >= MaxIndex)
            {
                throw new ArgumentException("sortIndex should be less than factorial(the length of items)");
            }

            for (int n = 0; n < _valueUsed.Length; n++)
            {
                _valueUsed[n] = false;
            }

            long factorielLower = MaxIndex;
            for (int index = 0; index < size; index++)
            {
                long factorielBigger = factorielLower;
                factorielLower = Factorial.GetFactorial(size - index - 1); // factorielBigger / inverseIndex;

                int resultItemIndex = (int)(sortIndex % factorielBigger / factorielLower);

                int correctedResultItemIndex = 0;
                for (;;)
                {
                    if (!_valueUsed[correctedResultItemIndex])
                    {
                        resultItemIndex--;
                        if (resultItemIndex < 0)
                        {
                            break;
                        }
                    }
                    correctedResultItemIndex++;
                }

                Result[index] = _sortedValues[correctedResultItemIndex];
                _valueUsed[correctedResultItemIndex] = true;
            }
        }

        // ************************************************************************
        /// <summary>
        /// Calc the index, relative to _sortedValues, of the permutation received
        /// as argument. Returned index is 0 based.
        /// </summary>
        /// <param name="values"></param>
        /// <returns></returns>
        public long GetIndexOfValues(T[] values)
        {
            int size = _sortedValues.Length;
            long valuesIndex = 0;

            List<T> valuesLeft = new List<T>(_sortedValues);
            for (int index = 0; index < size; index++)
            {
                long indexFactorial = Factorial.GetFactorial(size - 1 - index);

                T value = values[index];
                int indexCorrected = valuesLeft.IndexOf(value);
                valuesIndex = valuesIndex + (indexCorrected * indexFactorial);
                valuesLeft.Remove(value);
            }
            return valuesIndex;
        }

        // ************************************************************************
    }
}
If you store all the permutations in memory, for example in an array, you should be able to bring them back out one at a time in O(1) time.
This does mean you have to store all the permutations, so if computing all permutations takes a prohibitively long time, or storing them takes a prohibitively large space then this may not be a solution.
My suggestion would be to try it anyway, and come back if it is too big/slow - there's no point looking for a "clever" solution if a naive one will do the job.

Maximum sum such that no two elements are adjacent

The available solution everywhere is to keep an include sum and an exclude sum; at the end, the max of these two gives the output.
Initially I had difficulty understanding this algorithm, and I thought: why not go a simpler way?
Algo:
Loop over the array, increasing the array pointer two at a time.
Calculate the sum of the odd-positioned elements in the array.
Calculate the sum of the even-positioned elements.
At the end, take the max of these two sums.
That way, I think the complexity will be reduced to half, O(n/2).
Is this algo correct?
It's a case of dynamic programming. The algorithm is:
Do not take (sum up) any non-positive items
For positive numbers, split the problem in two: try taking and skipping the item, and return the maximum of these choices:
Let's show the 2nd step. Imagine we are given:
[1, 2, 3, 4, 5, 6, 10, 125, -8, 9]
1 is positive, that's why
take_sum = 1 + max_sum([3, 4, 5, 6, 10, 125, -8, 9]) // we take "1"
skip_sum = max_sum([2, 3, 4, 5, 6, 10, 125, -8, 9])  // we skip "1"
max_sum = max(take_sum, skip_sum)
C# implementation (the simplest code in order to show the naked idea, no further optimization):
private static int BestSum(int[] array, int index) {
    if (index >= array.Length)
        return 0;
    if (array[index] <= 0)
        return BestSum(array, index + 1);

    int take = array[index] + BestSum(array, index + 2);
    int skip = BestSum(array, index + 1);

    return Math.Max(take, skip);
}

private static int BestSum(int[] array) {
    return BestSum(array, 0);
}
Test:
Console.WriteLine(BestSum(new int[] { 1, -2, -3, 100 }));
Console.WriteLine(BestSum(new int[] { 100, 8, 10, 20, 7 }));
Outcome:
101
120
Please check that your initial algorithm returns 98 and 117, which are suboptimal sums.
Edit: In real life you may want to add some optimizations, e.g. memoization and special case tests:
private static Dictionary<int, int> s_Memo = new Dictionary<int, int>();

private static int BestSum(int[] array, int index) {
    if (index >= array.Length)
        return 0;

    int result;
    if (s_Memo.TryGetValue(index, out result)) // <- Memoization
        return result;

    if (array[index] <= 0)
        return BestSum(array, index + 1);

    // Always take, when the last item to choose or when followed by non-positive item
    if (index >= array.Length - 1 || array[index + 1] <= 0) {
        result = array[index] + BestSum(array, index + 2);
    }
    else {
        int take = array[index] + BestSum(array, index + 2);
        int skip = BestSum(array, index + 1);
        result = Math.Max(take, skip);
    }

    s_Memo.Add(index, result); // <- Memoization
    return result;
}

private static int BestSum(int[] array) {
    s_Memo.Clear();
    return BestSum(array, 0);
}
Test:
using System.Linq;
...

Random gen = new Random(0); // fixed seed 0, so the result is reproducible

int[] test = Enumerable
    .Range(1, 10000)
    .Select(i => gen.Next(100))
    .ToArray();

int evenSum = test.Where((v, i) => i % 2 == 0).Sum();
int oddSum = test.Where((v, i) => i % 2 != 0).Sum();
int suboptimalSum = Math.Max(evenSum, oddSum); // <- Your initial algorithm

int result = BestSum(test);

Console.WriteLine(
    $"odd: {oddSum} even: {evenSum} suboptimal: {suboptimalSum} actual: {result}");
Outcome:
odd: 246117 even: 247137 suboptimal: 247137 actual: 290856
The dynamic programming include/exclude approach is correct. Your algorithm would not work for test cases like 3 2 7 10: there the two elements we take are 3 and 10, giving a sum of 13, instead of 3,7 or 2,10. I hope you see what I am saying; for further clarity, code is below.
Java Implementation
public int maxSum(int arr[]) { // array must contain +ve elements only
    int excl = 0;
    int incl = arr[0];
    for (int i = 1; i < arr.length; i++) {
        int temp = incl;
        incl = Math.max(excl + arr[i], incl);
        excl = temp;
    }
    return incl;
}

Pseudocode - Arrays - Maxrun

I'm trying to solve a problem but I have difficulties with algorithms.
I have to write pseudocode for an iterative algorithm maxRun(A) that takes an array A of integers as input and returns the maximal length of a run in A.
The subarray A[k...l] is a run if A[j] <= A[j + 1] for all j where k <= j < l. So it is a non-decreasing segment of A.
Ex.: for A = [1,5,2,3,4,1], the max length would be 3 ([2,3,4]).
Thanks.
Simple Java implementation:
public class FindRun {

    public static int maxRun(int[] a) {
        int max = 0;
        int index = 0;
        int previous = a[0] + 1;
        int run = 0;
        while (index < a.length) {
            if (a[index] >= previous) {
                run++;
            } else {
                max = Math.max(max, run);
                run = 1;
            }
            previous = a[index];
            index++;
        }
        return Math.max(max, run);
    }

    public static void main(String[] args) {
        System.out.println(maxRun(new int[] { 1, 5, 2, 3, 4, 1 }));
    }
}
Here's a solution similar to @Michael's, in Ruby:
a = [1,5,2,3,4,1]
r = [1]
(1...a.size).each { |i| r << ((a[i] == a[i-1] + 1) ? r[i-1] + 1 : 1) }
r.max #=> 3
r.index(r.max) #=> 4
indicating that the maximum run is of length 3 and ends at offset 4; that is, the run 2,3,4. I will now explain the algorithm I used. For those who don't know Ruby, this will also give you a taste of the language:
r = [1] creates an array with one element, whose value is 1. This is read, "The longest run ending at offset 0 is of length 1."
(1...a.size) is the sequence 1, 2, 3, 4, 5. Three dots between 1 and a.size means the sequence ends with a.size - 1, which is 5.
each causes the following block, enclosed by {} to be executed once for each element of the sequence. The block variable i (in |i|) represents the sequence element.
r << x means add x to the end of the array r.
the expression to the right of r << says, "if the element of a at index i is one greater than the element of a at index i-1, then the length of the run ending at index i is one greater than the length of the run ending at index i-1; else, a new run begins, whose length at offset i is 1."
After each is finished:
# r => [1, 1, 1, 2, 3, 1]
All that is required now is to find the element of r whose value is greatest:
r.max #=> 3
and the associated index:
r.index(r.max) #=> 4
Actually, the code above would more typically be written like this:
r = (1...a.size).each_with_object([1]) { |i,r| r << (a[i] == a[i-1]+1 ? r[i-1]+1 : 1) }
start, length = r.index(r.max) + 1 - r.max, r.max #=> 2, 3
Alternatively, we could have made r a hash (call it h) rather than an array, and written:
(1...a.size).each_with_object({0 => 1}) {|i,h|
h[i] = a[i] == a[i-1]+1 ? h[i-1]+1 : 1}.max_by {|_,v| v} #=> [4, 3]
