Count most frequent element in an array in Minizinc

It's a very simple thing that I want to do in Minizinc. I have an array of integer values, and I want to know the number of times that the most common value in it occurs. I can't figure out how to do that. I hope someone can help.

I don't know if this is the most efficient method, but it works. For each element in the array, you sum the number of times its value appears in the array and store that count in an auxiliary array; then you take the maximum value in the auxiliary array. In the example, 14 appears 3 times, so repeats holds 3 at every position that contains 14.
At the end I added a one-line version of everything above: instead of storing the array of repeats, it generates the array in place. This is the line that defines max_repeats.
% count the number of times that the most common value is repeated in an array
% As an example lets make a 7 element array
% size
int : n = 7;
% index set
set of int : SET = 1..n;
% the values
array [SET] of int : x = [15,14,39,23,14,14,8];
% auxiliary variable to carry the count
array [SET] of var int : repeats;
% we will count the number of times that value repeats
constraint forall(i in SET)(repeats[i] = sum(j in SET)(x[i] = x[j]) );
% the value of the most repeated element in the array
var int : value;
% if the number of repeats of that element is the maximum
% then value is equal to that element
constraint forall(i in SET)(repeats[i] = max(repeats) -> value = x[i]);
% this does the same but in one line
var int : max_repeats = max([sum(j in SET)(x[i] = x[j]) | i in SET]);
solve satisfy;
output ["Original values " ++ show(x) ++ "\n"] ++
["Number of repeats of each element " ++ show(repeats) ++ "\n"] ++
["Maximum number of repeats : " ++ show(max(repeats))];
Original values [15, 14, 39, 23, 14, 14, 8]
Number of repeats of each element [1, 3, 1, 1, 3, 3, 1]
Maximum number of repeats : 3
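For readers who want to check the numbers outside MiniZinc, the same pairwise counting idea can be sketched in plain Python. This is only an illustration for a fixed list of values, not a constraint model; the function name count_repeats is made up for this example.

```python
def count_repeats(x):
    """For each element, count how many entries of x hold the same value."""
    return [sum(1 for other in x if other == value) for value in x]


x = [15, 14, 39, 23, 14, 14, 8]
repeats = count_repeats(x)   # mirrors the repeats array of the model
max_repeats = max(repeats)   # mirrors max(repeats)
print(repeats)      # [1, 3, 1, 1, 3, 3, 1]
print(max_repeats)  # 3
```

Like the model, this compares every pair of elements, so it takes quadratic time in the length of the array.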

The "classical" way of solving this problem is to use the global constraint global_cardinality together with max.
Below is one way to model the problem using these constraints; it also shows which number is the most frequent.
The drawback of this approach is that one has to create a new array gcc (for "global cardinality count") that holds the number of occurrences of each number in 0..upb (where upb is the upper bound of the array a), and that array might be quite large if a contains large numbers. Also, one has to be a little careful about the indices, e.g. not forget to include 0 in gcc.
The advantage of this approach, apart from the fact that it might be implemented efficiently in a solver, is that one can add extra constraints on the gcc array. Here I added a feature that shows the number that is most frequent (using arg_max(gcc)); there might be more than one such number, in which case the model gives multiple solutions.
include "globals.mzn";
int: n = 7;
array[1..n] of int: a = [15, 14, 39, 23, 14, 14, 8];
% array[1..n] of var 0..29: a; % using decision variables
% upper value of a
int: upb = ub_array(a);
% Number of occurrences in a
array[0..upb] of var 0..n: gcc;
% max number of occurrences (at most n, the length of a)
var 0..n: z = max(gcc);
% The value of the max number of occurrences
var 0..upb: max_val = arg_max(gcc)-1;
solve satisfy;
constraint
% count the number of occurrences in a
global_cardinality(a, array1d(0..upb,[i | i in 0..upb]), gcc)
;
output [
"a: \(a)\n",
"upb: \(upb)\n",
"gcc: \(gcc)\n",
"z: \(z)\n",
"max_val: \(max_val)\n",
"ub_array(a): \(lb_array(a))..\(ub_array(a))\n",
];
Here is the output of this model:
a: [15, 14, 39, 23, 14, 14, 8]
upb: 39
gcc: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
z: 3
max_val: 14
ub_array(a): 8..39
----------
==========
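As a rough cross-check of the gcc idea outside a solver, the counting array can be built directly in Python. This is a sketch that assumes a non-empty list of non-negative integers; occurrence_counts is a hypothetical helper name, not part of MiniZinc.

```python
def occurrence_counts(a):
    """Return counts where counts[v] is the number of times v occurs in a.

    Assumes a is non-empty and holds non-negative integers only, so the
    result is indexed 0..max(a), like the gcc array in the model.
    """
    counts = [0] * (max(a) + 1)
    for v in a:
        counts[v] += 1
    return counts


a = [15, 14, 39, 23, 14, 14, 8]
gcc = occurrence_counts(a)
z = max(gcc)            # highest number of occurrences
max_val = gcc.index(z)  # first value that reaches that count
print(z, max_val)       # 3 14
```

The list has max(a) + 1 entries, which illustrates the drawback mentioned above: a single large value in a makes the counting array large.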

Related

Insert a smallest possible positive integer into an array of unique integers [duplicate]

I am trying to tackle this interview question: given an array of unique positive integers, find the smallest possible number to insert into it so that every integer is still unique. The algorithm should be in O(n) and the additional space complexity should be constant. Assigning values in the array to other integers is allowed.
For example, for an array [5, 3, 2, 7], output should be 1. However for [5, 3, 2, 7, 1], the answer should then be 4.
My first idea is to sort the array, then go through the array again to find where the continuous sequence breaks, but sorting needs more than O(n).
Any ideas would be appreciated!

My attempt:
The array A is assumed 1-indexed. We call an active value one that is nonzero and does not exceed n.
Scan the array until you find an active value, let A[i] = k (if you can't find one, stop);
While A[k] is active,
Move A[k] to k while clearing A[k];
Continue from i until you reach the end of the array.
After this pass, all array entries corresponding to some integer in the array are cleared.
Find the first nonzero entry, and report its index.
E.g.
[5, 3, 2, 7], clear A[3]
[5, 3, 0, 7], clear A[2]
[5, 0, 0, 7], done
The answer is 1.
E.g.
[5, 3, 2, 7, 1], clear A[5],
[5, 3, 2, 7, 0], clear A[1]
[0, 3, 2, 7, 0], clear A[3],
[0, 3, 0, 7, 0], clear A[2],
[0, 0, 0, 7, 0], done
The answer is 4.
The behavior of the first pass is linear because every number is looked at once (and immediately cleared), and i increases regularly.
The second pass is a linear search.
A= [5, 3, 2, 7, 1]
N= len(A)
print(A)
for i in range(N):
    k= A[i]
    while k > 0 and k <= N:
        A[k-1], k = 0, A[k-1] # -1 for 0-based indexing
        print(A)
[5, 3, 2, 7, 1]
[5, 3, 2, 7, 0]
[0, 3, 2, 7, 0]
[0, 3, 2, 7, 0]
[0, 3, 0, 7, 0]
[0, 0, 0, 7, 0]
[0, 0, 0, 7, 0]
Update:
Based on גלעד ברקן's idea, we can mark the array elements in a way that does not destroy the values. Then you report the index of the first unmarked.
A= [5, 3, 2, 7, 1] # start again from the original values
N= len(A)
print(A)
for a in A:
    a= abs(a)
    if a <= N:
        A[a-1]= - A[a-1] # -1 for 0-based indexing
    print(A)
[5, 3, 2, 7, 1]
[5, 3, 2, 7, -1]
[5, 3, -2, 7, -1]
[5, -3, -2, 7, -1]
[5, -3, -2, 7, -1]
[-5, -3, -2, 7, -1]

From the question description: "Assigning values in the array to other integers is allowed." This is O(n) space, not constant.
Loop over the array and multiply A[ |A[i]| - 1 ] by -1 for every |A[i]| ≤ array length. Loop a second time and output (the index + 1) of the first cell that is not negative, or (array length + 1) if they are all marked. This takes advantage of the fact that there cannot be more than (array length) unique integers in the array.

I will use 1-based indexing.
The idea is to reuse input collection and arrange to swap integer i at ith place if its current position is larger than i. This can be performed in O(n).
Then on second iteration, you find the first index i not containing i, which is again O(n).
In Smalltalk, implemented in Array (self is the array):
firstMissing
self size to: 1 by: -1 do: [:i |
[(self at: i) < i] whileTrue: [self swap: i with: (self at: i)]].
1 to: self size do: [:i |
(self at: i) = i ifFalse: [^i]].
^self size + 1
So we have two loops in O(n), but we also have another loop inside the first loop (whileTrue:). So is the first loop really O(n)?
Yes, because each element will be swapped at most once, since it arrives at its right place. We see that the cumulative number of swaps is bounded by the array size, so the overall cost of the first loop is at most 2*n and the total cost including the last search is at most 3*n, still O(n).
You also see that we don't care to swap case of (self at: i) > i and: [(self at:i) <= self size], why? Because we are sure that there will be a smaller missing element in this case.
A small test case:
| trial |
trial := (1 to: 100100) asArray shuffled first: 100000.
self assert: trial copy firstMissing = trial sorted firstMissing.

You could do the following.
Find the maximum (m), sum of all elements (s), number of elements (n)
There are m-n elements missing, their sum is q = sum(1..m) - s - there is a closed-form solution for the sum
If you are missing only one integer, you're done - report q
If you are missing more than one (m-n), you realize that the sum of the missing integers is q, and at least one of them will be smaller than q/(m-n)
You start from the top, except you will only take into account integers smaller than q/(m-n) - this will be the new m, only elements below that maximum contribute to the new s and n. Do this until you are left with only one missing integer.
Still, this may not be linear time, I'm not sure.

EDIT: you should use the candidate plus half the input size as a pivot to reduce the constant factor here – see Daniel Schepler’s comment – but I haven’t had time to get it working in the example code yet.
This isn’t optimal – there’s a clever solution being looked for – but it’s enough to meet the criteria :)
1. Define the smallest possible candidate so far: 1.
2. If the size of the input is 0, the smallest possible candidate is a valid candidate, so return it.
3. Partition the input into < pivot and > pivot (with median of medians pivot, like in quicksort).
4. If the size of ≤ pivot is less than pivot itself, there’s a free value in there, so start over at step 2 considering only the < pivot partition.
5. Otherwise (when it’s = pivot), the new smallest possible candidate is the pivot + 1. Start over at step 2 considering only the > pivot partition.
I think that works…?
'use strict';
const swap = (arr, i, j) => {
[arr[i], arr[j]] = [arr[j], arr[i]];
};
// dummy pivot selection, because this part isn’t important
const selectPivot = (arr, start, end) =>
start + Math.floor(Math.random() * (end - start));
const partition = (arr, start, end) => {
let mid = selectPivot(arr, start, end);
const pivot = arr[mid];
swap(arr, mid, start);
mid = start;
for (let i = start + 1; i < end; i++) {
if (arr[i] < pivot) {
mid++;
swap(arr, i, mid);
}
}
swap(arr, mid, start);
return mid;
};
const findMissing = arr => {
let candidate = 1;
let start = 0;
let end = arr.length;
for (;;) {
if (start === end) {
return candidate;
}
const pivotIndex = partition(arr, start, end);
const pivot = arr[pivotIndex];
if (pivotIndex + 1 < pivot) {
end = pivotIndex;
} else {
//assert(pivotIndex + 1 === pivot);
candidate = pivot + 1;
start = pivotIndex + 1;
}
}
};
const createTestCase = (size, max) => {
if (max < size) {
throw new Error('size must be < max');
}
const arr = Array.from({length: max}, (_, i) => i + 1);
const expectedIndex = Math.floor(Math.random() * size);
arr.splice(expectedIndex, 1 + Math.floor(Math.random() * (max - size - 1)));
for (let i = 0; i < size; i++) {
let j = i + Math.floor(Math.random() * (size - i));
swap(arr, i, j);
}
return {
input: arr.slice(0, size),
expected: expectedIndex + 1,
};
};
for (let i = 0; i < 5; i++) {
const test = createTestCase(1000, 1024);
console.log(findMissing(test.input), test.expected);
}

The correct method I almost got on my own, but I had to search for it, and I found it here: https://www.geeksforgeeks.org/find-the-smallest-positive-number-missing-from-an-unsorted-array/
Note: This method is destructive to the original data
Nothing in the original question said you could not be destructive.
I will explain what you need to do now.
The basic "aha" here is that the first missing number must come within the first N positive numbers, where N is the length of the array.
Once you understand this and realize you can use the values in the array itself as markers, you just have one problem you need to address: Does the array have numbers less than 1 in it? If so we need to deal with them.
Dealing with 0s or negative numbers can be done in O(n) time. Get two integers, one for our current value, and one for the end of the array. As we scan through, if we find a 0 or negative number, we perform a swap using the third integer, with the final value in the array. Then we decrement our end of an array pointer. We continue until our current pointer is past the end of the array pointer.
Code example:
// skip non-positive values that already sit at the end
while (end >= 0 && list[end] < 1) {
    end--;
}
while (cur < end) {
    if (list[cur] < 1) {
        swap(list[cur], list[end]);
        while (list[end] < 1) {
            end--;
        }
    }
    cur++;
}
Now we have the end of the array, and a truncated array. From here we need to see how we can use the array itself. Since all numbers that we care about are positive, and we have a pointer to the position of how many of them there are, we can simply multiply one by -1 to mark that place as present if there was a number in the array there.
e.g. [5, 3, 2, 7, 1] when we read 3, we change it to [5, 3, -2, 7, 1]
Code example:
for (cur = 0; cur <= end; cur++) {
    // values larger than the number of positive entries (end + 1) do not matter
    if (abs(list[cur]) <= end + 1) {
        list[abs(list[cur]) - 1] *= -1;
    }
}
Now, note: You need to read the absolute value of the integer in the position because it might be changed to be negative. Also note, if an integer is greater than your end of list pointer, do not change anything as that integer will not matter.
Finally, once you have read all the positive values, iterate through them to find the first one that is currently positive. This place represents your first missing number.
Step 1: Segregate 0 and negative numbers from your list to the right. O(n)
Step 2: Using the end of list pointer iterate through the entire list marking
relevant positions negative. O(n-k)
Step 3: Scan the numbers for the position of the first non-negative number. O(n-k)
Space Complexity: The original list is not counted, I used 3 integers beyond that. So
it is O(1)
One thing I should mention is the list [5, 4, 2, 1, 3] would end up [-5, -4, -2, -1, -3] so in this case, you would choose the first number after the end position of the list, or 6 as your result.
Code example for step 3:
for (cur = 0; cur <= end; cur++) {
    if (list[cur] > 0) {
        break;
    }
}
print(cur + 1); // positions are 0-based, the values start at 1
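Putting the three steps together, a minimal Python sketch could look like the following. It is an illustration rather than the code from the linked article: the function name first_missing_positive is made up, the list is modified in place, and a small guard against marking a cell twice is added so that repeated values do not undo a mark.

```python
def first_missing_positive(values):
    """Return the smallest positive integer not in values (mutates values)."""
    # Step 1: move zeros and negative numbers to the right end.
    cur, end = 0, len(values) - 1
    while cur <= end:
        if values[cur] < 1:
            values[cur], values[end] = values[end], values[cur]
            end -= 1
        else:
            cur += 1
    count = end + 1  # the positive numbers now occupy values[0:count]

    # Step 2: for every value v in 1..count, make the cell at index v - 1 negative.
    for i in range(count):
        v = abs(values[i])
        if v <= count and values[v - 1] > 0:
            values[v - 1] = -values[v - 1]

    # Step 3: the first cell that is still positive marks the missing number.
    for i in range(count):
        if values[i] > 0:
            return i + 1
    return count + 1


print(first_missing_positive([5, 3, 2, 7, 1]))  # 4
print(first_missing_positive([5, 4, 2, 1, 3]))  # 6
```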

Use this short and sweet algorithm (1-based indexing):
A is [5, 3, 2, 7]
1- Define B with Length = A.Length; (O(1))
2- Initialize the cells of B with 1; (O(n))
3- For each item in A:
if (item <= B.Length) then B[item] = -1 (O(n))
4- The answer is the smallest index in B such that B[index] != -1, or B.Length + 1 if there is no such index (O(n))

Iterate over an array in a certain order, so that it is sampled fairly

I want to iterate over an array in a certain fashion:
Starting with the first and the last element of the array, the next element I want to visit is the one furthest from all previously visited elements.
For an array of length n+1, the sequence would be
0,
n,
n/2 (furthest from 0 and n),
n/4 and n*3/4 (furthest from all 3 previous indices),
n/8, n*3/8, n*5/8, n*7/8, (furthest from all 5 previous indices)
n*1/16, n*3/16, n*5/16, n*7/16, n*9/16, n*11/16, n*13/16, n*15/16
...
if n is not a power of two, then some of these numbers will have to be rounded up or down, but I am not sure how to avoid duplicates when rounding.
At the end I want an integer sequence that contains all the numbers between 0 and n exactly once. (For any n, not just powers of two)
Is there a name for this permutation?
How would a function that generates these numbers work?
I am looking for a function that can generate these numbers on-the-fly.
If there are a billion elements, I do not want to manage a giant list of all previously visited elements, or generate the whole permutation list in advance.
The idea is that I can abort the iteration once I have found an element that fits certain criteria, so I will in most cases not need the whole permutation sequence.
So I am looking for a function f(int currentIndex, int maxIndex) with the following properties:
To iterate over an array whose largest index is 8, I would call
f(0,8) returns 0, to get the index of the first element
f(1,8) returns 8
f(2,8) returns 4
f(3,8) returns 2
f(4,8) returns 6
f(5,8) returns 1
f(6,8) returns 3
f(7,8) returns 5
f(8,8) returns 7
(I am not quite sure how to extend this example to numbers that are not a power of two)
Is there a function with these properties?

The hopping about you describe is a feature of the Van der Corput sequence, as mentioned in a task I wrote on Rosetta Code.
I have an exact function to re-order an input sequence, but it needs arrays as large as the input array.
What follows is an approximate solution that yields indices one by one and only takes the length of the input array, then calculates the indices with constant memory.
The testing gives some indication of how "good" the routine is.
>>> from fractions import Fraction
>>> from math import ceil
>>>
>>> def vdc(n, base=2):
        vdc, denom = 0,1
        while n:
            denom *= base
            n, remainder = divmod(n, base)
            vdc += remainder / denom
        return vdc
>>> [vdc(i) for i in range(5)]
[0, 0.5, 0.25, 0.75, 0.125]
>>> def van_der_corput_index(sequence):
        lenseq = len(sequence)
        if lenseq:
            lenseq1 = lenseq - 1
            yield lenseq1 # last element
            for i in range(lenseq1):
                yield ceil(vdc(Fraction(i)) * lenseq1)
>>> seq = list(range(23))
>>> seq
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
>>> list(van_der_corput_index(seq))
[22, 0, 11, 6, 17, 3, 14, 9, 20, 2, 13, 7, 18, 5, 16, 10, 21, 1, 12, 7, 18, 4, 15]
>>> len(set(van_der_corput_index(seq)))
21
>>> from collections import Counter
>>>
>>> for listlen in (2, 3, 5, 7, 11, 13, 17, 19, 23,
                    29, 31, 37, 41, 43, 47, 53, 59, 61,
                    67, 71, 73, 79, 83, 89, 97, 1023,
                    1024, 4095, 4096, 2**16 - 1, 2**16):
        out = list(van_der_corput_index( list(range(listlen) )))
        outcount = Counter(out)
        if outcount and outcount.most_common(1)[0][1] > 1:
            print("Duplicates in %i leaving %i unique nums." % (listlen, len(outcount)))
        outlen = len(out)
        if outlen != listlen:
            print("Length change in %i to %i" % (listlen, outlen))
Duplicates in 23 leaving 21 unique nums.
Duplicates in 43 leaving 37 unique nums.
Duplicates in 47 leaving 41 unique nums.
Duplicates in 53 leaving 49 unique nums.
Duplicates in 59 leaving 55 unique nums.
Duplicates in 71 leaving 67 unique nums.
Duplicates in 79 leaving 69 unique nums.
Duplicates in 83 leaving 71 unique nums.
Duplicates in 89 leaving 81 unique nums.
>>> outlen
65536
>>> listlen
65536
>>>

Could you not use an array such that array[n][i]
such that
Array [0][i] = "1,2,3,4,5,6,7" 'start
Array [1][i] = "1,2,3,4" '1st gen split 1
Array [2][i] = "4,5,6,7" '1st gen split 2
Array [3][i] = "1,2" '2nd gen split 1 split 1
Array [4][i] = "3,4" '2nd gen split 1 split 2
Array [5][i] = "4,5" '2nd gen split 2 split 1
Array [6][i] = "6,7" '2nd gen split 2 split 2
'use dynamic iteration such that you know the size going into the array i.e. nextGen=Toint(Ubound(Array)/2)
If(
last(Array[n][i]) = first(Array[n+1][i]
then Pop(Array[n+1][i])
)

I see how to do this, but it's tricky to describe.. bear with me.
The key idea is to logically partition your array into two sets: One contains a number of elements equal to the greatest power of two still less than the size of the array, and the other contains everything else. (So, if your array holds 29 elements, you'd have one with 16 and the other with 13.) You want these to be mixed as fairly as possible, and you want:
A function to find the "Real" index of the i-th element of the first
logical set (equivalently: How many elements of the second set come before the i-th element of the first set)
A function to tell you whether some index i belongs
to the first or second logical set.
You then run the "Ideal" function you described over the first set (mapping with function 1, above), then do a single pass over the remaining elements. So long as you distribute fairly between the logical set, this will do as you describe.
To (logically) describe which indices belong to which partition: Call the size of the first logical partition k and the size of the second partition j. Assume that every element of the first set has j/k units of "credit" associated with it. Begin filling the true array with elements of the logical array, adding up credit as you go, but every time you would get to more than one unit of credit, place an element from the second array instead, and reduce the stored credit by one. This will fairly distribute exactly j elements from the second array between k elements of the first array. NOTE: You don't actually perform this calculation, it's just a logical definition.
With a little arithmetic, you can use this to implement the functions I described above. Before the i-th element of the first set will be exactly floor(i * j/k) elements of the second set. You only run the second function during the final pass, so you can run that exactly from the definition.
Does this make sense? I'm sure this will work, but it's difficult to describe.
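To make the index arithmetic concrete, here is a small Python sketch of the first function, under the assumption that elements of the first set are numbered from 0 and that the i-th one lands at position i + floor(i * j / k). The name first_set_positions is hypothetical, and the list is built only to inspect the result; in real use the formula would be evaluated on the fly.

```python
def first_set_positions(n):
    """Real positions of the k first-set elements in an array of n >= 2 items.

    k is the greatest power of two strictly less than n, and j = n - k is
    the size of the second set. Before the i-th first-set element there are
    floor(i * j / k) second-set elements.
    """
    k = 1
    while k * 2 < n:
        k *= 2
    j = n - k
    return [i + (i * j) // k for i in range(k)]


n = 29
first = first_set_positions(n)
second = sorted(set(range(n)) - set(first))  # everything else is the second set
print(len(first), len(second))  # 16 13
```

The positions are strictly increasing, so no index is assigned twice, and the indices left over are exactly the j elements visited in the final pass.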

Yes, it is called partitioning.
It is a very common technique for searching in an ordered array.
It is also used by the QuickSort algorithm.
It is mostly implemented as a recursive function that samples the "center" element, then recurses on the "left" part, then on the "right" part.
If the array is of length 1, sample it and don't recurse.
In the following example, I just search the array in the order you describe.
If the array were ordered, then after checking the first pivot I would skip checking either the right part or the left part, depending on the pivot value.
int partition(int* arr, int min, int max, int subject)
{ // [min, max] inclusive!
    int pivot = min + ((max - min) >> 1); // midpoint of [min, max]
    if(arr[pivot] == subject)
        return pivot;
    if(pivot > min)
    {
        int leftPart = partition(arr, min, pivot - 1, subject);
        if(leftPart >= 0)
            return leftPart;
    }
    if(max > pivot)
    {
        int rightPart = partition(arr, pivot + 1, max, subject);
        if(rightPart >= 0)
            return rightPart;
    }
    return -1; // not found
}
int myArr[10] = {4,8,11,7,2,88,42,6,5,11 };
int idxOf5 = partition(myArr, 0, 9, 5);

I was able to solve this myself, with the tips given by Paddy3118 and Edward Peters.
I now have a method that generates a Van der Corput permutation for a given range, with no duplicates and no missed values, and with constant and negligible memory requirements and good performance.
The method uses a c# iterable to generate the sequence on the fly.
The method VanDerCorputPermutation() takes two parameters, the upper exclusive bound of the range, and the base that should be used for generating the sequence. By default, base 2 is used.
If the range is not a power of the given base, then the next larger power is used internally, and all indices that would be generated outside the range are simply discarded.
Usage:
Console.WriteLine(string.Join("; ",VanDerCorputPermutation(8,2)));
// 0; 4; 2; 6; 1; 3; 5; 7
Console.WriteLine(string.Join("; ",VanDerCorputPermutation(9,2)));
// 0; 8; 4; 2; 6; 1; 3; 5; 7
Console.WriteLine(string.Join("; ",VanDerCorputPermutation(10,3)));
// 0; 9; 3; 6; 1; 2; 4; 5; 7; 8
Console.WriteLine(VanDerCorputPermutation(Int32.MaxValue,2).Count());
// 2147483647 (with constant memory usage)
foreach(int i in VanDerCorputPermutation(bigArray.Length))
{
// do stuff with bigArray[i]
}
for (int max = 0; max < 100000; max++)
{
for (int numBase = 2; numBase < 1000; numBase++)
{
var perm = VanDerCorputPermutation(max, numBase).ToList();
Debug.Assert(perm.Count==max);
Debug.Assert(perm.Distinct().Count()==max);
}
}
The code itself uses only integer arithmetic and very few divisions:
IEnumerable<int> VanDerCorputPermutation(int lessThan, int numBase = 2)
{
if (numBase < 2) throw new ArgumentException("numBase must be greater than 1");
// no index is less than zero
if (lessThan <= 0) yield break;
// always return the first element
yield return 0;
// find the smallest power-of-n that is big enough to generate all values
int power = 1;
while (power < lessThan / numBase + 1) power *= numBase;
// starting with the largest power-of-n, this loop generates all values between 0 and lessThan
// that are multiples of this power, and have not been generated before.
// Then the process is repeated for the next smaller power-of-n
while (power >= 1)
{
int modulo = 0;
for (int result = power; result < lessThan; result+=power)
{
if (result < power) break; // overflow, bigger than MaxInt
if (++modulo == numBase)
{
//we have used this result before, with a larger power
modulo = 0;
continue;
}
yield return result;
}
power /= numBase; // get the next smaller power-of-n
}
}
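For readers who do not use C#, the same loop structure can be sketched as a Python generator. This is a straightforward port under the assumption that arbitrary-precision integers make the overflow check unnecessary; the name van_der_corput_permutation simply mirrors the C# method.

```python
def van_der_corput_permutation(less_than, base=2):
    """Yield every integer in range(less_than) once, in Van der Corput order."""
    if base < 2:
        raise ValueError("base must be greater than 1")
    if less_than <= 0:
        return
    yield 0
    # Find a power of base large enough that base * power exceeds less_than.
    power = 1
    while power < less_than // base + 1:
        power *= base
    # For each power, from largest to smallest, emit its multiples that were
    # not already emitted as multiples of a larger power.
    while power >= 1:
        modulo = 0
        for result in range(power, less_than, power):
            modulo += 1
            if modulo == base:
                modulo = 0  # multiple of the next larger power, seen before
                continue
            yield result
        power //= base


print(list(van_der_corput_permutation(8)))      # [0, 4, 2, 6, 1, 3, 5, 7]
print(list(van_der_corput_permutation(10, 3)))  # [0, 9, 3, 6, 1, 2, 4, 5, 7, 8]
```

Because it is a generator, iteration can stop as soon as a suitable element is found, and no list of visited indices is kept.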

Is it possible to invert an array with constant extra space?

Let's say I have an array A with n unique elements on the range [0, n). In other words, I have a permutation of the integers [0, n).
Is possible to transform A into B using O(1) extra space (AKA in-place) such that B[A[i]] = i?
For example:
A B
[3, 1, 0, 2, 4] -> [2, 1, 3, 0, 4]

Yes, it is possible, with O(n^2) time algorithm:
Take element at index 0, then write 0 to the cell indexed by that element. Then use just overwritten element to get next index and write previous index there. Continue until you go back to index 0. This is cycle leader algorithm.
Then do the same starting from index 1, 2, ... But before doing any changes perform cycle leader algorithm without any modifications starting from this index. If this cycle contains any index below the starting index, just skip it.
Or this O(n^3) time algorithm:
Take element at index 0, then write 0 to the cell indexed by that element. Then use just overwritten element to get next index and write previous index there. Continue until you go back to index 0.
Then do the same starting from index 1, 2, ... But before doing any changes perform cycle leader algorithm without any modifications starting from all preceding indexes. If current index is present in any preceding cycle, just skip it.
I have written (slightly optimized) implementation of O(n^2) algorithm in C++11 to determine how many additional accesses are needed for each element on average if random permutation is inverted. Here are the results:
size accesses
2^10 2.76172
2^12 4.77271
2^14 6.36212
2^16 7.10641
2^18 9.05811
2^20 10.3053
2^22 11.6851
2^24 12.6975
2^26 14.6125
2^28 16.0617
While size grows exponentially, number of element accesses grows almost linearly, so expected time complexity for random permutations is something like O(n log n).

Inverting an array A requires us to find a permutation B which fulfills the requirement A[B[i]] == i for all i.
To build the inverse in-place, we have to swap elements and indices by setting A[A[i]] = i for each element A[i]. Obviously, if we simply iterated through A and performed the aforementioned replacement, we might overwrite upcoming elements in A and our computation would fail.
Therefore, we have to swap elements and indices along cycles of A by following c = A[c] until we reach our cycle's starting index c = i.
Every element of A belongs to one such cycle. Since we have no space to store whether or not an element A[i] has already been processed and needs to be skipped, we have to follow its cycle: If we reach an index c < i we would know that this element is part of a previously processed cycle.
This algorithm has a worst-case run-time complexity of O(n²), an average run-time complexity of O(n log n) and a best-case run-time complexity of O(n).
function invert(array) {
main:
for (var i = 0, length = array.length; i < length; ++i) {
// check if this cycle has already been traversed before:
for (var c = array[i]; c != i; c = array[c]) {
if (c <= i) continue main;
}
// Replacing each cycle element with its predecessors index:
var c_index = i,
c = array[i];
do {
var tmp = array[c];
array[c] = c_index; // replace
c_index = c; // move forward
c = tmp;
} while (i != c_index)
}
return array;
}
console.log(invert([3, 1, 0, 2, 4])); // [2, 1, 3, 0, 4]
Example for A = [1, 2, 3, 0] :
The first element 1 at index 0 belongs to the cycle of elements 1 - 2 - 3 - 0. Once we shift indices 0, 1, 2 and 3 along this cycle, we have completed the first step.
The next element 0 at index 1 belongs to the same cycle and our check tells us so in only one step (since it is a backwards step).
The same holds for the remaining elements 1 and 2.
In total, we perform 4 + 1 + 1 + 1 'operations'. This is the best-case scenario.

Implementation of this explanation in Python:
def inverse_permutation_zero_based(A):
    """
    Swap elements and indices along cycles of A by following `c = A[c]` until we reach
    our cycle's starting index `c = i`.
    Every element of A belongs to one such cycle. Since we have no space to store
    whether or not an element A[i] has already been processed and needs to be skipped,
    we have to follow its cycle: If we reach an index c < i we would know that this
    element is part of a previously processed cycle.
    Time Complexity: O(n*n), Space Complexity: O(1)
    """
    def cycle(i, A):
        """
        Replacing each cycle element with its predecessors index
        """
        c_index = i
        c = A[i]
        while True:
            temp = A[c]
            A[c] = c_index # replace
            c_index = c # move forward
            c = temp
            if i == c_index:
                break
    for i in range(len(A)):
        # check if this cycle has already been traversed before
        j = A[i]
        while j != i:
            if j <= i:
                break
            j = A[j]
        else:
            cycle(i, A)
    return A
>>> inverse_permutation_zero_based([3, 1, 0, 2, 4])
[2, 1, 3, 0, 4]

This can be done in O(n) time complexity and O(1) space if we try to store 2 numbers at a single position.
First, let's see how we can get 2 values from a single variable. Suppose we have a variable x and we want to get two values from it, 2 and 1. So,
x = n*1 + 2 , suppose n = 5 here.
x = 5*1 + 2 = 7
Now for 2, we can take remainder of x, ie, x%5. And for 1, we can take quotient of x, ie , x/5
and if we take n = 3
x = 3*1 + 2 = 5
x%3 = 5%3 = 2
x/3 = 5/3 = 1
We know here that the array contains values in range [0, n-1], so we can take the divisor as n, size of array. So, we will use the above concept to store 2 numbers at every index, one will represent old value and other will represent the new value.
A B
0 1 2 3 4 0 1 2 3 4
[3, 1, 0, 2, 4] -> [2, 1, 3, 0, 4]
.
a[0] = 3, that means, a[3] = 0 in our answer.
a[a[0]] = 2 //old
a[a[0]] = 0 //new
a[a[0]] = n* new + old = 5*0 + 2 = 2
a[a[i]] = n*i + a[a[i]]
And during array traversal, a[i] value can be greater than n because we are modifying it. So we will use a[i]%n to get the old value.
So the logic should be
a[a[i]%n] = n*i + a[a[i]%n]
Array -> 13 6 15 2 24
Now, to get the older values, take the remainder on dividing each value by n, and to get the new values, just divide each value by n, in this case, n=5.
Array -> 2 1 3 0 4
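The two passes described above can be sketched in Python as follows. The function name invert_in_place is made up for this illustration, and the sketch assumes that every cell can temporarily hold values up to n*n - 1, which Python integers allow but fixed-width integers may not.

```python
def invert_in_place(a):
    """Invert a permutation of 0..n-1 by packing old and new values per cell."""
    n = len(a)
    # Pass 1: each cell keeps its old value as the remainder modulo n and
    # receives the new value as a multiple of n.
    for i in range(n):
        old = a[i] % n
        a[old] += n * i
    # Pass 2: keep only the new values.
    for i in range(n):
        a[i] //= n
    return a


print(invert_in_place([3, 1, 0, 2, 4]))  # [2, 1, 3, 0, 4]
```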

The following approach optimizes the cycle walk by skipping cycles that have already been handled. Note that each element is 1-based, so convert accordingly when accessing the elements of the given array.
#include <cstdlib>
#include <iostream>
#include <vector>
using namespace std;
// helper function to traverse cycles
void cycle(int i, vector<int>& A) {
int cur_index = i+1, next_index = A[i];
while (next_index > 0) {
int temp = A[next_index-1];
A[next_index-1] = -(cur_index);
cur_index = next_index;
next_index = temp;
if (i+1 == abs(cur_index)) {
break;
}
}
}
void inverse_permutation(vector<int>& A) {
for (int i = 0; i < A.size(); i++) {
cycle(i, A);
}
for (int i = 0; i < A.size(); i++) {
A[i] = abs(A[i]);
}
for (int i = 0; i < A.size(); i++) {
cout<<A[i]<<" ";
}
}
int main(){
// vector<int> perm = {4,0,3,1,2,5,6,7,8};
vector<int> perm = {5,1,4,2,3,6,7,9,8};
//vector<int> perm = { 17,2,15,19,3,7,12,4,18,20,5,14,13,6,11,10,1,9,8,16};
// vector<int> perm = {4, 1, 2, 3};
// { 6,17,9,23,2,10,20,7,11,5,14,13,4,1,25,22,8,24,21,18,19,12,15,16,3 } =
// { 14,5,25,13,10,1,8,17,3,6,9,22,12,11,23,24,2,20,21,7,19,16,4,18,15 }
// vector<int> perm = {6, 17, 9, 23, 2, 10, 20, 7, 11, 5, 14, 13, 4, 1, 25, 22, 8, 24, 21, 18, 19, 12, 15, 16, 3};
inverse_permutation(perm);
return 0;
}

Move items to the front of an array

I have items in the centre of an array and I want to move them to the front of this array. For example:
array[8] = {10, 38, 38, 0, 8, 39, 10, 22}
and I have an index array
index[6] = {0, 3, 4, 6, 7, 1}
and I want to move these 6 items to the front of the array
result[8] = {10, 0, 8, 10, 22, 38, 38, 39}
The order doesn't actually matter, as long as every item whose index is in the index array comes before every item whose index is not.
Can anyone suggest a fast algorithm? This is one step in a KNN problem, and the data array could be very large. The algorithm should run as fast as possible and use as little extra space as possible. A CUDA implementation would be even better.
Update: Compared to the data array, the index array is very small. In my case, it has only about 200 entries.
Update: Please note that the data array can be very large: 1M, 10M or even more elements (it is loaded into GPU memory, which is quite limited). Any algorithm that needs a temporary array the same size as the data array is not acceptable.

Sort the index array in increasing order. This ensures that no swap disturbs an element that still has to be moved.
Then, for i from 0 to n - 1 (where n is the length of the index array), swap the ith element of the array with the index[i]th element.
Pseudocode:
sort(index);
for(int i = 0; i < index.length; i++){
    swap(array, i, index[i]);
}
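As a rough C++ sketch of the sort-then-swap idea above: the function name `move_to_front` and the use of `std::vector` are assumptions made for illustration, and the index array is assumed to hold distinct, in-range positions. It runs in place and sorts the index array as a side effect.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: moves the elements whose positions are listed in
// `index` to the front of `data`, in place. `index` must hold distinct,
// in-range positions; it is sorted as a side effect.
void move_to_front(std::vector<int>& data, std::vector<std::size_t>& index) {
    std::sort(index.begin(), index.end());
    for (std::size_t i = 0; i < index.size(); ++i) {
        assert(index[i] < data.size());
        // After sorting distinct positions, index[i] >= i, so the source
        // slot has not been disturbed by an earlier swap.
        std::swap(data[i], data[index[i]]);
    }
}
```

With the arrays from the question, the first six slots end up holding 10, 38, 0, 8, 10, 22 (order within the front block is not preserved, which the question allows). The only extra work is sorting the roughly 200 index entries; no buffer the size of the data array is needed.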
If you don't want to sort index, you can instead pick, at each step, the smallest entry of index that has not been used yet (this is cheap because index is small).
Use a boolean array used to mark which entries of index have already been moved into place.
Pseudocode:
bool[] used = new bool[index.length]; // all false initially
for(int i = 0; i < index.length; i++){
    // find the position of the smallest unused entry of index
    int nxt = -1;
    for(int j = 0; j < index.length; j++){
        if(!used[j]){
            if(nxt == -1 || index[j] < index[nxt]){
                nxt = j;
            }
        }
    }
    used[nxt] = true;
    swap(array, i, index[nxt]);
}

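// Note: this version copies into a separate result array and keeps one
// used[] flag per element, so it needs O(ARRAY_SIZE) extra space.
const int ARRAY_SIZE = sizeof(array) / sizeof(array[0]);
const int INDEX_SIZE = sizeof(index) / sizeof(index[0]);
bool used[ARRAY_SIZE] = {};
// First copy the selected items to the front of result.
for (int i = 0; i < INDEX_SIZE; ++i)
{
    int id = index[i];
    result[i] = array[id];
    used[id] = 1;
}
// Then append every item that was not selected.
for (int i = 0, id = INDEX_SIZE; i < ARRAY_SIZE; ++i)
{
    if (!used[i])
    {
        result[id] = array[i];
        ++id;
    }
}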

Approach 1
You can modify insertion sort to solve your problem, which gives O(n^2) time complexity.
But if you want to keep the run time on the order of n, you can use the following approach.
Approach 2
Here we can use the index array as auxiliary space, as follows:
step 1
Store the actual values in the index array in place of the indexes, and overwrite those positions in the value array with a negative (or otherwise invalid) sentinel value.
Value Array
[ -1, -1, 38, -1, -1, 39, -1, -1 ]
Index Array
[ 10, 0, 8, 10, 22, 38 ]
The complexity of this operation is O(n).
step 2
Shift all the remaining values to the end of the value array, which takes O(n) time.
Value Array
[ -1, -1, -1, -1, -1, -1, 38, 39 ]
Index Array
[ 10, 0, 8, 10, 22, 38 ]
step 3
Now put the elements from the index array into the front of the value array.
Value Array
[ 10, 0, 8, 10, 22, 38, 38, 39 ]
Index Array
[ 10, 0, 8, 10, 22, 38 ]
The time complexity of this operation is O(n).
Total run time complexity for this approach: O(n)
Improvement
This approach does not preserve your index array. You can preserve it with O(index array size) extra space. Alternatively, provided the value array contains no negative values, you can store the negated index in the value array instead of -1 and use it to recover the index array in the third step.
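The three steps of Approach 2 could look roughly like this in C++. This is only a sketch: the name `move_to_front_via_index`, the `sentinel` parameter and the use of `std::vector<int>` for both arrays are assumptions for illustration, and the sentinel is assumed never to occur in the data. As noted above, the index array is overwritten with values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: `index` holds distinct positions into `data`;
// `sentinel` is assumed not to occur anywhere in `data`.
// On return the selected values are at the front of `data`, and `index`
// holds those values instead of positions (it is not preserved).
void move_to_front_via_index(std::vector<int>& data, std::vector<int>& index,
                             int sentinel) {
    const std::size_t n = data.size();
    const std::size_t m = index.size();

    // Step 1: stash each selected value in the index array, mark its slot.
    for (std::size_t k = 0; k < m; ++k) {
        assert(index[k] >= 0 && static_cast<std::size_t>(index[k]) < n);
        const std::size_t pos = static_cast<std::size_t>(index[k]);
        index[k] = data[pos];
        data[pos] = sentinel;
    }

    // Step 2: pack the unmarked values at the end, scanning right to left.
    std::size_t write = n;
    for (std::size_t i = n; i-- > 0;) {
        if (data[i] != sentinel) {
            data[--write] = data[i];
        }
    }
    assert(write == m);  // exactly m slots are now free at the front

    // Step 3: copy the stashed values into the freed front slots.
    for (std::size_t k = 0; k < m; ++k) {
        data[k] = index[k];
    }
}
```

Called on the example arrays with a sentinel of -1, this reproduces the final value array shown in step 3. Each step is a single pass, so the whole routine is linear in the size of the data array and uses no extra buffer.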

Finding a subset which satisfies a certain condition

I have several arrays of numbers (each element of the array can only take a value of 0 or 1) like this
v1: 1; 0; 0; 1; 1;
v2: 0; 1; 0; 0; 1;
v3: 1; 1; 0; 1; 0;
v4: 1; 0; 0; 1; 0;
v5: 1; 1; 0; 1; 1;
v6: 1; 1; 0; 1; 1;
I wish to find subsets such that, when the arrays are summed, the resulting array has individual elements which are multiples of 2. For example, v1+v2+v3 gives a resulting array of 2, 2, 0, 2, 2. The resulting array can have any value that is a multiple of 2.
Another example:
v1: 1, 1, 1, 0, 1, 0
v2: 0, 0, 1, 0, 0, 0
v3: 1, 0, 0, 0, 0, 0
v4: 0, 0, 0, 1, 0, 0
v5: 1, 1, 0, 0, 1, 0
v6: 0, 0, 1, 1, 0, 0
v7: 1, 0, 1, 1, 0, 0
In this example, v1+v2+v5 and v3+v6+v7 are suitable answers.
I have a brute force solution in mind, but I wanted to check if there is a more efficient method. Is this equivalent to the subset sum problem?

Do you want to find all solutions or one?
This can find one solution (and I think it may be possible to extend it to find all solutions).
Represent each array as a binary number.
So v1 becomes 10011, v2 becomes 01001 etc.
Let * denote bitwise addition mod 2 (that is, XOR).
e.g.
v1*v2*v3 = 00000
So our objective is to find arrays whose mod 2 addition is all zeroes.
Let u and v be any binary numbers.
Then u*v = 0 iff u = v.
e.g.
(v1*v2)*v3 = 0
v1*v2 = 11010 = v3.
Also if u*v = w then
u*v*v = w*v, so
u*0 = w*v,
u = w*v
So we can do a reverse search starting from 0. Suppose the final set of arrays contains v. Then v*t = 0, where t is the mod 2 sum of the other arrays in the set. We have t = 0*v. If t is one of the other given arrays then we are done. Otherwise we continue the search starting from t.
This is formally described below.
Each state is a binary number.
Let 0 be the initial state.
The given arrays form a subset S of the state space.
Our goal state is any element of S that has not been used yet.
Let T be the set of arrays used so far; at a goal state, T together with the array equal to the current state is the required subset whose sum is 0.
At each state let the possible actions be * with any given array not in T.
After each action put the array used in T.
If T = S at any non-goal state, then that branch has no solution.
Now we can run a DFS on this space to find a solution.
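To make the bitmask idea concrete, here is a small C++ sketch. It is a simplified take/skip depth-first search over subsets rather than the exact state-space formulation above, and it is still exponential in the worst case. The names `dfs` and `find_even_subset` are hypothetical, and each array is assumed to fit in a 32-bit mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Depth-first search over "take / skip" choices for each array.
// `acc` is the XOR (bitwise mod 2 sum) of the arrays chosen so far.
bool dfs(const std::vector<std::uint32_t>& vecs, std::size_t pos,
         std::uint32_t acc, std::vector<std::size_t>& chosen) {
    if (acc == 0 && !chosen.empty()) return true;  // all column sums are even
    if (pos == vecs.size()) return false;
    chosen.push_back(pos);                         // try taking vecs[pos]
    if (dfs(vecs, pos + 1, acc ^ vecs[pos], chosen)) return true;
    chosen.pop_back();                             // backtrack, then skip it
    return dfs(vecs, pos + 1, acc, chosen);
}

// Returns the 0-based positions of one non-empty subset whose XOR is 0,
// or an empty vector if no such subset exists.
std::vector<std::size_t> find_even_subset(const std::vector<std::uint32_t>& vecs) {
    std::vector<std::size_t> chosen;
    dfs(vecs, 0, 0, chosen);  // leaves `chosen` empty if the search fails
    std::uint32_t check = 0;
    for (std::size_t i : chosen) check ^= vecs[i];
    assert(check == 0);
    (void)check;
    return chosen;
}
```

For the first example in the question, encoding v1..v6 as 0b10011, 0b01001, 0b11010, 0b10010, 0b11011, 0b11011 makes the search return positions 0, 1, 2, that is v1, v2 and v3.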