Find missing numbers in array, time complexity O(N), space complexity O(1) - arrays

You are given an array of n unique integers, each in the range 0 <= x_i < 2 * n.
Print all integers 0 <= x < 2 * n that are not present in this array.
Example:
find_missing([0]) = [1]
find_missing([0, 2, 4]) = [1, 3, 5] # because all numbers are [0, 1, 2, 3, 4, 5]
find_missing([]) = []
find_missing([0, 1, 4, 5]) = [2, 3, 6, 7] # because all numbers are [0, 1, 2, 3, 4, 5, 6, 7]
The quirks are in the requirements:
Time complexity O(n) - BUT there should be some fixed constant C, independent of the input size, such that every element of the array is written/read fewer than C times, so radix sorting the array is a no-go.
Space complexity O(1) - you may modify the initial array, BUT sorted(initial_array) must equal sorted(array_after_executing_program), AND you can't store integers outside the range [0, 2n) in this array (imagine it's an array of uint32_t).
I saw a lot of complex solutions, but then I found this:
public void printNotInArr(int[] arr) {
    if (arr == null)
        return;
    int len = arr.length;
    int max = 2 * len;
    for (int i = 0; i < len; i++) {
        System.out.println(max - arr[i] - 1);
    }
}
I believe that is the best solution, but I am not sure. I would like to know why that would NOT work.

As #LasseV.Karlsen pointed out, [0, 3] is a simple counter-example that shows how that solution doesn't work: it prints 4-0-1 = 3 and 4-3-1 = 0, but the missing numbers are 1 and 2. This, however, is a pretty simple solution (in Python):
def show_missing(l):
    n = len(l)
    # put numbers less than n into the proper slot
    for i in range(0, n):
        while l[i] < n and l[i] != i:
            j = l[i]
            l[i] = l[j]
            l[j] = j
    for i in range(0, n):
        if l[i] != i:
            print('Missing %s' % i)
    # put numbers greater than or equal to n into the proper slot
    for i in range(0, n):
        while l[i] >= n and l[i] != i + n:
            j = l[i]
            l[i] = l[j - n]
            l[j - n] = j
    for i in range(0, n):
        if l[i] != i + n:
            print('Missing %s' % (i + n))
The idea is simple. We first rearrange the elements so that every value j that is less than n is stored at index j. We can then go through the array and easily pick out the ones below n that are missing.
We then rearrange the elements so that every value j that is greater than or equal to n is stored at index j-n. Again, we can go through the array and easily pick out the ones greater than or equal to n that are missing.
Since only a couple of local variables are used, the O(1) space complexity is satisfied.
Because of the nested loops, the O(n) time complexity is a little harder to see, but it isn't too hard to show that we never swap more than n elements, since one new element is put into its proper place with each swap.
Since we've only swapped elements of the array, the requirement that all the original elements are still in the array is also satisfied.
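As a quick sanity check, here is the function run on the examples from the question (each call prints one 'Missing' line per absent value):

for arr in ([0], [0, 2, 4], [], [0, 1, 4, 5]):
    print(arr)
    show_missing(list(arr))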

Related

Find if array can be divided into two subarrays of equal sum if any one element can be deleted

Given an array of numbers, find whether there is a way to delete/remove one number from the array and make one partition of the array (dividing the array into two subarrays) such that the sum of the elements in subarray1 equals the sum of the elements in subarray2.
A subarray is a contiguous part of the array.
The array [1, 2, 3, 4] has (1), (1,2), (2,3,4), (1,2,3,4), etc. as its subarrays, but not (1,3), (2,4), (1,3,4), etc.
Now let us consider one example (follow 0-based indexing):
Array = [6, 2, 2, 1, 3]
Possible solutions:
Delete Array[0] => updated array: [2, 2, 1, 3]
Possible partition: [2, 2] and [1, 3], where (2+2) = (1+3) = 4
or
Delete Array[1] => updated array: [6, 2, 1, 3]
Possible partition: [6] and [2, 1, 3], where (6) = (2+1+3) = 6
or
Delete Array[2] => updated array: [6, 2, 1, 3]
Possible partition: [6] and [2, 1, 3], where (6) = (2+1+3) = 6
A similar question already exists where we just have to find whether the array can be divided into two subarrays of equal sum; that can be done in O(n):
Pseudocode: the efficient solution involves calculating the sum of all elements of the array in advance. Then, for each element of the array, we can calculate its right sum in O(1) time by subtracting the sum of the elements found so far from the total sum. The time complexity of this solution is O(n) and the auxiliary space used by it is O(1).
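As a quick sketch of that simpler O(n) check (my own code, not from the linked answer):

def can_split_equal(a):
    # True if some non-empty prefix has the same sum as the rest.
    total = sum(a)
    left = 0
    for x in a[:-1]:  # both parts must be non-empty
        left += x
        if left == total - left:
            return True
    return False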
So one brute-force method for our problem is: remove each element in turn and check whether the remaining array can be divided into two subarrays of equal sum. That requires O(n^2) time.
Can we do better than this time complexity?
You can use a map to keep track of the positions at which each value occurs in the array. Then, as you move through the array considering each partition point, check whether the difference between the left and right halves is present in the map, and in the correct half: compare the sign of the left-right difference with the position of the matching value relative to the current partition point. If both agree, you have a solution.
Here's some Java code to illustrate:
// uses java.util.{Map, HashMap, List, ArrayList}
static boolean splitDelete(int[] a)
{
    // map each value to the list of indices at which it occurs
    Map<Integer, List<Integer>> map = new HashMap<>();
    for (int i = 0; i < a.length; i++)
    {
        List<Integer> idx = map.get(a[i]);
        if (idx == null) map.put(a[i], idx = new ArrayList<>());
        idx.add(i);
    }

    int sum = 0;
    for (int v : a) sum += v;

    int diff = sum;  // (right sum) - (left sum) for the current partition point
    for (int i = 0; i < a.length - 1; i++)
    {
        diff -= 2 * a[i];
        if (map.containsKey(Math.abs(diff)))
            for (int j : map.get(Math.abs(diff)))
                if (diff > 0 == j > i) return true;
    }
    return false;
}
As RaffleBuffle pointed out, there could be a few scenarios for the deleted element as we traverse different separation points. For example,
a a a a a a a a a a a a a a
<-----X--------->|<------->
a a a a a a a a a a a a a a
<--------------->|<---X--->
One way to solve it with O(n) overall complexity is to traverse twice, each time checking whether the difference between the two sums is in a set of the values we've seen on the side we came from.
Python code:
def f(A):
    values = set()
    total_sum = sum(A)

    # Traverse from left; each part
    # must have at least one element.
    left_sum = A[0]
    right_sum = total_sum - A[0]
    values.add(A[0])
    for i in range(1, len(A) - 1):
        values.add(A[i])
        left_sum += A[i]
        right_sum -= A[i]
        # We have an element in the left part
        # that's the difference between left
        # and right sums.
        if (left_sum - right_sum) in values:
            return True

    # Traverse from right; each part
    # must have at least one element.
    right_sum = A[len(A) - 1]
    left_sum = total_sum - A[len(A) - 1]
    values.clear()
    values.add(A[len(A) - 1])
    for i in range(len(A) - 2, 0, -1):
        values.add(A[i])
        right_sum += A[i]
        left_sum -= A[i]
        # We have an element in the right part
        # that's the difference between right
        # and left sums.
        if (right_sum - left_sum) in values:
            return True

    return False

As = [
    [1, 2, 1, 1, 1],     # True
    [1, 1, 1, 2, 1],     # True
    [1, 1, 1, 1, 1, 1],  # False
    [6, 2, 2, 1, 3]      # True
]

for A in As:
    print("%s\n%s\n\n" % (A, f(A)))

Sum of distance between every pair of same element in an array

I have an array [a0, a1, ..., an]. I want to calculate the sum of the distances between every pair of equal elements, where the distance between a pair is the number of elements strictly between them (i.e. the difference of their indices minus 1).
1) The first element of the array will always be zero.
2) The second element will be greater than zero.
3) No two consecutive elements are the same.
4) The size of the array can be up to 10^5 + 1, and its elements can range from 0 to 10^7.
For example, if the array is [0, 2, 5, 0, 5, 7, 0], then the distance between the first 0 and the second 0 is 2, the distance between the first 0 and the third 0 is 5, and the distance between the second 0 and the third 0 is 2. The distance between the first 5 and the second 5 is 1. Hence the sum of distances between equal elements is 2 + 5 + 2 + 1 = 10.
For this I tried to build a formula: for every element occurring more than once (0-based indexing, first element always zero), sum = sum + (lastIndex - firstIndex - 1) * (numberOfOccurrences - 1); if the number of occurrences is odd, subtract 1 from the sum, else leave it as is. But this approach does not work in every case, although it does work if the array is [0, 5, 7, 0] or [0, 2, 5, 0, 5, 7, 0, 1, 2, 3, 0].
Can you suggest another efficient approach or formula?
Edit: this problem is not part of any coding contest; it's just a small part of a bigger problem.
My method requires space that scales with the number of possible element values, but has O(n) time complexity.
I've made no effort to check that the sum doesn't overflow an unsigned long; I just assume that it won't. Same for checking that all input values are in fact no more than MAX_VAL. These are details that would have to be addressed.
For each possible value, total_distance keeps track of how much would be added to the sum if one more instance of that value were encountered, and instances_so_far keeps track of how many instances of that value have already been seen. To make this efficient, the last index at which each value was encountered is also tracked, so that total_distance only needs updating when that particular value reappears, instead of having nested loops that add to every value at every step.
#include <stdio.h>
#include <stddef.h>

// #define MAX_VAL 15
#define MAX_VAL 10000000

unsigned long instances_so_far[MAX_VAL + 1] = {0};
unsigned long total_distance[MAX_VAL + 1] = {0};
unsigned long last_index_encountered[MAX_VAL + 1];

// void print_array(unsigned long *array, size_t len) {
//     printf("{");
//     for (size_t i = 0; i < len; ++i) {
//         printf("%lu,", array[i]);
//     }
//     printf("}\n");
// }

unsigned long get_sum(unsigned long *array, size_t len) {
    unsigned long sum = 0;
    for (size_t i = 0; i < len; ++i) {
        if (instances_so_far[array[i]] >= 1) {
            total_distance[array[i]] += (i - last_index_encountered[array[i]]) * instances_so_far[array[i]] - 1;
        }
        sum += total_distance[array[i]];
        instances_so_far[array[i]] += 1;
        last_index_encountered[array[i]] = i;
        // printf("inst ");
        // print_array(instances_so_far, MAX_VAL + 1);
        // printf("totd ");
        // print_array(total_distance, MAX_VAL + 1);
        // printf("encn ");
        // print_array(last_index_encountered, MAX_VAL + 1);
        // printf("sums %lu\n", sum);
        // printf("\n");
    }
    return sum;
}

unsigned long test[] = {0,1,0,2,0,3,0,4,5,6,7,8,9,10,0};

int main(void) {
    printf("%lu\n", get_sum(test, sizeof(test) / sizeof(test[0])));
    return 0;
}
I've tested it with a few of the examples here and gotten the answers I expected.
I had to use static storage for the arrays because they overflowed the stack when placed there.
I've left in the commented-out code I used for debugging; it's helpful for understanding what's going on if you reduce MAX_VAL to a smaller number.
Please let me know if you find a counter-example that fails.
Here is Python 3 code for your problem. This works on all the examples given in your question and in the comments--I included the test code.
This works by looking at how each consecutive pair of repeated elements adds to the overall sum of distances. If a value occurs at 6 positions in the list, the pair distances are:
x x x x x x The repeated element's locations in the array
-- First, consecutive pairs
--
--
--
--
----- Now, pairs that have one element inside
-----
-----
-----
-------- Now, pairs that have two elements inside
--------
--------
----------- Now, pairs that have three elements inside
-----------
-------------- Now, pairs that have four elements inside
If we look down each gap between consecutive occurrences and count how many pairs span it, we get:
5 8 9 8 5
And if we look at the differences between those values we get
3 1 -1 -3
Now if we use my preferred definition of "distance" for a pair, namely the difference of their indices, we can use those multiplicities for consecutive pairs to calculate the overall sum of distances for all pairs. But since your definition is not mine, we calculate the sum under my definition and then adjust it for yours.
This code makes one pass through the original array to collect the occurrences of each element value, then another pass through those distinct element values. (I used the pairwise routine to avoid another pass through the array.) That makes my algorithm O(n) in time complexity, where n is the length of the array, much better than the naive O(n^2). Since my code builds a list of indices per unique element value, it has space complexity of at worst O(n).
import collections
import itertools

def pairwise(iterable):
    """s -> (s0,s1), (s1,s2), (s2,s3), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)

def sum_distances_of_pairs(alist):
    # Make a dictionary giving the indices for each element of the list.
    element_ndxs = collections.defaultdict(list)
    for ndx, element in enumerate(alist):
        element_ndxs[element].append(ndx)

    # Sum the distances of pairs for each element, using my def of distance
    sum_of_all_pair_distances = 0
    for element, ndx_list in element_ndxs.items():
        # Filter out elements not occurring more than once
        if len(ndx_list) < 2:
            continue
        # Sum the distances of pairs for this element, using my def of distance
        sum_of_pair_distances = 0
        multiplicity = len(ndx_list) - 1
        delta_multiplicity = multiplicity - 2
        for ndx1, ndx2 in pairwise(ndx_list):
            # Update the contribution of this consecutive pair to the sum
            sum_of_pair_distances += multiplicity * (ndx2 - ndx1)
            # Prepare for the next consecutive pair
            multiplicity += delta_multiplicity
            delta_multiplicity -= 2
        # Adjust that sum of distances for the desired definition of distance
        cnt_all_pairs = len(ndx_list) * (len(ndx_list) - 1) // 2
        sum_of_pair_distances -= cnt_all_pairs
        # Add that sum for this element into the overall sum
        sum_of_all_pair_distances += sum_of_pair_distances

    return sum_of_all_pair_distances

assert sum_distances_of_pairs([0, 2, 5, 0, 5, 7, 0]) == 10
assert sum_distances_of_pairs([0, 5, 7, 0]) == 2
assert sum_distances_of_pairs([0, 2, 5, 0, 5, 7, 0, 1, 2, 3, 0]) == 34
assert sum_distances_of_pairs([0, 0, 0, 0, 1, 2, 0]) == 18
assert sum_distances_of_pairs([0, 1, 0, 2, 0, 3, 4, 5, 6, 7, 8, 9, 0, 10, 0]) == 66
assert sum_distances_of_pairs([0, 1, 0, 2, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 0]) == 54

Insert a smallest possible positive integer into an array of unique integers [duplicate]

I am trying to tackle this interview question: given an array of unique positive integers, find the smallest possible number to insert into it so that every integer is still unique. The algorithm should be in O(n) and the additional space complexity should be constant. Assigning values in the array to other integers is allowed.
For example, for an array [5, 3, 2, 7], output should be 1. However for [5, 3, 2, 7, 1], the answer should then be 4.
My first idea is to sort the array, then go through the array again to find where the continuous sequence breaks, but sorting needs more than O(n).
Any ideas would be appreciated!
My attempt:
The array A is assumed 1-indexed. We call a value active if it is nonzero and does not exceed n.
Scan the array until you find an active value k = A[i] (if you can't find one, stop);
While k is active, read A[k], clear A[k] (set it to zero), and let the value just read become the new k;
Continue from i until you reach the end of the array.
After this pass, all array entries corresponding to some integer present in the array have been cleared.
Find the first nonzero entry and report its index.
E.g.
[5, 3, 2, 7], clear A[3]
[5, 3, 0, 7], clear A[2]
[5, 0, 0, 7], done
The answer is 1.
E.g.
[5, 3, 2, 7, 1], clear A[5],
[5, 3, 2, 7, 0], clear A[1]
[0, 3, 2, 7, 0], clear A[3],
[0, 3, 0, 7, 0], clear A[2],
[0, 0, 0, 7, 0], done
The answer is 4.
The behavior of the first pass is linear because every number is looked at once (and immediately cleared), and i increases regularly.
The second pass is a linear search.
A = [5, 3, 2, 7, 1]
N = len(A)
print(A)
for i in range(N):
    k = A[i]
    while k > 0 and k <= N:
        A[k-1], k = 0, A[k-1]  # -1 for 0-based indexing
        print(A)

[5, 3, 2, 7, 1]
[5, 3, 2, 7, 0]
[0, 3, 2, 7, 0]
[0, 3, 2, 7, 0]
[0, 3, 0, 7, 0]
[0, 0, 0, 7, 0]
[0, 0, 0, 7, 0]
Update:
Based on גלעד ברקן's idea, we can mark the array elements in a way that does not destroy the values. Then we report the index of the first unmarked element.
print(A)
for a in A:
    a = abs(a)
    if a <= N:
        A[a-1] = -A[a-1]  # -1 for 0-based indexing
    print(A)

[5, 3, 2, 7, 1]
[5, 3, 2, 7, -1]
[5, 3, -2, 7, -1]
[5, -3, -2, 7, -1]
[5, -3, -2, 7, -1]
[-5, -3, -2, 7, -1]
From the question description: "Assigning values in the array to other integers is allowed." This is O(n) space, not constant.
Loop over the array and multiply A[ |A[i]| - 1 ] by -1 for each |A[i]| <= array length. Loop a second time and output (index + 1) for the first cell that is not negative, or (array length + 1) if all cells are marked. This takes advantage of the fact that there cannot be more than (array length) unique integers in the array.
I will use 1-based indexing.
The idea is to reuse the input collection and arrange to swap integer i into the ith place whenever its current position is larger than i. This can be performed in O(n).
Then, in a second iteration, you find the first index i not containing i, which is again O(n).
In Smalltalk, implemented in Array (self is the array):
firstMissing
self size to: 1 by: -1 do: [:i |
[(self at: i) < i] whileTrue: [self swap: i with: (self at: i)]].
1 to: self size do: [:i |
(self at: i) = i ifFalse: [^i]].
^self size + 1
So we have two loops in O(n), but we also have another loop inside the first one (whileTrue:). So is the first loop really O(n)?
Yes, because each element is swapped at most once: once an element arrives at its right place it stays there. The cumulative number of swaps is therefore bounded by the array size, so the overall cost of the first loop is at most 2*n, and the total cost including the last search is at most 3*n, still O(n).
You'll also see that we don't bother to swap in the case (self at: i) > i and: [(self at: i) <= self size]. Why? Because in that case we can be sure that a smaller element is missing.
A small test case:
| trial |
trial := (1 to: 100100) asArray shuffled first: 100000.
self assert: trial copy firstMissing = trial sorted firstMissing.
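For readers who don't know Smalltalk, here is a rough Python equivalent of the same idea (my own sketch, not the original author's code); the input is the question's array of unique positive integers and is modified in place:

def first_missing(a):
    n = len(a)
    # Pass 1: walking from the end, swap each value v into slot v (1-based)
    # whenever that slot lies strictly below the current index.
    for i in range(n - 1, -1, -1):
        while a[i] - 1 < i:
            j = a[i] - 1
            a[i], a[j] = a[j], a[i]
    # Pass 2: the first slot i (0-based) not holding i + 1 gives the answer.
    for i in range(n):
        if a[i] != i + 1:
            return i + 1
    return n + 1

print(first_missing([5, 3, 2, 7]))     # 1
print(first_missing([5, 3, 2, 7, 1]))  # 4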
You could do the following.
Find the maximum (m), the sum of all elements (s), and the number of elements (n).
There are m - n elements missing, and their sum is q = sum(1..m) - s = m(m+1)/2 - s, using the closed form for the sum.
If you are missing only one integer, you're done: report q.
If you are missing more than one (m - n > 1), observe that since the missing integers sum to q, at least one of them must be no larger than q/(m-n).
Start over from the top, but only take into account integers smaller than q/(m-n): this becomes the new m, and only elements below that bound contribute to the new s and n. Repeat until you are left with only one missing integer.
Still, this may not be linear time, I'm not sure.
EDIT: you should use the candidate plus half the input size as a pivot to reduce the constant factor here – see Daniel Schepler’s comment – but I haven’t had time to get it working in the example code yet.
This isn’t optimal – there’s a clever solution being looked for – but it’s enough to meet the criteria :)
Define the smallest possible candidate so far: 1.
If the size of the input is 0, the smallest possible candidate is a valid candidate, so return it.
Partition the input into < pivot and > pivot (with a median-of-medians pivot, like in quicksort).
If the number of elements ≤ pivot is less than the pivot itself, there's a free value in there, so start over at step 2 considering only the < pivot partition.
Otherwise (when it equals the pivot), the new smallest possible candidate is pivot + 1. Start over at step 2 considering only the > pivot partition.
I think that works…?
'use strict';

const swap = (arr, i, j) => {
    [arr[i], arr[j]] = [arr[j], arr[i]];
};

// dummy pivot selection, because this part isn’t important
const selectPivot = (arr, start, end) =>
    start + Math.floor(Math.random() * (end - start));

const partition = (arr, start, end) => {
    let mid = selectPivot(arr, start, end);
    const pivot = arr[mid];
    swap(arr, mid, start);
    mid = start;

    for (let i = start + 1; i < end; i++) {
        if (arr[i] < pivot) {
            mid++;
            swap(arr, i, mid);
        }
    }

    swap(arr, mid, start);
    return mid;
};

const findMissing = arr => {
    let candidate = 1;
    let start = 0;
    let end = arr.length;

    for (;;) {
        if (start === end) {
            return candidate;
        }

        const pivotIndex = partition(arr, start, end);
        const pivot = arr[pivotIndex];

        if (pivotIndex + 1 < pivot) {
            end = pivotIndex;
        } else {
            //assert(pivotIndex + 1 === pivot);
            candidate = pivot + 1;
            start = pivotIndex + 1;
        }
    }
};

const createTestCase = (size, max) => {
    if (max < size) {
        throw new Error('size must be < max');
    }

    const arr = Array.from({length: max}, (_, i) => i + 1);
    const expectedIndex = Math.floor(Math.random() * size);
    arr.splice(expectedIndex, 1 + Math.floor(Math.random() * (max - size - 1)));

    for (let i = 0; i < size; i++) {
        let j = i + Math.floor(Math.random() * (size - i));
        swap(arr, i, j);
    }

    return {
        input: arr.slice(0, size),
        expected: expectedIndex + 1,
    };
};

for (let i = 0; i < 5; i++) {
    const test = createTestCase(1000, 1024);
    console.log(findMissing(test.input), test.expected);
}
The correct method I almost got on my own, but I had to search for it, and I found it here: https://www.geeksforgeeks.org/find-the-smallest-positive-number-missing-from-an-unsorted-array/
Note: this method is destructive to the original data, but nothing in the original question said you could not be destructive.
I will explain what you need to do now.
The basic "aha" here is that the first missing number must lie in the range [1, N+1], where N is the length of the array.
Once you understand this and realize you can use the values in the array itself as markers, you just have one problem to address: does the array have numbers less than 1 in it? If so, we need to deal with them.
Dealing with 0s and negative numbers can be done in O(n) time using two indices, one for our current position and one for the end of the array. As we scan through, whenever we find a 0 or negative number, we swap it with the value at the end pointer and decrement the end pointer. We continue until our current pointer passes the end pointer.
Code example:
while (list[end] < 1) {
    end--;
}
while (cur < end) {
    if (list[cur] < 1) {
        swap(list[cur], list[end]);
        while (list[end] < 1) {
            end--;
        }
    }
    cur++;
}
Now we have an end-of-array pointer and a truncated array containing only positive values. Since all the numbers we care about are positive, we can mark "value v is present" by multiplying the element at index v-1 by -1.
e.g. in [5, 3, 2, 7, 1], when we read 3, we negate the element at index 2: [5, 3, -2, 7, 1]
Code example:
for (cur = 0; cur <= end; begin++) {
if (!(abs(list[cur]) > end)) {
list[abs(list[cur]) - 1] *= -1;
}
}
Now, note: you need to read the absolute value of the integer at each position, because it might already have been made negative. Also note: if an integer is greater than end + 1, do not change anything, as that integer cannot be the first missing number.
Finally, once you have processed all the values, iterate through the array again to find the first one that is still positive. That position represents your first missing number.
Step 1: Segregate 0s and negative numbers to the right end of the list. O(n)
Step 2: Using the end-of-list pointer, iterate through the list, marking the relevant positions negative. O(n-k)
Step 3: Scan for the position of the first non-negative number. O(n-k)
Space complexity: the original list is not counted, and I used 3 integers beyond that, so it is O(1).
One thing I should mention is that the list [5, 4, 2, 1, 3] would end up [-5, -4, -2, -1, -3], so in this case you would choose the first number after the end position of the list, i.e. 6, as your result.
Code example for step 3:
for (cur = 0; cur <= end; cur++) {
    if (list[cur] > 0) {
        break;
    }
}
print(cur + 1);
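Putting the three steps together, a rough runnable Python sketch of the same procedure might look like this (my own code, following the steps above; it destroys the input, as noted):

def first_missing_positive(lst):
    cur, end = 0, len(lst) - 1
    # Step 1: segregate 0s and negatives to the right of `end`.
    while end >= 0 and lst[end] < 1:
        end -= 1
    while cur < end:
        if lst[cur] < 1:
            lst[cur], lst[end] = lst[end], lst[cur]
            while lst[end] < 1:
                end -= 1
        cur += 1
    # Step 2: for each value v <= end + 1, negate lst[v-1] to mark v as present.
    for i in range(end + 1):
        v = abs(lst[i])
        if v - 1 <= end:
            lst[v - 1] = -abs(lst[v - 1])
    # Step 3: the first non-negative slot i means i + 1 is missing.
    for i in range(end + 1):
        if lst[i] > 0:
            return i + 1
    return end + 2

print(first_missing_positive([5, 3, 2, 7, 1]))  # 4
print(first_missing_positive([5, 4, 2, 1, 3]))  # 6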
Use this short and sweet algorithm:
A is [5, 3, 2, 7]
1- Define B with length = A.Length (O(1))
2- Initialize B's cells with 1 (O(n))
3- For each item in A: if (item <= B.Length) then B[item] = -1 (O(n))
4- The answer is the smallest index in B such that B[index] != -1 (O(n))

Is it possible to invert an array with constant extra space?

Let's say I have an array A with n unique elements on the range [0, n). In other words, I have a permutation of the integers [0, n).
Is it possible to transform A into B using O(1) extra space (i.e. in place) such that B[A[i]] = i?
For example:
       A                    B
[3, 1, 0, 2, 4] -> [2, 1, 3, 0, 4]
Yes, it is possible, with an O(n^2)-time algorithm:
Take the element at index 0, then write 0 to the cell indexed by that element. Then use the element just overwritten to get the next index and write the previous index there. Continue until you come back to index 0. This is the cycle leader algorithm.
Then do the same starting from index 1, 2, ... But before making any changes, run the cycle leader algorithm without any modifications starting from this index: if the cycle contains any index below the starting index, just skip it.
Or this O(n^3)-time algorithm:
Take the element at index 0, then write 0 to the cell indexed by that element. Then use the element just overwritten to get the next index and write the previous index there. Continue until you come back to index 0.
Then do the same starting from index 1, 2, ... But before making any changes, run the cycle leader algorithm without any modifications starting from all preceding indexes. If the current index is present in any preceding cycle, just skip it.
I have written a (slightly optimized) implementation of the O(n^2) algorithm in C++11 to determine how many additional accesses are needed per element on average when a random permutation is inverted. Here are the results:
size accesses
2^10 2.76172
2^12 4.77271
2^14 6.36212
2^16 7.10641
2^18 9.05811
2^20 10.3053
2^22 11.6851
2^24 12.6975
2^26 14.6125
2^28 16.0617
While the size grows exponentially, the number of element accesses grows almost linearly, so the expected time complexity for random permutations is something like O(n log n).
Inverting an array A requires us to find a permutation B which fulfills the requirement A[B[i]] == i for all i.
To build the inverse in place, we have to swap elements and indices by setting A[A[i]] = i for each element A[i]. Obviously, if we simply iterated through A and performed this replacement, we might overwrite upcoming elements of A and our computation would fail.
Therefore, we have to swap elements and indices along the cycles of A by following c = A[c] until we reach our cycle's starting index c = i.
Every element of A belongs to one such cycle. Since we have no space to store whether or not an element A[i] has already been processed and needs to be skipped, we have to follow its cycle: if we reach an index c < i, we know that this element is part of a previously processed cycle.
This algorithm has a worst-case run-time complexity of O(n²), an average run-time complexity of O(n log n) and a best-case run-time complexity of O(n).
function invert(array) {
    main:
    for (var i = 0, length = array.length; i < length; ++i) {
        // check if this cycle has already been traversed before:
        for (var c = array[i]; c != i; c = array[c]) {
            if (c <= i) continue main;
        }
        // Replace each cycle element with its predecessor's index:
        var c_index = i,
            c = array[i];
        do {
            var tmp = array[c];
            array[c] = c_index; // replace
            c_index = c;        // move forward
            c = tmp;
        } while (i != c_index)
    }
    return array;
}

console.log(invert([3, 1, 0, 2, 4])); // [2, 1, 3, 0, 4]
Example for A = [1, 2, 3, 0] :
The first element 1 at index 0 belongs to the cycle of elements 1 - 2 - 3 - 0. Once we shift indices 0, 1, 2 and 3 along this cycle, we have completed the first step.
The next element 0 at index 1 belongs to the same cycle and our check tells us so in only one step (since it is a backwards step).
The same holds for the remaining elements 1 and 2.
In total, we perform 4 + 1 + 1 + 1 'operations'. This is the best-case scenario.
Implementation of this explanation in Python:
def inverse_permutation_zero_based(A):
    """
    Swap elements and indices along cycles of A by following `c = A[c]` until we reach
    our cycle's starting index `c = i`.

    Every element of A belongs to one such cycle. Since we have no space to store
    whether or not an element A[i] has already been processed and needs to be skipped,
    we have to follow its cycle: if we reach an index c < i, we know that this
    element is part of a previously processed cycle.

    Time Complexity: O(n*n), Space Complexity: O(1)
    """

    def cycle(i, A):
        """
        Replace each cycle element with its predecessor's index.
        """
        c_index = i
        c = A[i]
        while True:
            temp = A[c]
            A[c] = c_index  # replace
            c_index = c     # move forward
            c = temp
            if i == c_index:
                break

    for i in range(len(A)):
        # check if this cycle has already been traversed before
        j = A[i]
        while j != i:
            if j <= i:
                break
            j = A[j]
        else:
            cycle(i, A)
    return A

>>> inverse_permutation_zero_based([3, 1, 0, 2, 4])
[2, 1, 3, 0, 4]
This can be done in O(n) time complexity and O(1) space if we store 2 numbers at a single position.
First, let's see how we can get 2 values out of a single variable. Suppose we have a variable x and we want it to hold two values, 2 and 1:
x = n*1 + 2, taking n = 5 here:
x = 5*1 + 2 = 7
Now for 2 we can take the remainder of x, i.e. x % 5, and for 1 we can take the quotient of x, i.e. x / 5.
And if we take n = 3:
x = 3*1 + 2 = 5
x % 3 = 5 % 3 = 2
x / 3 = 5 / 3 = 1
We know the array contains values in the range [0, n-1], so we can take the divisor to be n, the size of the array. We use the above idea to store 2 numbers at every index: one representing the old value and the other representing the new value.
       A                      B
  0  1  2  3  4          0  1  2  3  4
[3, 1, 0, 2, 4]   ->   [2, 1, 3, 0, 4]
a[0] = 3, which means a[3] = 0 in our answer.
a[a[0]] = 2   // old value
a[a[0]] = 0   // new value to store
a[a[0]] = n*new + old = 5*0 + 2 = 2
In general: a[a[i]] = n*i + a[a[i]]
During the traversal, an a[i] value can be greater than n because we are modifying the array as we go, so we use a[i] % n to get the old value. The update is therefore:
a[a[i] % n] = n*i + a[a[i] % n]
Array -> 13 6 15 2 24
Now, to get the old values, take the remainder of each value divided by n; to get the new values, divide each value by n (here n = 5).
Array -> 2 1 3 0 4
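A short Python sketch of this encode/decode pass (my own code, assuming A is a permutation of [0, n) and using floor division for the quotient):

def invert_in_place(a):
    n = len(a)
    # Encode: record i as the "new" value at index a[i] (old value = cell % n).
    for i in range(n):
        a[a[i] % n] += n * i
    # Decode: keep only the "new" value.
    for i in range(n):
        a[i] //= n
    return a

print(invert_in_place([3, 1, 0, 2, 4]))  # [2, 1, 3, 0, 4]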
The following approach optimizes the cycle walk by skipping cycles that have already been handled. Elements are treated as 1-based; the code converts accordingly when accessing the given array.
#include <stdio.h>
#include <iostream>
#include <vector>
#include <bits/stdc++.h>
using namespace std;

// helper function to traverse cycles
void cycle(int i, vector<int>& A) {
    int cur_index = i + 1, next_index = A[i];
    while (next_index > 0) {
        int temp = A[next_index - 1];
        A[next_index - 1] = -(cur_index);
        cur_index = next_index;
        next_index = temp;
        if (i + 1 == abs(cur_index)) {
            break;
        }
    }
}

void inverse_permutation(vector<int>& A) {
    for (int i = 0; i < A.size(); i++) {
        cycle(i, A);
    }
    for (int i = 0; i < A.size(); i++) {
        A[i] = abs(A[i]);
    }
    for (int i = 0; i < A.size(); i++) {
        cout << A[i] << " ";
    }
}

int main() {
    // vector<int> perm = {4,0,3,1,2,5,6,7,8};
    vector<int> perm = {5,1,4,2,3,6,7,9,8};
    // vector<int> perm = {17,2,15,19,3,7,12,4,18,20,5,14,13,6,11,10,1,9,8,16};
    // vector<int> perm = {4, 1, 2, 3};
    // {6,17,9,23,2,10,20,7,11,5,14,13,4,1,25,22,8,24,21,18,19,12,15,16,3} inverts to
    // {14,5,25,13,10,1,8,17,3,6,9,22,12,11,23,24,2,20,21,7,19,16,4,18,15}
    // vector<int> perm = {6, 17, 9, 23, 2, 10, 20, 7, 11, 5, 14, 13, 4, 1, 25, 22, 8, 24, 21, 18, 19, 12, 15, 16, 3};
    inverse_permutation(perm);
    return 0;
}

Find the number of unordered pair in an array

I ran into an interesting algorithm problem:
Given an array of integers, find the number of un-ordered pairs in that array. Say, given {1, 3, 2}, the answer is 1 because {3, 2} is un-ordered; for the array {3, 2, 1}, the answer is 3: {3, 2}, {3, 1}, {2, 1}.
Obviously, this can be solved by brute force with O(n^2) running time: enumerate all possible pairs and count those that are out of order.
My question is: does anybody have a better solution, and how would you do it? It seems like a dynamic programming problem. A snippet of code would be helpful.
It is possible to solve this problem in O(n log n) time using a balanced binary search tree.
Here is pseudocode for this algorithm:
tree = an empty balanced binary search tree
answer = 0
for each element in the array:
    answer += number of elements in the tree greater than this element
    add this element to the tree
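Python's standard library has no order-statistics tree, but the same O(n log n) counting idea can be sketched with a Fenwick (binary indexed) tree over value ranks instead; this is a different data structure than the balanced BST, and the code below is mine, not part of the answer above:

def count_inversions(arr):
    ranks = {v: r + 1 for r, v in enumerate(sorted(set(arr)))}  # 1-based ranks
    tree = [0] * (len(ranks) + 1)

    def add(i):       # record one occurrence of rank i
        while i < len(tree):
            tree[i] += 1
            i += i & -i

    def count_le(i):  # how many recorded ranks are <= i
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & -i
        return s

    inversions = 0
    for seen, v in enumerate(arr):
        # elements already seen that are strictly greater than v
        inversions += seen - count_le(ranks[v])
        add(ranks[v])
    return inversions

print(count_inversions([3, 2, 1]))  # 3
print(count_inversions([1, 3, 2]))  # 1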
If you are just looking for the number of un-ordered pairs and the array is sorted in descending order (so that every pair is out of order, as in your {3, 2, 1} example), you can use the formula n * (n - 1) / 2.
If the array has n elements, for example 3 in your case, that gives 3 * 2 / 2 = 3, assuming there are no duplicate elements.
You can use a modified version of merge sort to count the number of inversions. The trick is that while merging two sorted subarrays you can tell which elements are out of place.
If an element of the right subarray needs to go before remaining elements of the left subarray, each of those left elements forms an inverted pair with it.
I've written the code for this in Python; see the explanation below it for better understanding. If you can't follow it, I'd suggest you revisit merge sort, after which this should be intuitive.
def merge_sort(l):
    if len(l) <= 1:
        return (0, l)
    else:
        mid = len(l) // 2
        count_left, ll = merge_sort(l[0:mid])
        count_right, lr = merge_sort(l[mid:])
        count_merge, merged = merge(ll, lr)
        total = count_left + count_right + count_merge
        return total, merged

def merge(left, right):
    li, ri = 0, 0
    merged = []
    count = 0
    while li < len(left) and ri < len(right):
        if left[li] <= right[ri]:
            merged.append(left[li])
            li += 1
        else:
            # right[ri] goes before every remaining element of left,
            # so each of those elements forms an inversion with it
            count += len(left) - li
            merged.append(right[ri])
            ri += 1
    if li < len(left):
        merged.extend(left[li:])
    elif ri < len(right):
        merged.extend(right[ri:])
    return count, merged

if __name__ == '__main__':
    # example
    l = [6, 1, 2, 3, 4, 5]
    print('inverse pair count is %s' % merge_sort(l)[0])
Merge sort runs in O(n log n) time.
For the passed list l, merge_sort returns a tuple (inversion_count, sorted_list).
The merge step counts the number of inversions and stores it in the variable count.
This was in one of my practice midterms, and I think a nested for loop does the job pretty nicely.
public static void main(String args[]) {
    int IA[] = {6, 2, 9, 5, 8, 7};
    int cntr = 0;
    for (int i = 0; i <= IA.length - 1; i++) {
        for (int j = i; j <= IA.length - 1; j++) {
            if (IA[i] > IA[j]) {
                System.out.print("(" + IA[i] + "," + IA[j] + ")" + ";");
                cntr++;
            }
        }
    }
    System.out.println(cntr);
}
You can use a modified merge-sort algorithm. Merging would look something like this:
merge(a, b):
    i = 0
    j = 0
    c = new int[a.length + b.length]
    inversions = 0
    for (k = 0; i < a.length && j < b.length; k++):
        if (a[i] > b[j]):
            inversions += a.length - i  // b[j] is inverted with every remaining element of a
            c[k] = b[j]
            j++
        else:
            c[k] = a[i]
            i++
    // dump the rest of the longer array into c
    return inversions
Merging is done in O(n) time, and the time complexity of the whole merge sort is O(n log n).
