Algorithm to apply permutation in constant memory space - arrays

I saw this question in a programming interview book; here I'm simplifying it.
Assume you have an array A of length n, and you have a permutation array P of length n as well. Your method will return an array where the elements of A appear in the order given by the indices in P.
Quick example: your method takes A = [a, b, c, d, e] and P = [4, 3, 2, 0, 1]; then it will return [e, d, c, a, b]. You are allowed to use only constant space (i.e. you can't allocate another array, which takes O(n) space).
Ideas?

There is a trivial O(n^2) algorithm, but you can do this in O(n). E.g.:
A = [a, b, c, d, e]
P = [4, 3, 2, 0, 1]
We can swap each element in A with the element required by P. After each swap, one more element is in the right position. We do this in a circular fashion for each position (the elements being swapped are marked with ^):
[a, b, c, d, e] <- P[0] = 4 != 0 (where a initially was), swap 0 (where a is) with 4
^ ^
[e, b, c, d, a] <- P[4] = 1 != 0 (where a initially was), swap 4 (where a is) with 1
^ ^
[e, a, c, d, b] <- P[1] = 3 != 0 (where a initially was), swap 1 (where a is) with 3
^ ^
[e, d, c, a, b] <- P[3] = 0 == 0 (where a initially was), finish step
After one cycle, we find the next element in the array that is not yet in the right position and repeat. In the end you get the result you want, and since each position is touched a constant number of times (at most one swap is performed per position), it takes O(n) time.
You can store the information about which elements are already in place by either:
setting the corresponding entry in P to -1, which is unrecoverable: after the operations above, P becomes [-1, -1, 2, -1, -1], which shows that only index 2 might still be out of place; a further step makes sure it is in the right position and terminates the algorithm; or
setting the corresponding entry P[i] to -P[i] - 1: P becomes [-5, -4, 2, -1, -2], which can be recovered in O(n) trivially.
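As a rough illustration, the cycle-following idea with the recoverable marking could be sketched in Python as below. The function name apply_permutation_in_place and the encoding of a visited entry as -P[i] - 1 are assumptions made for this sketch; it temporarily mutates P and restores it before returning.

```python
def apply_permutation_in_place(a, p):
    """Rearrange a in place so that new a[i] == old a[p[i]].

    p is used as scratch space (visited entries become negative)
    and is restored before returning.
    """
    n = len(a)
    for start in range(n):
        if p[start] < 0:          # already placed as part of an earlier cycle
            continue
        i = start
        while p[i] != start:      # walk the cycle until it closes
            src = p[i]
            a[i], a[src] = a[src], a[i]
            p[i] = -src - 1       # mark position i as done, recoverably
            i = src
        p[i] = -p[i] - 1          # mark the last position of the cycle
    for i in range(n):            # undo the marking: -(-x - 1) - 1 == x
        p[i] = -p[i] - 1
    return a


a = ["a", "b", "c", "d", "e"]
p = [4, 3, 2, 0, 1]
apply_permutation_in_place(a, p)
print(a)  # ['e', 'd', 'c', 'a', 'b']
print(p)  # [4, 3, 2, 0, 1]
```

Every position is swapped at most once and marked exactly once, so the sketch does O(n) work and needs no storage beyond P itself.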

Yet another unnecessary answer! This one preserves the permutation array P explicitly, which was necessary for my situation, but sacrifices in cost. Also this does not require tracking the correctly placed elements. I understand that a previous answer provides the O(N) solution, so I guess this one is just for amusement!
We get best case complexity O(N), worst case O(N^2), and average case O(NlogN). For large arrays (N~10000 or greater), the average case is essentially O(N).
Here is the core algorithm in Java (I mean pseudo-code *cough cough*)
int ind=0;
float temp=0;
for(int i=0; i<(n-1); i++){
// get next index
ind = P[i];
while(ind<i)
ind = P[ind];
// swap elements in array
temp = A[i];
A[i] = A[ind];
A[ind] = temp;
}
Here is an example of the algorithm running (similar to previous answers):
let A = [a, b, c, d, e]
and P = [2, 4, 3, 0, 1]
then expected = [c, e, d, a, b]
i=0: [a, b, c, d, e] // (ind=P[0]=2)>=0 no while loop, swap A[0]<->A[2]
^ ^
i=1: [c, b, a, d, e] // (ind=P[1]=4)>=1 no while loop, swap A[1]<->A[4]
^ ^
i=2: [c, e, a, d, b] // (ind=P[2]=3)>=2 no while loop, swap A[2]<->A[3]
^ ^
i=3a: [c, e, d, a, b] // (ind=P[3]=0)<3 uh-oh! enter while loop...
^
i=3b: [c, e, d, a, b] // loop iteration: ind<-P[0]. now have (ind=2)<3
? ^
i=3c: [c, e, d, a, b] // loop iteration: ind<-P[2]. now have (ind=3)>=3
? ^
i=3d: [c, e, d, a, b] // good index found. Swap A[3]<->A[3]
^
done.
This algorithm can bounce around in that while loop for any indices j<i, up to at most i times during the ith iteration. In the worst case (I think!) each iteration of the outer for loop would result in i extra assignments from the while loop, so we'd have an arithmetic series thing going on, which would add an N^2 factor to the complexity! Running this for a range of N and averaging the number of 'extra' assignments needed by the while loop (averaged over many permutations for each N, that is), though, strongly suggests to me that the average case is O(NlogN).
Thanks!

The simplest case is when an element needs only a single swap to reach its destination index. For example:
array=abcd
perm =1032. You just need two direct swaps: an ab swap and a cd swap.
For other cases, we need to keep swapping until an element reaches its final destination. For example, with abcd and 3021, starting with the first element, we swap a and d. We check whether a's destination is 0 at perm[perm[0]]. It's not, so we swap a with the element at array[perm[perm[0]]], which is b. Again we check whether a has reached its destination at perm[perm[perm[0]]], and yes, it has, so we stop.
We repeat this for each array index.
Every item is moved in-place only once, so it's O(N) with O(1) storage.
def permute(array, perm):
for i in range(len(array)):
elem, p = array[i], perm[i]
while( p != i ):
elem, array[p] = array[p], elem
elem = array[p]
p = perm[p]
return array

@RinRisson has given the only completely correct answer so far! Every other answer has been something that required extra storage: O(n) stack space, or assuming that the permutation P was conveniently stored adjacent to O(n) unused-but-mutable sign bits, or whatever.
Here's RinRisson's correct answer written out in C++. This passes every test I have thrown at it, including an exhaustive test of every possible permutation of length 0 through 11.
Notice that you don't even need the permutation to be materialized; we can treat it as a completely black-box function OldIndex -> NewIndex:
template<class RandomIt, class F>
void permute(RandomIt first, RandomIt last, const F& p)
{
using IndexType = std::decay_t<decltype(p(0))>;
IndexType n = last - first;
for (IndexType i = 0; i + 1 < n; ++i) {
IndexType ind = p(i);
while (ind < i) {
ind = p(ind);
}
using std::swap;
swap(*(first + i), *(first + ind));
}
}
Or slap a more STL-ish interface on top:
template<class RandomIt, class ForwardIt>
void permute(RandomIt first, RandomIt last, ForwardIt pfirst, ForwardIt plast)
{
assert(std::distance(first, last) == std::distance(pfirst, plast));
permute(first, last, [&](auto i) { return *std::next(pfirst, i); });
}

You can successively put the desired element at the front of the array, while working with the remaining array of size (n-1) in the next iteration step.
The permutation array needs to be accordingly adjusted to reflect the decreasing size of the array. Namely, if the element you placed in the front was found at position "X" you need to decrease by one all the indexes greater or equal to X in the permutation table.
In the case of your example:
array permutation -> adjusted permutation
A = {[a b c d e]} [4 3 2 0 1]
A1 = { e [a b c d]} [3 2 0 1] -> [3 2 0 1] (decrease all indexes >= 4)
A2 = { e d [a b c]} [2 0 1] -> [2 0 1] (decrease all indexes >= 3)
A3 = { e d c [a b]} [0 1] -> [0 1] (decrease all indexes >= 2)
A4 = { e d c a [b]} [1] -> [0] (decrease all indexes >= 0)
Another example:
A0 = {[a b c d e]} [0 2 4 3 1]
A1 = { a [b c d e]} [2 4 3 1] -> [1 3 2 0] (decrease all indexes >= 0)
A2 = { a c [b d e]} [3 2 0] -> [2 1 0] (decrease all indexes >= 2)
A3 = { a c e [b d]} [1 0] -> [1 0] (decrease all indexes >= 2)
A4 = { a c e d [b]} [0] -> [0] (decrease all indexes >= 1)
The algorithm, though not the fastest, avoids the extra memory allocation while still keeping the track of the initial order of elements.
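A minimal Python sketch of this shrinking-array idea is given below. The answer does not say how the chosen element is moved to the front, so the sketch assumes it is done by shifting the skipped elements one place to the right, which keeps the rest in their original order. The function name permute_by_extraction is hypothetical, and P is overwritten with the adjusted indices as the loop proceeds.

```python
def permute_by_extraction(a, p):
    """Rearrange a in place so that new a[k] == old a[p[k]].

    At step k, p[k] is an index relative to the not-yet-placed suffix a[k:].
    Runs in O(n^2) time, uses O(1) extra space, and overwrites p.
    """
    n = len(a)
    for k in range(n):
        x = p[k]
        item = a[k + x]
        # Shift a[k : k + x] one step right to make room at position k.
        for j in range(k + x, k, -1):
            a[j] = a[j - 1]
        a[k] = item
        # Elements that sat after the extracted one moved one step closer
        # to the start of the remaining suffix, so adjust their indices.
        for j in range(k + 1, n):
            if p[j] > x:
                p[j] -= 1
    return a


print(permute_by_extraction(list("abcde"), [0, 2, 4, 3, 1]))
# ['a', 'c', 'e', 'd', 'b']
```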

Here is a clearer version which takes a swapElements function that accepts indices, e.g., std::swap(Item[cycle], Item[P[cycle]]).
Essentially it runs through all elements and follows the cycles if they haven't been visited yet. Instead of the second check !visited[P[cycle]], we could also compare with the first element in the cycle which has been done somewhere else above.
bool visited[n] = {0};
for (int i = 0; i < n; i++) {
int cycle = i;
while(! visited[cycle] && ! visited[P[cycle]]) {
swapElements(cycle,P[cycle]);
visited[cycle]=true;
cycle = P[cycle];
}
}

Just a simple example C/C++ code addition to the Ziyao Wei's answer. Code is not allowed in comments, so as an answer, sorry:
for (int i = 0; i < count; ++i)
{
// Skip to the next non-processed item
if (destinations[i] < 0)
continue;
int currentPosition = i;
// destinations[X] = Y means "an item on position Y should be at position X"
// So we should move an item that is now at position X somewhere
// else - swap it with item on position Y. Then we have a right
// item on position X, but the original X-item now on position Y,
// maybe should be occupied by someone else (an item Z). So we
// check destinations[Y] = Z and move the X-item further until we got
// destinations[?] = X which mean that on position ? should be an item
// from position X - which is exactly the X-item we've been kicking
// around all this time. Loop closed.
//
// Each permutation has one or more such loops, they obviously
// don't intersect, so we may mark each processed position as such
// and once the loop is over go further down by an array from
// position X searching for a non-marked item to start a new loop.
while (destinations[currentPosition] != i)
{
const int target = destinations[currentPosition];
std::swap(items[currentPosition], items[target]);
destinations[currentPosition] = -1 - target;
currentPosition = target;
}
// Mark last current position as swapped before moving on
destinations[currentPosition] = -1 - destinations[currentPosition];
}
for (int i = 0; i < count; ++i)
destinations[i] = -1 - destinations[i];
(for C - replace std::swap with something else)

Trace back what we have already swapped by checking the index.
Java, O(N) swaps, O(1) space:
static void swap(char[] arr, int x, int y) {
char tmp = arr[x];
arr[x] = arr[y];
arr[y] = tmp;
}
public static void main(String[] args) {
int[] intArray = new int[]{4,2,3,0,1};
char[] charArray = new char[]{'A','B','C','D','E'};
for(int i=0; i<intArray.length; i++) {
int index_to_swap = intArray[i];
// Check index if it has already been swapped before
while (index_to_swap < i) {
// trace back the index
index_to_swap = intArray[index_to_swap];
}
swap(charArray, index_to_swap, i);
}
}

I agree with many solutions here, but below is a very short code snippet that permutes along a single permutation cycle (the one containing index 0):
def _swap(a, i, j):
    a[i], a[j] = a[j], a[i]
def apply_permutation(a, p):
    idx = 0
    while p[idx] != 0:
        _swap(a, idx, p[idx])
        idx = p[idx]
So the code snippet below
a = list(range(1, 5))
p = [1, 3, 2, 0]
apply_permutation(a, p)
print(a)
outputs [2, 4, 3, 1].

Related

Mutating an array without extra space

I was given the following question in an interview, and couldn't find the solution.
Given is an array of chars length n, and "important section" (all chars in this section must be saved) length m where n >= m >= 0 as follows:
Without extra space, perform the following process:
Remove all occurrences of A and duplicate all occurrences of B, return a sub array of the mutated array. For example, for the above array [C,A,X,B,B,F,Q] n=7, m=5 ,output will be [C,X,B,B,B,B]. Note that the mutated array length is 6, since Q was in the redundant section and B was duplicated.
Return -1 if the operation can't be performed.
Examples:
n=2, m=2 , [A,B] => [B,B]
n=2, m=2 , [B,B] => -1 (since the result [B,B,B,B] is larger than the array)
n=3, m=2 , [A,B,C] => [B,B]
n=3, m=3 , [A,B,C] => [B,B,C]
n=3, m=2 , [Z,B,A] => [Z,B,B] (since A was in the redundant section)
Looking for a code example, Could this be done in O(n) time complexity?
Scan the array to determine whether the mutated array fits in the available space: count the As and Bs, and check that N-M >= numB-numA
Walk array left to right: Shift elements to the left by the number of As so far (filling places of A)
Walk array right to left: Shift elements to the right by numB-B_so_far, inserting additional Bs
Start from the end of the input array. We will figure out from the back to the front what to fill in.
Look at the last significant character in the input (position m). If it is an A, ignore it. Otherwise, add the symbol. Repeat until you have read all the input.
This removes the As. Now we will duplicate the Bs.
Start from the beginning of the array. Find the last value you wrote during the above steps. If it is a B, write two Bs. If it is something else, just write one of them. Repeat. NOTE: if you ever "catch up", needing to write where you need to read, you don't have enough room and you output -1. Otherwise, return the part of the array from position 1 to the last written position.
Example:
Phase 1: removing A
CAXBBFQ
CAXBBFB
CAXBBBB
CAXBXBB
CAXCXBB
Phase 2: duplicating B
CAXCXBB
CXXCXBB
CXBBXBB
CXBBBBB
^^^^^^
Phase 1 is linear (we read m symbols and write no more than m).
Phase 2 is linear (we read fewer than m symbols and write no more than 2m).
m is at most n, so everything is O(m) and therefore O(n).
The code, with some optimizations, would look something like this, O(n):
// returns length of the relevant part of the mutated array or -1
public static int mutate(char[] a, int m) {
// delete As and count Bs in the relevant part
int bCount = 0, position = 0;
for (int i = 0; i < m; i++) {
if (a[i] != 'A') {
if (a[i] == 'B')
bCount++;
a[position++] = a[i];
}
}
// check if it is possible
int n = bCount + position;
if (n > a.length)
return -1;
// duplicate the Bs in the relevant part
for (int i = position - 1, index = n - 1; i >= 0; i--) {
if (a[i] != 'B') {
a[index--] = a[i];
} else {
a[index--] = 'B';
a[index--] = 'B';
}
}
return n;
}

Is it possible to invert an array with constant extra space?

Let's say I have an array A with n unique elements on the range [0, n). In other words, I have a permutation of the integers [0, n).
Is it possible to transform A into B using O(1) extra space (AKA in-place) such that B[A[i]] = i?
For example:
A B
[3, 1, 0, 2, 4] -> [2, 1, 3, 0, 4]
Yes, it is possible, with O(n^2) time algorithm:
Take element at index 0, then write 0 to the cell indexed by that element. Then use just overwritten element to get next index and write previous index there. Continue until you go back to index 0. This is cycle leader algorithm.
Then do the same starting from index 1, 2, ... But before doing any changes perform cycle leader algorithm without any modifications starting from this index. If this cycle contains any index below the starting index, just skip it.
Or this O(n^3) time algorithm:
Take element at index 0, then write 0 to the cell indexed by that element. Then use just overwritten element to get next index and write previous index there. Continue until you go back to index 0.
Then do the same starting from index 1, 2, ... But before doing any changes perform cycle leader algorithm without any modifications starting from all preceding indexes. If current index is present in any preceding cycle, just skip it.
I have written a (slightly optimized) implementation of the O(n^2) algorithm in C++11 to determine how many additional accesses are needed for each element on average when a random permutation is inverted. Here are the results:
size accesses
2^10 2.76172
2^12 4.77271
2^14 6.36212
2^16 7.10641
2^18 9.05811
2^20 10.3053
2^22 11.6851
2^24 12.6975
2^26 14.6125
2^28 16.0617
While size grows exponentially, number of element accesses grows almost linearly, so expected time complexity for random permutations is something like O(n log n).
Inverting an array A requires us to find a permutation B which fulfills the requirement A[B[i]] == i for all i.
To build the inverse in-place, we have to swap elements and indices by setting A[A[i]] = i for each element A[i]. Obviously, if we simply iterated through A and performed the aforementioned replacement, we might overwrite upcoming elements in A and our computation would fail.
Therefore, we have to swap elements and indices along cycles of A by following c = A[c] until we reach our cycle's starting index c = i.
Every element of A belongs to one such cycle. Since we have no space to store whether or not an element A[i] has already been processed and needs to be skipped, we have to follow its cycle: If we reach an index c < i we would know that this element is part of a previously processed cycle.
This algorithm has a worst-case run-time complexity of O(n²), an average run-time complexity of O(n log n) and a best-case run-time complexity of O(n).
function invert(array) {
main:
for (var i = 0, length = array.length; i < length; ++i) {
// check if this cycle has already been traversed before:
for (var c = array[i]; c != i; c = array[c]) {
if (c <= i) continue main;
}
// Replacing each cycle element with its predecessors index:
var c_index = i,
c = array[i];
do {
var tmp = array[c];
array[c] = c_index; // replace
c_index = c; // move forward
c = tmp;
} while (i != c_index)
}
return array;
}
console.log(invert([3, 1, 0, 2, 4])); // [2, 1, 3, 0, 4]
Example for A = [1, 2, 3, 0] :
The first element 1 at index 0 belongs to the cycle of elements 1 - 2 - 3 - 0. Once we shift indices 0, 1, 2 and 3 along this cycle, we have completed the first step.
The next element 0 at index 1 belongs to the same cycle and our check tells us so in only one step (since it is a backwards step).
The same holds for the remaining elements 1 and 2.
In total, we perform 4 + 1 + 1 + 1 'operations'. This is the best-case scenario.
Implementation of this explanation in Python:
def inverse_permutation_zero_based(A):
    """
    Swap elements and indices along cycles of A by following `c = A[c]` until we reach
    our cycle's starting index `c = i`.
    Every element of A belongs to one such cycle. Since we have no space to store
    whether or not an element A[i] has already been processed and needs to be skipped,
    we have to follow its cycle: If we reach an index c < i we would know that this
    element is part of a previously processed cycle.
    Time Complexity: O(n*n), Space Complexity: O(1)
    """
    def cycle(i, A):
        """
        Replacing each cycle element with its predecessors index
        """
        c_index = i
        c = A[i]
        while True:
            temp = A[c]
            A[c] = c_index  # replace
            c_index = c  # move forward
            c = temp
            if i == c_index:
                break
    for i in range(len(A)):
        # check if this cycle has already been traversed before
        j = A[i]
        while j != i:
            if j <= i:
                break
            j = A[j]
        else:
            cycle(i, A)
    return A
>>> inverse_permutation_zero_based([3, 1, 0, 2, 4])
[2, 1, 3, 0, 4]
This can be done in O(n) time complexity and O(1) space if we try to store 2 numbers at a single position.
First, let's see how we can get 2 values from a single variable. Suppose we have a variable x and we want to get two values from it, 2 and 1. So,
x = n*1 + 2 , suppose n = 5 here.
x = 5*1 + 2 = 7
Now for 2, we can take remainder of x, ie, x%5. And for 1, we can take quotient of x, ie , x/5
and if we take n = 3
x = 3*1 + 2 = 5
x%3 = 5%3 = 2
x/3 = 5/3 = 1
We know here that the array contains values in range [0, n-1], so we can take the divisor as n, size of array. So, we will use the above concept to store 2 numbers at every index, one will represent old value and other will represent the new value.
A B
0 1 2 3 4 0 1 2 3 4
[3, 1, 0, 2, 4] -> [2, 1, 3, 0, 4]
a[0] = 3, that means, a[3] = 0 in our answer.
a[a[0]] = 2 //old
a[a[0]] = 0 //new
a[a[0]] = n* new + old = 5*0 + 2 = 2
a[a[i]] = n*i + a[a[i]]
And during array traversal, a[i] value can be greater than n because we are modifying it. So we will use a[i]%n to get the old value.
So the logic should be
a[a[i]%n] = n*i + a[a[i]%n]
Array -> 13 6 15 2 24
Now, to get the older values, take the remainder on dividing each value by n, and to get the new values, just divide each value by n, in this case, n=5.
Array -> 2 1 3 0 4
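The two-values-per-cell trick above can be sketched in Python as follows. The function name invert_with_encoding is hypothetical, and the sketch assumes each cell can hold values up to n*n - 1; with fixed-width integers that bound has to fit the element type, so the "O(1) extra space" claim depends on that assumption.

```python
def invert_with_encoding(a):
    """Invert a permutation of 0..n-1 in place.

    During the first pass each cell holds n * new + old:
    the old value is cell % n and the new value is cell // n.
    """
    n = len(a)
    for i in range(n):
        old = a[i] % n        # a[i] may already carry a new value, so strip it
        a[old] += n * i       # store i as the new value of cell `old`
    for i in range(n):
        a[i] //= n            # keep only the new values
    return a


print(invert_with_encoding([3, 1, 0, 2, 4]))  # [2, 1, 3, 0, 4]
```

After the first loop the example array is [13, 6, 15, 2, 24], matching the intermediate state shown above.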
The following approach optimizes the cycle walk by skipping cycles that have already been handled. Each element is 1-based, so convert accordingly when accessing elements of the given array.
#include <stdio.h>
#include <iostream>
#include <vector>
#include <bits/stdc++.h>
using namespace std;
// helper function to traverse cycles
void cycle(int i, vector<int>& A) {
int cur_index = i+1, next_index = A[i];
while (next_index > 0) {
int temp = A[next_index-1];
A[next_index-1] = -(cur_index);
cur_index = next_index;
next_index = temp;
if (i+1 == abs(cur_index)) {
break;
}
}
}
void inverse_permutation(vector<int>& A) {
for (int i = 0; i < A.size(); i++) {
cycle(i, A);
}
for (int i = 0; i < A.size(); i++) {
A[i] = abs(A[i]);
}
for (int i = 0; i < A.size(); i++) {
cout<<A[i]<<" ";
}
}
int main(){
// vector<int> perm = {4,0,3,1,2,5,6,7,8};
vector<int> perm = {5,1,4,2,3,6,7,9,8};
//vector<int> perm = { 17,2,15,19,3,7,12,4,18,20,5,14,13,6,11,10,1,9,8,16};
// vector<int> perm = {4, 1, 2, 3};
// { 6,17,9,23,2,10,20,7,11,5,14,13,4,1,25,22,8,24,21,18,19,12,15,16,3 } =
// { 14,5,25,13,10,1,8,17,3,6,9,22,12,11,23,24,2,20,21,7,19,16,4,18,15 }
// vector<int> perm = {6, 17, 9, 23, 2, 10, 20, 7, 11, 5, 14, 13, 4, 1, 25, 22, 8, 24, 21, 18, 19, 12, 15, 16, 3};
inverse_permutation(perm);
return 0;
}

Finding contiguous ranges in arrays

You are given an array of integers. You have to output the largest range so that all numbers in the range are present in the array. The numbers might be present in any order. For example, suppose that the array is
{2, 10, 3, 12, 5, 4, 11, 8, 7, 6, 15}
Here we find two (nontrivial) ranges for which all the integers in these ranges are present in the array, namely [2,8] and [10,12]. Out of these [2,8] is the longer one. So we need to output that.
When I was given this question, I was asked to do this in linear time and without using any sorting. I thought that there might be a hash-based solution, but I couldn't come up with anything.
Here's my attempt at a solution:
void printRange(int arr[])
{
int n=sizeof(arr)/sizeof(int);
int size=2;
int tempans[2];
int answer[2];// the range is stored in another array
for(int i =0;i<n;i++)
{
if(arr[0]<arr[1])
{
answer[0]=arr[0];
answer[1]=arr[1];
}
if(arr[1]<arr[0])
{
answer[0]=arr[1];
answer[1]=arr[0];
}
if(arr[i] < answer[1])
size += 1;
else if(arr[i]>answer[1]) {
initialize tempans to new range;
size2=2;
}
else {
initialize tempans to new range
}
}
//I have to check when the count becomes equal to the diff of the range
I am stuck at this part... I can't figure out how many tempanswer[] arrays should be used.
I think that the following solution will work in O(n) time using O(n) space.
Begin by putting all of the entries in the array into a hash table. Next, create a second hash table which stores elements that we have "visited," which is initially empty.
Now, iterate across the array of elements one at a time. For each element, check if the element is in the visited set. If so, skip it. Otherwise, count up from that element upward. At each step, check if the current number is in the main hash table. If so, continue onward and mark the current value as part of the visited set. If not, stop. Next, repeat this procedure, except counting downward. This tells us the number of contiguous elements in the range containing this particular array value. If we keep track of the largest range found this way, we will have a solution to our problem.
The runtime complexity of this algorithm is O(n). To see this, note that we can build the hash table in the first step in O(n) time. Next, when we begin scanning the array to find the largest range, each range scanned takes time proportional to the length of that range. Since the total sum of the lengths of the ranges is the number of elements in the original array, and since we never scan the same range twice (because we mark each number that we visit), this second step takes O(n) time as well, for a net runtime of O(n).
EDIT: If you're curious, I have a Java implementation of this algorithm, along with a much more detailed analysis of why it works and why it has the correct runtime. It also explores a few edge cases that aren't apparent in the initial description of the algorithm (for example, how to handle integer overflow).
Hope this helps!
The solution could use BitSet:
public static void detect(int []ns) {
BitSet bs = new BitSet();
for (int i = 0; i < ns.length; i++) {
bs.set(ns[i]);
}
int begin = 0;
int setpos = -1;
while((setpos = bs.nextSetBit(begin)) >= 0) {
begin = bs.nextClearBit(setpos);
System.out.print("[" + setpos + " , " + (begin - 1) + "]");
}
}
Sample I/O:
detect(new int[] {2,10, 3, 12, 5,4, 11, 8, 7, 6, 15} );
[2,8] [10,12] [15,15]
Here is the solution in Java:
public class Solution {
public int longestConsecutive(int[] num) {
int longest = 0;
Map<Integer, Boolean> map = new HashMap<Integer, Boolean>();
for(int i = 0; i< num.length; i++){
map.put(num[i], false);
}
int l, k;
for(int i = 0;i < num.length;i++){
if(map.containsKey(num[i]-1) || map.get(num[i])) continue;
map.put(num[i], true);
l = 0; k = num[i];
while (map.containsKey(k)){
l++;
k++;
}
if(longest < l) longest = l;
}
return longest;
}
}
Other approaches here.
The above answer by template will work, but you don't need a hash table. Hashing could take a long time depending on what algorithm you use. You can ask the interviewer if there's a max number the integer can be, then create an array of that size. Call it exist[]. Then scan through arr and mark exist[arr[i]] = 1. Then iterate through exist[], keeping track of 4 variables: the size of the current largest range, the beginning of the current largest range, the size of the current range, and the beginning of the current range. When you see exist[i] = 0, compare the current range values against the largest range values and update the largest range values if needed.
If there's no max value then you might have to go with the hashing method.
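A small Python sketch of the exist[] approach is shown below. It assumes all values are non-negative integers no larger than a known max_value; the function name largest_range_bounded and that parameter are made up for this illustration. One extra slot at the end acts as a sentinel so the final run is also compared.

```python
def largest_range_bounded(arr, max_value):
    """Return (low, high) of the longest run of consecutive integers in arr.

    Assumes 0 <= v <= max_value for every v in arr.
    Uses O(max_value) extra space and O(n + max_value) time.
    """
    exist = [False] * (max_value + 2)   # last slot stays False (sentinel)
    for v in arr:
        exist[v] = True

    best_start, best_len = 0, 0
    cur_start, cur_len = 0, 0
    for i, present in enumerate(exist):
        if present:
            if cur_len == 0:
                cur_start = i           # a new run begins here
            cur_len += 1
        else:
            if cur_len > best_len:      # a run just ended: compare with best
                best_start, best_len = cur_start, cur_len
            cur_len = 0
    if best_len == 0:
        return None                     # empty input
    return (best_start, best_start + best_len - 1)


print(largest_range_bounded([2, 10, 3, 12, 5, 4, 11, 8, 7, 6, 15], 15))  # (2, 8)
```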
Actually, considering that we're only sorting integers and therefore a comparison sort is NOT necessary, you can just sort the array using a radix or bucket sort and then iterate through it.
Simple and certainly not what the interviewer wanted to hear, but correct nonetheless ;)
A Haskell implementation of Grigor Gevorgyan's solution, from another who didn't get a chance to post before the question was marked as a duplicate...(simply updates the hash and the longest range so far, while traversing the list)
import qualified Data.HashTable.IO as H
import Control.Monad.Random
f list = do
h <- H.new :: IO (H.BasicHashTable Int Int)
g list (0,[]) h where
g [] best h = return best
g (x:xs) best h = do
m <- H.lookup h x
case m of
Just _ -> g xs best h
otherwise -> do
(xValue,newRange) <- test
H.insert h x xValue
g xs (maximum [best,newRange]) h
where
test = do
m1 <- H.lookup h (x-1)
m2 <- H.lookup h (x+1)
case m1 of
Just x1 -> case m2 of
Just x2 -> do H.insert h (x-1) x2
H.insert h (x+1) x1
return (x,(x2 - x1 + 1,[x1,x2]))
Nothing -> do H.insert h (x-1) x
return (x1,(x - x1 + 1,[x,x1]))
Nothing -> case m2 of
Just x2 -> do H.insert h (x+1) x
return (x2,(x2 - x + 1,[x,x2]))
Nothing -> do return (x,(1,[x]))
rnd :: (RandomGen g) => Rand g Int
rnd = getRandomR (-100,100)
main = do
values <- evalRandIO (sequence (replicate (1000000) rnd))
f values >>= print
Output:
*Main> main
(10,[40,49])
(5.30 secs, 1132898932 bytes)
I read a lot of solutions on multiple platforms to this problem and one got my attention, as it solves the problem very elegantly and it is easy to follow.
The backbone of this method is to create a set/hash, which takes O(n) time; from there, every access to the set/hash is O(1). As O-notation omits constant factors, this algorithm can still be described overall as O(n).
def longestConsecutive(self, nums):
    nums = set(nums)                 # Create the set/hash, O(n)
    best = 0
    for x in nums:
        if x - 1 not in nums:        # Optimization: only start at the beginning of a run
            y = x + 1                # Get possible next number
            while y in nums:         # If the next number is in set/hash
                y += 1               # keep counting
            best = max(best, y - x)  # counting done, update best
    return best
It's straightforward if you run through it with simple numbers. The optimization step is just a short-circuit to make sure you start counting only when that specific number is the beginning of a sequence.
All Credits to Stefan Pochmann.
Very short solution using Javascript sparse array feature:
O(n) time using O(n) additional space.
var arr = [2, 10, 3, 12, 5, 4, 11, 8, 7, 6, 15];
var a = [];
var count = 0, max_count = 0;
for (var i=0; i < arr.length; i++) a[arr[i]] = true;
for (i = 0; i < a.length; i++) {
count = (a[i]) ? count + 1 : 0;
max_count = Math.max(max_count, count);
}
console.log(max_count); // 7
A quick way to do it (PHP) :
$tab = array(14,12,1,5,7,3,4,10,11,8);
asort($tab);
$tab = array_values($tab);
$tab_contiguous = array();
$i=0;
foreach ($tab as $key => $val) {
$tab_contiguous[$i][] = $tab[$key];
if (isset($tab[$key+1])) {
if($tab[$key] + 1 != $tab[$key+1])
$i++;
}
}
echo(json_encode($tab_contiguous));

Interleave array in constant space

Suppose we have an array
a1, a2,... , an, b1, b2, ..., bn.
The goal is to change this array to
a1, b1, a2, b2, ..., an, bn in O(n) time and in O(1) space.
In other words, we need a linear-time algorithm to modify the array in place, with no more than a constant amount of extra storage.
How can this be done?
This is the sequence and notes I worked out with pen and paper. I think it, or a variation, will hold for any larger n.
Each line represents a different step and () signifies what is being moved this step and [] is what has been moved from last step. The array itself is used as storage and two pointers (one for L and one for N) are required to determine what to move next. L means "letter line" and N is "number line" (what is moved).
A B C D 1 2 3 4
L A B C (D) 1 2 3 4 First is L, no need to move last N
N A B C (3) 1 2 [D] 4
L A B (C) 2 1 [3] D 4
N A B 1 (2) [C] 3 D 4
L A (B) 1 [2] C 3 D 4
N A (1) [B] 2 C 3 D 4
A [1] B 2 C 3 D 4 Done, no need to move A
Note the varying "pointer jumps" - the L pointer always decrements by 1 (as it can not be eaten into faster than that), but the N pointer jumps according to if it "replaced itself" (in spot, jump down two) or if it swapped something in (no jump, so the next something can get its go!).
This problem isn't as easy as it seems, but after some thought, the algorithm to accomplish this isn't too bad. You'll notice the first and last element are already in place, so we don't need to worry about them. We will keep a left index variable which represents the first item in the first half of the array that needs changed. After that we set a right index variable to the first item in the 2nd half of the array that needs changed. Now all we do is swap the item at the right index down one-by-one until it reaches the left index item. Increment the left index by 2 and the right index by 1, and repeat until the indexes overlap or the left goes past the right index (the right index will always end on the last index of the array). We increment the left index by two every time because the item at left + 1 has already naturally fallen into place.
Pseudocode
1. Set left index to 1
2. Set right index to the middle (array length / 2)
3. Swap the item at the right index with the item directly preceding it until it replaces the item at the left index
4. Increment the left index by 2
5. Increment the right index by 1
6. Repeat 3 through 5 until the left index becomes greater than or equal to the right index
Interleaving algorithm in C(#)
protected void Interleave(int[] arr)
{
int left = 1;
int right = arr.Length / 2;
int temp;
while (left < right)
{
for (int i = right; i > left; i--)
{
temp = arr[i];
arr[i] = arr[i - 1];
arr[i - 1] = temp;
}
left += 2;
right += 1;
}
}
This algorithm uses O(1) storage (with the temp variable, which could be eliminated using the addition/subtraction swap technique). Its runtime, however, is not O(n): for an array of 2n elements, the inner loop performs (n-1) + (n-2) + ... + 1 = n(n-1)/2 swaps in total, so it is O(n^2).
First, the theory: Rearrange the elements in 'permutation cycles'. Take an element and place it at its new position, displacing the element that is currently there. Then you take that displaced element and put it in its new position. This displaces yet another element, so rinse and repeat. If the element displaced belongs to the position of the element you first started with, you have completed one cycle.
Actually, yours is a special case of the question I asked here, which was: How do you rearrange an array to any given order in O(N) time and O(1) space? In my question, the rearranged positions are described by an array of numbers, where the number at the nth position specifies the index of the element in the original array.
However, you don't have this additional array in your problem, and allocating it would take O(N) space. Fortunately, we can calculate the value of any element in this array on the fly, like this:
int rearrange_pos(int x) {
if (x % 2 == 0) return x / 2;
else return (x - 1) / 2 + n; // where n is half the size of the total array
}
I won't duplicate the rearranging algorithm itself here; it can be found in the accepted answer for my question.
Edit: As Jason has pointed out, the answer I linked to still needs to allocate an array of bools, making it O(N) space. This is because a permutation can be made up of multiple cycles. I've been trying to eliminate the need for this array for your special case, but without success. There doesn't seem to be any usable pattern. Maybe someone else can help you here.
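To make the cycle idea concrete, here is a minimal Python sketch that follows the permutation cycles while computing each source index on the fly. It assumes an even-length list, passes n explicitly to rearrange_pos (the original relies on n being in scope), and still uses the O(N) list of flags mentioned in the edit, so it illustrates the technique rather than solving the O(1)-space problem. The name interleave_by_cycles is hypothetical.

```python
def rearrange_pos(x, n):
    """Original index of the element that ends up at position x (n = half the length)."""
    return x // 2 if x % 2 == 0 else (x - 1) // 2 + n


def interleave_by_cycles(a):
    """Interleave the two halves of `a` in place by walking permutation cycles."""
    n = len(a) // 2
    done = [False] * len(a)  # O(N) flags: this is the extra space discussed above
    for start in range(len(a)):
        if done[start]:
            continue
        saved = a[start]  # the element displaced at the start of the cycle
        x = start
        while True:
            done[x] = True
            src = rearrange_pos(x, n)
            if src == start:
                a[x] = saved  # cycle closed
                break
            a[x] = a[src]
            x = src
    return a
```

With [0, 1, 2, 3, 10, 11, 12, 13] this walks two non-trivial cycles, (1, 4, 2) and (3, 5, 6), which is exactly why a single pass from one starting point is not enough.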
This is called the in-place in-shuffle problem. Here is an implementation in C++ based on the one here.
void in_place_in_shuffle(int arr[], int length)
{
assert(arr && length>0 && !(length&1));
// shuffle to {5, 0, 6, 1, 7, 2, 8, 3, 9, 4}
int i,startPos=0;
while(startPos<length)
{
i=_LookUp(length-startPos);
_ShiftN(&arr[startPos+(i-1)/2],(length-startPos)/2,(i-1)/2);
_PerfectShuffle(&arr[startPos],i-1);
startPos+=(i-1);
}
// local swap to {0, 5, 1, 6, 2, 7, 3, 8, 4, 9}
for (int i=0; i<length; i+=2)
swap(arr[i], arr[i+1]);
}
// cycle
void _Cycle(int Data[],int Lenth,int Start)
{
int Cur_index,Temp1,Temp2;
Cur_index=(Start*2)%(Lenth+1);
Temp1=Data[Cur_index-1];
Data[Cur_index-1]=Data[Start-1];
while(Cur_index!=Start)
{
Temp2=Data[(Cur_index*2)%(Lenth+1)-1];
Data[(Cur_index*2)%(Lenth+1)-1]=Temp1;
Temp1=Temp2;
Cur_index=(Cur_index*2)%(Lenth+1);
}
}
// loop-move array
void _Reverse(int Data[],int Len)
{
int i,Temp;
for(i=0;i<Len/2;i++)
{
Temp=Data[i];
Data[i]=Data[Len-i-1];
Data[Len-i-1]=Temp;
}
}
void _ShiftN(int Data[],int Len,int N)
{
_Reverse(Data,Len-N);
_Reverse(&Data[Len-N],N);
_Reverse(Data,Len);
}
// perfect shuffle for lengths satisfying Lenth = 3^k - 1
void _PerfectShuffle(int Data[],int Lenth)
{
int i=1;
if(Lenth==2)
{
i=Data[Lenth-1];
Data[Lenth-1]=Data[Lenth-2];
Data[Lenth-2]=i;
return;
}
while(i<Lenth)
{
_Cycle(Data,Lenth,i);
i=i*3;
}
}
// find the largest 3^k that does not exceed N+1 (never less than 3)
int _LookUp(int N)
{
int i=3;
while(i<=N+1) i*=3;
if(i>3) i=i/3;
return i;
}
Test:
int arr[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
int length = sizeof(arr)/sizeof(int);
in_place_in_shuffle(arr, length);
After this, arr[] will be {0, 5, 1, 6, 2, 7, 3, 8, 4, 9}.
If you can transform the array into a linked-list first, the problem becomes trivial.

Algorithm to determine if array contains n...n+m?

I saw this question on Reddit, and there were no positive solutions presented, and I thought it would be a perfect question to ask here. This was in a thread about interview questions:
Write a method that takes an int array of size m, and returns (True/False) if the array consists of the numbers n...n+m-1, all numbers in that range and only numbers in that range. The array is not guaranteed to be sorted. (For instance, {2,3,4} would return true. {1,3,1} would return false, {1,2,4} would return false.)
The problem I had with this one is that my interviewer kept asking me to optimize (faster O(n), less memory, etc), to the point where he claimed you could do it in one pass of the array using a constant amount of memory. Never figured that one out.
Along with your solutions please indicate if they assume that the array contains unique items. Also indicate if your solution assumes the sequence starts at 1. (I've modified the question slightly to allow cases where it goes 2, 3, 4...)
edit: I am now of the opinion that there does not exist a linear in time and constant in space algorithm that handles duplicates. Can anyone verify this?
The duplicate problem boils down to testing to see if the array contains duplicates in O(n) time, O(1) space. If this can be done you can simply test first and if there are no duplicates run the algorithms posted. So can you test for dupes in O(n) time O(1) space?
Under the assumption numbers less than one are not allowed and there are no duplicates, there is a simple summation identity for this - the sum of numbers from 1 to m in increments of 1 is (m * (m + 1)) / 2. You can then sum the array and use this identity.
You can find out if there is a dupe under the above guarantees, plus the guarantee no number is above m or less than n (which can be checked in O(N))
The idea in pseudo-code:
0) Start at N = 0
1) Take the N-th element in the list.
2) If it is not in the right place if the list had been sorted, check where it should be.
3) If the place where it should be already has the same number, you have a dupe - RETURN TRUE
4) Otherwise, swap the numbers (to put the first number in the right place).
5) With the number you just swapped with, is it in the right place?
6) If no, go back to step two.
7) Otherwise, start at step one with N = N + 1. If this would be past the end of the list, you have no dupes.
And, yes, that runs in O(N) although it may look like O(N ^ 2)
Note to everyone (stuff collected from comments)
This solution works under the assumption you can modify the array, then uses in-place Radix sort (which achieves O(N) speed).
Other mathy-solutions have been put forth, but I'm not sure any of them have been proved. There are a bunch of sums that might be useful, but most of them run into a blowup in the number of bits required to represent the sum, which will violate the constant extra space guarantee. I also don't know if any of them are capable of producing a distinct number for a given set of numbers. I think a sum of squares might work, which has a known formula to compute it (see Wolfram's)
New insight (well, more of musings that don't help solve it but are interesting and I'm going to bed):
So, it has been mentioned to maybe use sum + sum of squares. No one knew if this worked or not, and I realized that it only becomes an issue when (x + y) = (n + m), such as the fact 2 + 2 = 1 + 3. Squares also have this issue thanks to Pythagorean triples (so 3^2 + 4^2 + 25^2 == 5^2 + 7^2 + 24^2, and the sum of squares doesn't work). If we use Fermat's last theorem, we know this can't happen for n^3. But we also don't know if there is no x + y + z = n for this (unless we do and I don't know it). So no guarantee this, too, doesn't break - and if we continue down this path we quickly run out of bits.
In my glee, however, I forgot to note that you can break the sum of squares, but in doing so you create a normal sum that isn't valid. I don't think you can do both, but, as has been noted, we don't have a proof either way.
I must say, finding counterexamples is sometimes a lot easier than proving things! Consider the following sequences, all of which have a sum of 28 and a sum of squares of 140:
[1, 2, 3, 4, 5, 6, 7]
[1, 1, 4, 5, 5, 6, 6]
[2, 2, 3, 3, 4, 7, 7]
I could not find any such examples of length 6 or less. If you want an example that has the proper min and max values too, try this one of length 8:
[1, 3, 3, 4, 4, 5, 8, 8]
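These counterexamples are easy to double-check. A small sketch (the helper name sum_and_squares is mine) that computes both checksums for a sequence:

```python
def sum_and_squares(seq):
    """Return (sum of the values, sum of their squares)."""
    values = list(seq)
    return sum(values), sum(x * x for x in values)
```

All three length-7 sequences give (28, 140), and the length-8 sequence gives the same pair as 1..8, so neither checksum nor both together can rule out duplicates.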
Simpler approach (modifying hazzen's idea):
An integer array of length m contains all the numbers from n to n+m-1 exactly once iff
every array element is between n and n+m-1
there are no duplicates
(Reason: there are only m values in the given integer range, so if the array contains m unique values in this range, it must contain every one of them once)
If you are allowed to modify the array, you can check both in one pass through the list with a modified version of hazzen's algorithm idea (there is no need to do any summation):
For all array indexes i from 0 to m-1 do
If array[i] < n or array[i] >= n+m => RETURN FALSE ("value out of range found")
Calculate j = array[i] - n (this is the 0-based position of array[i] in a sorted array with values from n to n+m-1)
While j is not equal to i
If array[i] is equal to array[j] => RETURN FALSE ("duplicate found")
Swap array[i] with array[j]
Recalculate j = array[i] - n
RETURN TRUE
I'm not sure if the modification of the original array counts against the maximum allowed additional space of O(1), but if it doesn't this should be the solution the original poster wanted.
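A minimal Python sketch of the steps above, assuming the caller is happy to have the list reordered. The function name is hypothetical. One detail is my own addition and not in the pseudocode: the value swapped into position i is range-checked immediately, because it came from a position that has not been visited yet and could otherwise produce an invalid j.

```python
def is_consecutive_range(values, n):
    """True if `values` holds n..n+m-1 exactly once each (m = len(values)).

    Reorders `values` in place; uses O(1) extra space.
    """
    m = len(values)
    for i in range(m):
        if not n <= values[i] < n + m:
            return False  # value out of range
        j = values[i] - n  # where values[i] belongs in sorted order
        while j != i:
            if values[i] == values[j]:
                return False  # duplicate found
            values[i], values[j] = values[j], values[i]
            if not n <= values[i] < n + m:
                return False  # the swapped-in value is out of range
            j = values[i] - n
    return True
```

Each swap puts one value into its final slot, so the total number of swaps over the whole run is at most m.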
By working with a[i] % a.length instead of a[i] you reduce the problem to needing to determine that you've got the numbers 0 to a.length - 1.
We take this observation for granted and try to check if the array contains [0,m).
Find the first node that's not in its correct position, e.g.
0 1 2 3 7 5 6 8 4 ; the original dataset (after the renaming we discussed)
^
`---this is position 4 and the 7 shouldn't be here
Swap that number into where it should be. i.e. swap the 7 with the 8:
0 1 2 3 8 5 6 7 4 ;
| `--------- 7 is in the right place.
`--------------- this is now the 'current' position
Now we repeat this. Looking again at our current position we ask:
"is this the correct number for here?"
If not, we swap it into its correct place.
If it is in the right place, we move right and do this again.
Following this rule again, we get:
0 1 2 3 4 5 6 7 8 ; 4 and 8 were just swapped
This will gradually build up the list correctly from left to right, and each number will be moved at most once, and hence this is O(n).
If there are dupes, we'll notice it as soon as there is an attempt to swap a number backwards in the list.
Why do the other solutions use a summation of every value? I think this is risky, because when you add together O(n) items into one number, you're technically using more than O(1) space.
Simpler method:
Step 1, figure out if there are any duplicates. I'm not sure if this is possible in O(1) space. Anyway, return false if there are duplicates.
Step 2, iterate through the list, keep track of the lowest and highest items.
Step 3, Does (highest - lowest) equal m - 1? If so, return true.
Any one-pass algorithm requires Omega(n) bits of storage.
Suppose to the contrary that there exists a one-pass algorithm that uses o(n) bits. Because it makes only one pass, it must summarize the first n/2 values in o(n) space. Since there are C(n,n/2) = 2^Theta(n) possible sets of n/2 values drawn from S = {1,...,n}, there exist two distinct sets A and B of n/2 values such that the state of memory is the same after both. If A' = S \ A is the "correct" set of values to complement A, then the algorithm cannot possibly answer correctly for the inputs
A A' - yes
B A' - no
since it cannot distinguish the first case from the second.
Q.E.D.
Vote me down if I'm wrong, but I think we can determine if there are duplicates or not using variance. Because we know the mean beforehand (n + (m-1)/2 or something like that) we can just sum up the numbers and square of difference to mean to see if the sum matches the equation (mn + m(m-1)/2) and the variance is (0 + 1 + 4 + ... + (m-1)^2)/m. If the variance doesn't match, it's likely we have a duplicate.
EDIT: variance is supposed to be (0 + 1 + 4 + ... + [(m-1)/2]^2)*2/m, because half of the elements are less than the mean and the other half is greater than the mean.
If there is a duplicate, a term in the above equation will differ from the correct sequence, even if another duplicate completely cancels out the change in mean. So the function returns true only if both the sum and the variance match the desired values, which we can compute beforehand.
Here's a working solution in O(n)
This is using the pseudocode suggested by Hazzen plus some of my own ideas. It works for negative numbers as well and doesn't require any sum-of-the-squares stuff.
function testArray($nums, $n, $m) {
// check the sum. PHP offers this array_sum() method, but it's
// trivial to write your own. O(n) here.
if (array_sum($nums) != ($m * ($m + 2 * $n - 1) / 2)) {
return false; // checksum failed.
}
for ($i = 0; $i < $m; ++$i) {
// check if the number is in the proper range
if ($nums[$i] < $n || $nums[$i] >= $n + $m) {
return false; // value out of range.
}
while (($shouldBe = $nums[$i] - $n) != $i) {
if ($nums[$shouldBe] == $nums[$i]) {
return false; // duplicate
}
$temp = $nums[$i];
$nums[$i] = $nums[$shouldBe];
$nums[$shouldBe] = $temp;
}
}
return true; // huzzah!
}
var_dump(testArray(array(1, 2, 3, 4, 5), 1, 5)); // true
var_dump(testArray(array(5, 4, 3, 2, 1), 1, 5)); // true
var_dump(testArray(array(6, 4, 3, 2, 0), 1, 5)); // false - out of range
var_dump(testArray(array(5, 5, 3, 2, 1), 1, 5)); // false - checksum fail
var_dump(testArray(array(5, 4, 3, 2, 5), 1, 5)); // false - dupe
var_dump(testArray(array(-2, -1, 0, 1, 2), -2, 5)); // true
Awhile back I heard about a very clever sorting algorithm from someone who worked for the phone company. They had to sort a massive number of phone numbers. After going through a bunch of different sort strategies, they finally hit on a very elegant solution: they just created a bit array and treated the offset into the bit array as the phone number. They then swept through their database with a single pass, changing the bit for each number to 1. After that, they swept through the bit array once, spitting out the phone numbers for entries that had the bit set high.
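For anyone who hasn't seen the trick, here is a rough Python sketch of that bit-array sort. The name bitmap_sort and the universe_size parameter are my own; it assumes non-negative integers below universe_size and silently collapses duplicates, as any bit per value scheme does.

```python
def bitmap_sort(numbers, universe_size):
    """Sort distinct non-negative ints below universe_size using one bit per value."""
    bits = bytearray((universe_size + 7) // 8)
    for x in numbers:
        bits[x >> 3] |= 1 << (x & 7)  # first sweep: set the bit for each number
    # second sweep: emit every value whose bit is set, in increasing order
    return [x for x in range(universe_size) if bits[x >> 3] & (1 << (x & 7))]
```

For instance, bitmap_sort([5, 1, 9, 3, 1], 10) yields [1, 3, 5, 9].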
Along those lines, I believe that you can use the data in the array itself as a meta data structure to look for duplicates. Worst case, you could have a separate array, but I'm pretty sure you can use the input array if you don't mind a bit of swapping.
I'm going to leave out the n parameter for time being, b/c that just confuses things - adding in an index offset is pretty easy to do.
Consider:
for i = 0 to m
if (a[a[i]]==a[i]) return false; // we have a duplicate
while (a[a[i]] > a[i]) swapArrayIndexes(a[i], i)
sum = sum + a[i]
next
if sum = (n+m-1)*m return true else return false
This isn't O(n) - probably closer to O(n Log n) - but it does provide for constant space and may provide a different vector of attack for the problem.
If we want O(n), then using an array of bytes and some bit operations will provide the duplication check with an extra n/32 bytes of memory used (assuming 32 bit ints, of course).
EDIT: The above algorithm could be improved further by adding the sum check to the inside of the loop, and check for:
if sum > (n+m-1)*m return false
that way it will fail fast.
Assuming you know only the length of the array and you are allowed to modify the array it can be done in O(1) space and O(n) time.
The process has two straightforward steps.
1. "modulo sort" the array. [5,3,2,4] => [4,5,2,3] (O(2n))
2. Check that each value's neighbor is one higher than itself (modulo) (O(n))
All told you need at most 3 passes through the array.
The modulo sort is the 'tricky' part, but the objective is simple. Take each value in the array and store it at its own address (modulo length). This requires one pass through the array, looping over each location 'evicting' its value by swapping it to its correct location and moving in the value at its destination. If you ever move in a value which is congruent to the value you just evicted, you have a duplicate and can exit early.
Worst case, it's O(2n).
The check is a single pass through the array examining each value with its next highest neighbor. Always O(n).
Combined algorithm is O(n)+O(2n) = O(3n) = O(n)
Pseudocode from my solution:
foreach(values[])
while(values[i] not congruent to i)
to-be-evicted = values[i]
evict(values[i]) // swap to its 'proper' location
if(values[i]%length == to-be-evicted%length)
return false; // a 'duplicate' arrived when we evicted that number
end while
end foreach
foreach(values[])
if((values[i]+1)%length != values[i+1]%length)
return false
end foreach
I've included the java code proof of concept below, it's not pretty, but it passes all the unit tests I made for it. I call these a 'StraightArray' because they correspond to the poker hand of a straight (contiguous sequence ignoring suit).
public class StraightArray {
static int evict(int[] a, int i) {
int t = a[i];
a[i] = a[t%a.length];
a[t%a.length] = t;
return t;
}
static boolean isStraight(int[] values) {
for(int i = 0; i < values.length; i++) {
while(values[i]%values.length != i) {
int evicted = evict(values, i);
if(evicted%values.length == values[i]%values.length) {
return false;
}
}
}
for(int i = 0; i < values.length-1; i++) {
int n = (values[i]%values.length)+1;
int m = values[(i+1)]%values.length;
if(n != m) {
return false;
}
}
return true;
}
}
Hazzen's algorithm implementation in C
#include<stdio.h>
#define swapxor(a,i,j) a[i]^=a[j];a[j]^=a[i];a[i]^=a[j];
int check_ntom(int a[], int n, int m) {
int i = 0, j = 0;
for(i = 0; i < m; i++) {
if(a[i] < n || a[i] >= n+m) return 0; //invalid entry
j = a[i] - n;
while(j != i) {
if(a[i]==a[j]) return -1; //bucket already occupied. Dupe.
swapxor(a, i, j); //faster bitwise swap
j = a[i] - n;
if(a[i] < n || a[i] >= n+m) return 0; //[NEW] invalid entry (checked on both sides so j is never out of bounds)
}
}
return 200; //OK
}
int main() {
int n=5, m=5;
int a[] = {6, 5, 7, 9, 8};
int r = check_ntom(a, n, m);
printf("%d", r);
return 0;
}
Edit: change made to the code to eliminate illegal memory access.
bool determineContinuousArray(int *arr, int len)
{
// Suppose the array is like below:
//int arr[10] = {7,11,14,9,8,100,12,5,13,6};
//int len = sizeof(arr)/sizeof(int);
int n = arr[0];
int *result = new int[len];
for(int i=0; i< len; i++)
result[i] = -1;
for (int i=0; i < len; i++)
{
int cur = arr[i];
int hold ;
if ( arr[i] < n){
n = arr[i];
}
while(true){
if ( cur - n >= len){
cout << "array index out of range: meaning this is not a valid array" << endl;
return false;
}
else if ( result[cur - n] != cur){
hold = result[cur - n];
result[cur - n] = cur;
if (hold == -1) break;
cur = hold;
}else{
cout << "found duplicate number " << cur << endl;
return false;
}
}
}
cout << "this is a valid array" << endl;
for(int j=0 ; j< len; j++)
cout << result[j] << "," ;
cout << endl;
return true;
}
def test(a, n, m):
seen = [False] * m
for x in a:
if x < n or x >= n+m:
return False
if seen[x-n]:
return False
seen[x-n] = True
return False not in seen
print(test([2, 3, 1], 1, 3))
print(test([1, 3, 1], 1, 3))
print(test([1, 2, 4], 1, 3))
Note that this only makes one pass through the first array, not considering the linear search involved in not in. :)
I also could have used a python set, but I opted for the straightforward solution where the performance characteristics of set need not be considered.
Update: Smashery pointed out that I had misparsed "constant amount of memory" and this solution doesn't actually solve the problem.
If you want to know the sum of the numbers [n ... n + m - 1] just use this equation.
var sum = m * (m + 2 * n - 1) / 2;
That works for any number, positive or negative, even if n is a decimal.
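A quick sanity check of the formula, as a sketch restricted to integer n (the name range_sum is mine). With integers, m * (m + 2n - 1) is always even, so integer division is exact:

```python
def range_sum(n, m):
    """Sum of the m consecutive integers n, n+1, ..., n+m-1."""
    return m * (m + 2 * n - 1) // 2
```

It agrees with sum(range(n, n + m)) for positive and negative n alike, e.g. range_sum(-2, 5) is 0 and range_sum(1, 5) is 15.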
Why do the other solutions use a summation of every value? I think this is risky, because when you add together O(n) items into one number, you're technically using more than O(1) space.
O(1) indicates constant space which does not change by the number of n. It does not matter if it is 1 or 2 variables as long as it is a constant number. Why are you saying it is more than O(1) space? If you are calculating the sum of n numbers by accumulating it in a temporary variable, you would be using exactly 1 variable anyway.
Commenting in an answer because the system does not allow me to write comments yet.
Update (in reply to comments): in this answer I meant O(1) space wherever "space" or "time" was omitted. The quoted text is part of an earlier answer, to which this is a reply.
Given this -
Write a method that takes an int array of size m ...
I suppose it is fair to conclude there is an upper limit for m, equal to the value of the largest int (2^32 being typical). In other words, even though m is not specified as an int, the fact that the array can't have duplicates implies there can't be more than the number of values you can form out of 32 bits, which in turn implies m is limited to be an int also.
If such a conclusion is acceptable, then I propose to use a fixed space of (2^33 + 2) * 4 bytes = 34,359,738,376 bytes = 34.4GB to handle all possible cases. (Not counting the space required by the input array and its loop).
Of course, for optimization, I would first take m into account, and allocate only the actual amount needed, (2m+2) * 4 bytes.
If this is acceptable for the O(1) space constraint - for the stated problem - then let me proceed to an algorithmic proposal... :)
Assumptions: array of m ints, positive or negative, none greater than what 4 bytes can hold. Duplicates are handled. First value can be any valid int. Restrict m as above.
First, create an int array of length 2m-1, ary, and provide three int variables: left, diff, and right. Notice that makes 2m+2...
Second, take the first value from the input array and copy it to position m-1 in the new array. Initialize the three variables.
set ary[m-1] = nthVal // n=0
set left = diff = right = 0
Third, loop through the remaining values in the input array and do the following for each iteration:
set diff = nthVal - ary[m-1]
if (diff > m-1 + right || diff < 1-m + left) return false // out of bounds
if (ary[m-1+diff] != null) return false // duplicate
set ary[m-1+diff] = nthVal
if (diff>left) left = diff // constrains left bound further right
if (diff<right) right = diff // constrains right bound further left
I decided to put this in code, and it worked.
Here is a working sample using C#:
public class Program
{
static bool puzzle(int[] inAry)
{
var m = inAry.Count();
var outAry = new int?[2 * m - 1];
int diff = 0;
int left = 0;
int right = 0;
outAry[m - 1] = inAry[0];
for (var i = 1; i < m; i += 1)
{
diff = inAry[i] - inAry[0];
if (diff > m - 1 + right || diff < 1 - m + left) return false;
if (outAry[m - 1 + diff] != null) return false;
outAry[m - 1 + diff] = inAry[i];
if (diff > left) left = diff;
if (diff < right) right = diff;
}
return true;
}
static void Main(string[] args)
{
var inAry = new int[3]{ 2, 3, 4 };
Console.WriteLine(puzzle(inAry));
inAry = new int[13] { -3, 5, -1, -2, 9, 8, 2, 3, 0, 6, 4, 7, 1 };
Console.WriteLine(puzzle(inAry));
inAry = new int[3] { 21, 31, 41 };
Console.WriteLine(puzzle(inAry));
Console.ReadLine();
}
}
note: this comment is based on the original text of the question (it has been corrected since)
If the question is posed exactly as written above (and it is not just a typo) and for array of size n the function should return (True/False) if the array consists of the numbers 1...n+1,
... then the answer will always be false because the array with all the numbers 1...n+1 will be of size n+1 and not n. hence the question can be answered in O(1). :)
Counter-example for XOR algorithm.
(can't post it as a comment)
#popopome
For a = {0, 2, 7, 5} it returns true (meaning that a is a permutation of the range [0, 4)), but it must return false in this case (a is obviously not a permutation of [0, 4)).
Another counter example: {0, 0, 1, 3, 5, 6, 6} -- all values are in range but there are duplicates.
I could incorrectly implement popopome's idea (or tests), therefore here is the code:
bool isperm_popopome(int m; int a[m], int m, int n)
{
/** O(m) in time (single pass), O(1) in space,
no restrictions on n,
no overflow,
a[] may be readonly
*/
int even_xor = 0;
int odd_xor = 0;
for (int i = 0; i < m; ++i)
{
if (a[i] % 2 == 0) // is even
even_xor ^= a[i];
else
odd_xor ^= a[i];
const int b = i + n;
if (b % 2 == 0) // is even
even_xor ^= b;
else
odd_xor ^= b;
}
return (even_xor == 0) && (odd_xor == 0);
}
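For readers without a C compiler at hand, the same check can be reproduced with a direct Python port (the name isperm_xor is mine, and m is taken from the list length instead of being passed in). It is only meant to confirm the two counter-examples above.

```python
def isperm_xor(a, n):
    """Port of isperm_popopome: XOR even and odd values against the expected range."""
    even_xor = 0
    odd_xor = 0
    for i, value in enumerate(a):
        if value % 2 == 0:
            even_xor ^= value
        else:
            odd_xor ^= value
        b = i + n  # the value that a sorted permutation would hold at index i
        if b % 2 == 0:
            even_xor ^= b
        else:
            odd_xor ^= b
    return even_xor == 0 and odd_xor == 0
```

Both isperm_xor([0, 2, 7, 5], 0) and isperm_xor([0, 0, 1, 3, 5, 6, 6], 0) return True, even though neither list is a permutation of its range.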
A C version of b3's pseudo-code
(to avoid misinterpretation of the pseudo-code)
Counter example: {1, 1, 2, 4, 6, 7, 7}.
int pow_minus_one(int power)
{
return (power % 2 == 0) ? 1 : -1;
}
int ceil_half(int n)
{
return n / 2 + (n % 2);
}
bool isperm_b3_3(int m; int a[m], int m, int n)
{
/**
O(m) in time (single pass), O(1) in space,
doesn't use n
possible overflow in sum
a[] may be readonly
*/
int altsum = 0;
int mina = INT_MAX;
int maxa = INT_MIN;
for (int i = 0; i < m; ++i)
{
const int v = a[i] - n + 1; // [n, n+m-1] -> [1, m] to deal with n=0
if (mina > v)
mina = v;
if (maxa < v)
maxa = v;
altsum += pow_minus_one(v) * v;
}
return ((maxa-mina == m-1)
and ((pow_minus_one(mina + m-1) * ceil_half(mina + m-1)
- pow_minus_one(mina-1) * ceil_half(mina-1)) == altsum));
}
In Python:
def ispermutation(iterable, m, n):
"""Whether iterable and the range [n, n+m) have the same elements.
pre-condition: there are no duplicates in the iterable
"""
for i, elem in enumerate(iterable):
if not n <= elem < n+m:
return False
return i == m-1
print(ispermutation([1, 42], 2, 1) == False)
print(ispermutation(range(10), 10, 0) == True)
print(ispermutation((2, 1, 3), 3, 1) == True)
print(ispermutation((2, 1, 3), 3, 0) == False)
print(ispermutation((2, 1, 3), 4, 1) == False)
print(ispermutation((2, 1, 3), 2, 1) == False)
It is O(m) in time and O(1) in space. It does not take into account duplicates.
Alternate solution:
def ispermutation(iterable, m, n):
"""Same as above.
pre-condition: assert(len(list(iterable)) == m)
"""
return all(n <= elem < n+m for elem in iterable)
MY CURRENT BEST OPTION
def uniqueSet( array )
check_index = 0;
check_value = 0;
min = array[0];
array.each_with_index{ |value,index|
check_index = check_index ^ ( 1 << index );
check_value = check_value ^ ( 1 << value );
min = value if value < min
}
check_index = check_index << min;
return check_index == check_value;
end
O(n) and Space O(1)
I wrote a script to brute force combinations that could fail that and it didn't find any.
If you have an array which contravenes this function do tell. :)
#J.F. Sebastian
It's not a true hashing algorithm. Technically, it's a highly efficient packed boolean array of "seen" values.
ci = 0, cv = 0
[5,4,3]{
i = 0
v = 5
1 << 0 == 000001
1 << 5 == 100000
0 ^ 000001 = 000001
0 ^ 100000 = 100000
i = 1
v = 4
1 << 1 == 000010
1 << 4 == 010000
000001 ^ 000010 = 000011
100000 ^ 010000 = 110000
i = 2
v = 3
1 << 2 == 000100
1 << 3 == 001000
000011 ^ 000100 = 000111
110000 ^ 001000 = 111000
}
min = 3
000111 << 3 == 111000
111000 === 111000
The point of this being mostly that in order to "fake" most the problem cases one uses duplicates to do so. In this system, XOR penalises you for using the same value twice and assumes you instead did it 0 times.
The caveats here being of course:
both input array length and maximum array value is limited by the maximum value for $x in ( 1 << $x > 0 )
ultimate effectiveness depends on how your underlying system implements the abilities to:
shift 1 bit n places left.
xor 2 registers. ( where 'registers' may, depending on implementation, span several registers )
edit
Noted, above statements seem confusing. Assuming a perfect machine, where an "integer" is a register with Infinite precision, which can still perform a ^ b in O(1) time.
But failing these assumptions, one has to start asking the algorithmic complexity of simple math.
How complex is 1 == 1 ?, surely that should be O(1) every time right?.
What about 2^32 == 2^32 .
O(1)? 2^33 == 2^33? Now you've got a question of register size and the underlying implementation.
Fortunately XOR and == can be done in parallel, so if one assumes infinite precision and a machine designed to cope with infinite precision, it is safe to assume XOR and == take constant time regardless of their value (because the register has infinite width, it has infinite 0 padding). Obviously this doesn't exist. But also, changing 000000 to 000100 is not increasing memory usage.
Yet on some machines , ( 1 << 32 ) << 1 will consume more memory, but how much is uncertain.
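The walkthrough above can be replayed in Python, whose integers grow as needed much like Ruby's. This is a sketch of the same idea under that assumption (the name unique_set is a hypothetical port of uniqueSet). It assumes non-negative values, since Python raises ValueError on a negative shift count, and the masks need as many bits as the largest value, which is the space caveat discussed above.

```python
def unique_set(array):
    """Port of the Ruby uniqueSet: compare a bitmask of indexes with a bitmask of values."""
    check_index = 0
    check_value = 0
    lowest = array[0]
    for index, value in enumerate(array):
        check_index ^= 1 << index
        check_value ^= 1 << value  # a repeated value toggles its bit back off
        lowest = min(lowest, value)
    # Shift the index mask so that it lines up with the smallest value.
    return (check_index << lowest) == check_value
```

unique_set([5, 4, 3]) reproduces the trace: both masks end up as 111000 in binary.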
A C version of Kent Fredric's Ruby solution
(to facilitate testing)
Counter-example (for C version): {8, 33, 27, 30, 9, 2, 35, 7, 26, 32, 2, 23, 0, 13, 1, 6, 31, 3, 28, 4, 5, 18, 12, 2, 9, 14, 17, 21, 19, 22, 15, 20, 24, 11, 10, 16, 25}. Here n=0, m=35. This sequence misses 34 and has two 2.
It is an O(m) in time and O(1) in space solution.
Out-of-range values are easily detected in O(n) in time and O(1) in space, therefore tests are concentrated on in-range (means all values are in the valid range [n, n+m)) sequences. Otherwise {1, 34} is a counter example (for C version, sizeof(int)==4, standard binary representation of numbers).
The main difference between C and Ruby version:
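in C, int has a finite width, so shifting by the width or more is undefined behavior (on x86 the shift count is typically taken modulo 32, which makes it look like a rotation),
but in Ruby numbers will grow to accommodate the result e.g.,
Ruby: 1 << 100 # -> 1267650600228229401496703205376
C: int n = 100; 1 << n // -> 16 (on x86 with 32-bit int; formally undefined)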
In Ruby: check_index ^= 1 << i; is equivalent to check_index.setbit(i). The same effect could be implemented in C++: vector<bool> v(m); v[i] = true;
bool isperm_fredric(int m; int a[m], int m, int n)
{
/**
O(m) in time (single pass), O(1) in space,
no restriction on n,
?overflow?
a[] may be readonly
*/
int check_index = 0;
int check_value = 0;
int min = a[0];
for (int i = 0; i < m; ++i) {
check_index ^= 1 << i;
check_value ^= 1 << (a[i] - n); //
if (a[i] < min)
min = a[i];
}
check_index <<= min - n; // min and n may differ e.g.,
// {1, 1}: min=1, but n may be 0.
return check_index == check_value;
}
Values of the above function were tested against the following code:
bool *seen_isperm_trusted = NULL;
bool isperm_trusted(int m; int a[m], int m, int n)
{
/** O(m) in time, O(m) in space */
for (int i = 0; i < m; ++i) // could be memset(s_i_t, 0, m*sizeof(*s_i_t));
seen_isperm_trusted[i] = false;
for (int i = 0; i < m; ++i) {
if (a[i] < n or a[i] >= n + m)
return false; // out of range
if (seen_isperm_trusted[a[i]-n])
return false; // duplicates
else
seen_isperm_trusted[a[i]-n] = true;
}
return true; // a[] is a permutation of the range: [n, n+m)
}
Input arrays are generated with:
void backtrack(int m; int a[m], int m, int nitems)
{
/** generate all permutations with repetition for the range [0, m) */
if (nitems == m) {
(void)test_array(a, nitems, 0); // {0, 0}, {0, 1}, {1, 0}, {1, 1}
}
else for (int i = 0; i < m; ++i) {
a[nitems] = i;
backtrack(a, m, nitems + 1);
}
}
The answer from "nickf" does not work if the array is unsorted
var_dump(testArray(array(5, 3, 1, 2, 4), 1, 5)); //gives "duplicates" !!!!
Also your formula to compute sum([n...n+m-1]) looks incorrect....
the correct formula is (m(m+1)/2 - n(n-1)/2)
An array contains N numbers, and you want to determine whether two of the
numbers sum to a given number K. For instance, if the input is 8,4, 1,6 and K is 10,
the answer is yes (4 and 6). A number may be used twice. Do the following.
a. Give an O(N^2) algorithm to solve this problem.
b. Give an O(N log N) algorithm to solve this problem. (Hint: Sort the items first.
After doing so, you can solve the problem in linear time.)
c. Code both solutions and compare the running times of your algorithms.
4.
Product of m consecutive numbers is divisible by m! [ m factorial ]
so in one pass you can compute the product of the m numbers, also compute m! and see if the product modulo m ! is zero at the end of the pass
I might be missing something but this is what comes to my mind ...
something like this in python
my_list1 = [9,5,8,7,6]
my_list2 = [3,5,4,7]
def consecutive(my_list):
count = 0
prod = fact = 1
for num in my_list:
prod *= num
count +=1
fact *= count
if not prod % fact:
return 1
else:
return 0
print(consecutive(my_list1))
print(consecutive(my_list2))
HotPotato ~$ python m_consecutive.py
1
0
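One thing worth checking before relying on this: divisibility by m! is a necessary condition for m consecutive numbers, but as far as I can tell it is not sufficient. The sketch below restates the function in Python 3 with a boolean result (the name product_test is mine) so a couple of non-consecutive inputs can be tried.

```python
def product_test(my_list):
    """True if the product of the items is divisible by len(my_list) factorial."""
    prod = fact = 1
    for count, num in enumerate(my_list, start=1):
        prod *= num
        fact *= count
    return prod % fact == 0
```

It agrees with the two runs above, but it also accepts [2, 4, 6, 8] (product 384 = 16 * 4!) and [24, 1, 1, 1], neither of which is a run of consecutive numbers. The product also grows well beyond constant space.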
I propose the following:
Choose a finite set of prime numbers P_1,P_2,...,P_K, and compute the occurrences of the elements in the input sequence (minus the minimum) modulo each P_i. The pattern of a valid sequence is known.
For example for a sequence of 17 elements, modulo 2 we must have the profile: [9 8], modulo 3: [6 6 5], modulo 5: [4 4 3 3 3], etc.
Combining the test using several bases we obtain a more and more precise probabilistic test. Since the entries are bounded by the integer size, there exists a finite base providing an exact test. This is similar to probabilistic pseudo primality tests.
S_i is an int array of size P_i, initially filled with 0, i=1..K
M is the length of the input sequence
Mn = INT_MAX
Mx = INT_MIN
for x in the input sequence:
    for i in 1..K: S_i[x % P_i]++   // count occurrences mod P_i
    Mn = min(Mn,x)                  // update min
    Mx = max(Mx,x)                  // and max
if Mx-Mn != M-1: return False       // Check bounds
for i in 1..K:
    // Check profile mod P_i
    Q = M / P_i
    R = M % P_i
    Check S_i[(Mn+j) % P_i] is Q+1 for j=0..R-1 and Q for j=R..P_i-1
    if this test fails, return False
return True
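The pseudocode above could be sketched in Python roughly as follows. The function name `profile_check` and the default list of primes are assumptions made for this example, and with a fixed small set of primes the test remains probabilistic, as described above.

```python
def profile_check(values, primes=(2, 3, 5, 7, 11, 13)):
    """Residue-profile test: can reject a non-contiguous input, not prove one."""
    m = len(values)
    if m == 0:
        return True
    counts = [[0] * p for p in primes]  # S_i in the pseudocode
    lo = hi = values[0]
    for x in values:
        for c, p in zip(counts, primes):
            c[x % p] += 1  # Python's % is non-negative for positive p
        lo = min(lo, x)
        hi = max(hi, x)
    if hi - lo != m - 1:  # check bounds
        return False
    for c, p in zip(counts, primes):
        q, r = divmod(m, p)
        for j in range(p):
            # The first r residues (starting at lo) occur q + 1 times, the rest q times.
            expected = q + 1 if j < r else q
            if c[(lo + j) % p] != expected:
                return False
    return True


print(profile_check([20, 21, 22, 23, 24]))
print(profile_check([1, 2, 2, 4, 5]))
```

The counters take space proportional to the sum of the chosen primes, which is independent of the input length.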
Any contiguous array [ n, n+1, ..., n+m-1 ] can be mapped onto a 'base' interval [ 0, 1, ..., m-1 ] using the modulo operator. For each i in the array, there is exactly one i%m in the base interval and vice versa.
Any contiguous array also has a 'span' (maximum - minimum + 1) equal to its size m.
Using these facts, you can create an "encountered" boolean array of the same size, initially all false, and while visiting the input array, set the "encountered" entry corresponding to each value to true.
This algorithm is O(n) in space, O(n) in time, and checks for duplicates.
def contiguous( values )
  #initialization
  encountered = Array.new( values.size, false )
  min, max = nil, nil
  visited = 0
  values.each do |v|
    index = v % encountered.size
    if( encountered[ index ] )
      return "duplicates";
    end
    encountered[ index ] = true
    min = v if min == nil or v < min
    max = v if max == nil or v > max
    visited += 1
  end
  if ( max - min + 1 != values.size ) or visited != values.size
    return "hole"
  else
    return "contiguous"
  end
end
tests = [
  [ false, [ 2,4,5,6 ] ],
  [ false, [ 10,11,13,14 ] ] ,
  [ true , [ 20,21,22,23 ] ] ,
  [ true , [ 19,20,21,22,23 ] ] ,
  [ true , [ 20,21,22,23,24 ] ] ,
  [ false, [ 20,21,22,23,24+5 ] ] ,
  [ false, [ 2,2,3,4,5 ] ]
]
tests.each do |t|
  result = contiguous( t[1] )
  if( t[0] != ( result == "contiguous" ) )
    puts "Failed Test : " + t[1].to_s + " returned " + result
  end
end
I like Greg Hewgill's idea of Radix sorting. To find duplicates, you can sort in O(N) time given the constraints on the values in this array.
For an in-place solution in O(1) space and O(N) time that restores the original ordering of the list, you don't have to do an actual swap on each number; you can just mark it with a flag:
// Java: assumes all numbers in arr > 0 (the sign bit is used as the flag)
boolean checkArrayConsecutiveRange(int[] arr) {
    // find min/max
    int min = arr[0]; int max = arr[0];
    for (int i = 1; i < arr.length; i++) {
        min = (arr[i] < min ? arr[i] : min);
        max = (arr[i] > max ? arr[i] : max);
    }
    // a consecutive range of arr.length values spans exactly arr.length - 1
    if (max - min != arr.length - 1) return false;
    // flag and check
    boolean ret = true;
    for (int i = 0; i < arr.length; i++) {
        int targetI = Math.abs(arr[i]) - min;
        if (arr[targetI] < 0) {
            // slot already flagged: this value was seen before
            ret = false;
            break;
        }
        arr[targetI] = -arr[targetI];
    }
    // clear the flags to restore the original values
    for (int i = 0; i < arr.length; i++) {
        arr[i] = Math.abs(arr[i]);
    }
    return ret;
}
Storing the flags inside the given array is kind of cheating, and doesn't play well with parallelization. I'm still trying to think of a way to do it without touching the array in O(N) time and O(log N) space. Checking against the sum and against the sum of least squares (arr[i] - arr.length/2.0)^2 feels like it might work. The one defining characteristic we know about a 0...m array with no duplicates is that it's uniformly distributed; we should just check that.
Now if only I could prove it.
I'd like to note that the solution above involving factorial takes O(N) space to store the factorial itself. N! > 2^N (for N >= 4), which takes more than N bits to store.
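To get a feel for how quickly the factorial grows, a small Python sketch can print the number of bits needed to hold N! for a few sizes. The helper name `factorial_bits` is made up here; it relies only on `math.factorial` and `int.bit_length` from the standard library.

```python
import math


def factorial_bits(n):
    """Number of bits needed to represent n! as an unsigned integer."""
    return math.factorial(n).bit_length()


for n in (10, 100, 1000):
    print(n, factorial_bits(n))
```

The bit count grows faster than N itself, so the factorial cannot be held in a fixed-width integer once N is more than a few dozen.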
Oops! I got caught up in a duplicate question and did not see the already identical solutions here. And I thought I'd finally done something original! Here is a historical archive of when I was slightly more pleased:
Well, I have no certainty if this algorithm satisfies all conditions. In fact, I haven't even validated that it works beyond a couple test cases I have tried. Even if my algorithm does have problems, hopefully my approach sparks some solutions.
This algorithm, to my knowledge, works in constant memory and scans the array three times. Perhaps an added bonus is that it works for the full range of integers, if that wasn't part of the original problem.
I am not much of a pseudo-code person, and I really think the code might simply make more sense than words. Here is an implementation I wrote in PHP. Take heed of the comments.
function is_permutation($ints) {
    /* Gather some meta-data. These scans can
       be done simultaneously */
    $lowest = min($ints);
    $length = count($ints);
    $max_index = $length - 1;
    $sort_run_count = 0;
    /* I do not have any proof that running this sort twice
       will always completely sort the array (of course only
       intentionally happening if the array is a permutation) */
    while ($sort_run_count < 2) {
        for ($i = 0; $i < $length; ++$i) {
            $dest_index = $ints[$i] - $lowest;
            if ($i == $dest_index) {
                continue;
            }
            if ($dest_index > $max_index) {
                return false;
            }
            if ($ints[$i] == $ints[$dest_index]) {
                return false;
            }
            $temp = $ints[$dest_index];
            $ints[$dest_index] = $ints[$i];
            $ints[$i] = $temp;
        }
        ++$sort_run_count;
    }
    return true;
}
So there is an algorithm that takes O(n^2) time, does not require modifying the input array, and takes constant space.
First, assume that you know n and m. Finding them is a linear operation, so it does not add any additional complexity. Next, assume there exists one element equal to n and one element equal to n+m-1 and all the rest are in [n, n+m). Given that, we can reduce the problem to having an array with elements in [0, m).
Now, since we know that the elements are bounded by the size of the array, we can treat each element as a node with a single link to another element; in other words, the array describes a directed graph. In this directed graph, if there are no duplicate elements, every node belongs to a cycle, that is, a node is reachable from itself in m or fewer steps. If there is a duplicate element, then there exists one node that is not reachable from itself at all.
So, to detect this, you walk the entire array from start to finish and determine if each element returns to itself in <=m steps. If any element is not reachable in <=m steps, then you have a duplicate and can return false. Otherwise, when you finish visiting all elements, you can return true:
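// C++ (or C with <stdbool.h>); the function name is illustrative.
// Assumes arr has already been reduced to values in [0, m).
bool every_node_on_cycle(const int *arr, int m)
{
    for (int start_index = 0; start_index < m; ++start_index)
    {
        int steps = 1;
        int current_element_index = arr[start_index];
        // Follow the links until we return to the start or exceed m steps.
        while (steps < m + 1 && current_element_index != start_index)
        {
            current_element_index = arr[current_element_index];
            ++steps;
        }
        if (steps > m)
        {
            return false;  // start_index is not reachable from itself
        }
    }
    return true;
}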
You can optimize this by storing additional information:
Record sum of the length of the cycle from each element, unless the cycle visits an element before that element, call it sum_of_steps.
For every element, only step m-sum_of_steps nodes out. If you don't return to the starting element and you don't visit an element before the starting element, you have found a loop containing duplicate elements and can return false.
This is still O(n^2), e.g. {1, 2, 3, 0, 5, 6, 7, 4}, but it's a little bit faster.
ciphwn has it right. It is all to do with statistics. In statistical terms, the question is asking whether or not the sequence of numbers forms a discrete uniform distribution. A discrete uniform distribution is one where all values of a finite set of possible values are equally probable. Fortunately, there are some useful formulas for determining whether a discrete set is uniform. First, for the set (a..b) of n values, the mean is (a+b)/2 and the variance is (n^2 - 1)/12. Next, determine the variance of the given set:
variance = (1/n) * sum [i=1..n] (f(i) - mean)^2
and then compare it with the expected variance. This requires two passes over the data: one to determine the mean and another to calculate the variance.
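As a rough Python sketch of this check (the function name `looks_uniform` is invented for the example): it compares the mean and variance against the expected values using integer arithmetic, with both sides of each comparison multiplied out to avoid fractions. One caveat worth stating loudly: matching mean and variance is a necessary condition only, and the last call below is a hand-checked input with duplicates that still passes.

```python
def looks_uniform(values):
    """Compare mean and variance with those of a..b (necessary, not sufficient)."""
    n = len(values)
    if n == 0:
        return True
    lo, hi = min(values), max(values)
    if hi - lo != n - 1:  # the span must match the length
        return False
    total = sum(values)
    total_sq = sum(v * v for v in values)
    # mean == (a + b) / 2, multiplied through by 2n
    mean_ok = 2 * total == n * (lo + hi)
    # variance == (n^2 - 1) / 12, multiplied through by 12 n^2
    var_ok = 12 * (n * total_sq - total * total) == n * n * (n * n - 1)
    return mean_ok and var_ok


print(looks_uniform([20, 21, 22, 23, 24]))          # contiguous
print(looks_uniform([1, 2, 2, 4, 5]))               # duplicate, mean differs
print(looks_uniform([0, 2, 2, 3, 3, 4, 7, 7, 8]))   # duplicates, yet passes
```

Because of inputs like the last one, a check of this kind would need to be combined with something stronger before it could be trusted as an exact test.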
References:
uniform discrete distribution
variance
Here is a solution in O(N) time and O(1) extra space for finding duplicates :-
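public static boolean check_range(int arr[], int n, int m) {
    // Shift every value into [0, m-1]; anything outside is out of range.
    for (int i = 0; i < m; i++) {
        arr[i] = arr[i] - n;
        if (arr[i] < 0 || arr[i] >= m)
            return(false);
    }
    int j = 0;
    while (j < m) {
        if (arr[j] < m) {                    // slot j is not yet marked
            if (arr[arr[j]] < m) {           // target slot is unoccupied
                int t = arr[arr[j]];
                arr[arr[j]] = arr[j] + m;    // place the value and mark its slot
                arr[j] = t;
                if (j == arr[j]) {
                    arr[j] = arr[j] + m;
                    j++;
                }
            }
            else return(false);              // target occupied: duplicate
        }
        else j++;                            // slot j is already occupied
    }
    return(true);
}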
Explanation:
Bring each number into the range [0, m-1] with arr[i] = arr[i] - n; if a number falls outside that range, return false.
For each j, check whether arr[arr[j]] is unoccupied, that is, whether it holds a value less than m.
If so, swap(arr[j], arr[arr[j]]) and add m to the value just placed in its target slot to signal that the slot is occupied.
If arr[j] == j after the swap, simply add m and increment j.
If arr[arr[j]] >= m, the slot is occupied, so the current value is a duplicate; return false.
If arr[j] >= m, skip it and increment j.

Resources