Find all combinations of strings in javascript array - arrays

How can I loop through an array of string values and output all combinations for N number of string values?
I also need to preserve each combination so that I can loop through those for creating records in a database.
I found a combination generator that helps require a minimum and I modified it as follows:
var combine = function(a, min) {
var fn = function(n, src, got, all) {
if (n == 0) {
if (got.length > 0) {
all[all.length] = got;
}
return;
}
for (var j = 0; j < src.length; j++) {
fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
}
return;
}
var all = [];
for (var i = min; i < a.length; i++) {
fn(i, a, [], all);
}
all.push(a);
return all;
}
var subsets = combine(["RT","PT","DDA"], 1);
This outputs:
RT,PT,DDA,RT,PT,RT,DDA,PT,DDA,RT,PT,DDA
Which technically has all of the correct outputs, but I need to break them up into unique combos so that it outputs like this:
[[RT],[PT],[DDA],[RT,PT],[RT,DDA],[PT,DDA],[RT,PT,DDA]]
Having an array of arrays would allow me to loop through using the index later on to create unique records in the database. That's my thought at least and I could really use some help as I am still trying to fully understand recursive functions.

Note that every combination (including the empty one) of N items corresponds to a binary number K in the range 0..2^N-1. If bit i of K is 1, then the i-th item is included in the K-th combination. For example, k = 5 = 101 in binary corresponds to the combination of the 0th and 2nd items ([RT,DDA] in your case).
So just loop over K = 1..2^N-1, examine the bits of K, and build the corresponding combination. Pseudocode:
for (k = 1; k < (1 << N); k++):
    comb = []
    t = k
    idx = 0
    while t > 0:
        if (t and 1):
            comb = comb + [A[idx]]
        t = t >> 1
        idx++
    append comb to the result list
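The bitmask idea can be sketched in Python as follows. This is only a minimal sketch, not the asker's JavaScript: the function name `all_combinations` is hypothetical, and the final sort is an optional extra that groups the combinations by size as in the desired output.

```python
def all_combinations(items):
    """Return every non-empty combination of items as a list of lists."""
    n = len(items)
    result = []
    for k in range(1, 1 << n):      # k = 1 .. 2^n - 1
        comb = []
        t, idx = k, 0
        while t > 0:
            if t & 1:               # bit idx of k is set: take items[idx]
                comb.append(items[idx])
            t >>= 1                 # shift right to examine the next bit
            idx += 1
        result.append(comb)
    return result


subsets = all_combinations(["RT", "PT", "DDA"])
subsets.sort(key=len)               # optional: group by size (stable sort)
print(subsets)
```

Because the result is a list of lists, each combination can later be addressed by index, which is what the question asks for.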

Related

Max length increasing super array made by joining two increasing subarray

Problem:
Given an array, find two increasing subarrays(say a and b) such that when joined they produce one increasing array(say ab). We need to find max possible length of array ab.
For example:
given array = [2 3 1 2 5 4 5]
Two sub arrays are a = [2 3], b = [4, 5] and ab = [2 3 4 5]
Output: length(ab) = 4
I would solve it using brute force: enumerate all the subarrays and keep the ones that are increasing. Then merge pairs of them and check whether there are overlapping elements. If there are overlapping elements, remove them and store the resulting length.
The time complexity for getting all the subarrays will be O(n^2) (I am assuming a subarray maintains the relative order and that you do not mean all the subsets). I would then sort the subarrays using a queue, ordered by their first element, and check how many can be merged while keeping the increasing property (the same check you use to merge already sorted arrays).
Then count the strictly increasing arrays after merging.
The other two approaches use dynamic programming (this is the same as the longest contiguous increasing subarray): (Look here)
First approach:
public int lengthOfLIS(int[] nums) {
if(nums.length == 0) { return 0; }
int[] dp = new int[nums.length];
int len = 0;
for(int n: nums) {
// Find the position of it in binary tree.
int pos = Arrays.binarySearch(dp, 0, len, n);
// Convert the negative position to positive.
if(pos < 0) { pos = -1*(pos + 1); }
// assign the value to n
dp[pos] = n;
// If the length of the dp grows and becomes equal to the current len
// assign the output length to that.
if(pos == len) {
len++;
}
}
// Return the length.
return len;
}
Another method:
public int lengthOfLIS(int[] nums) {
if(nums == null || nums.length == 0) { return 0; }
int n = nums.length;
Integer lis[] = new Integer[n];
int max = 0;
/* Initialize LIS values for all indexes */
for ( int i = 0; i < n; i++ ) {
lis[i] = 1;
}
/* Compute optimized LIS values in bottom up manner */
for (int i = 1; i < n; i++ ) {
for ( int j = 0; j < i; j++ ) {
if ( nums[i] > nums[j] && lis[i] < lis[j] + 1) {
lis[i] = lis[j] + 1;
}
}
}
max = Collections.max(Arrays.asList(lis));
return max;
}
The idea is brute force: enumerate all the increasing subarrays, then check pairs of them for overlapping elements. If a pair overlaps, calculate the length after merging, compare, and keep the maximum length.

Lowest Maximum Visits in the milestones problem

Hi, I was given a problem with numbers like 1, 50000, 10000, 3, 10001, 10003; assume that they are the milestones that a runner crosses.
So that means he first runs from 1 to 50k, then comes back to 10k, then goes to 3, then to 10001, then to 10003, etc. Now I have to find the lowest milestone among those visited the maximum number of times over his entire journey.
Here is the program I have written. Is there a better version that uses less space than a 50k-element array?
public class LowestMaxVisits {
public static void main(String[] args) {
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(50000);
list.add(10000);
list.add(3);
list.add(10001);
list.add(10003);
int [] arr = new int[50000];
long smillis = System.currentTimeMillis() % 1000;
for(int i=0; i<list.size()-1; i++){
int start = list.get(i)-1;
int end = list.get(i+1)-1;
if(start > end){
int temp = start;
start = end;
end = temp;
}
while(start < end){
arr[start]++;
arr[end]++;
start++;
end--;
}
}
int maxVisits = -1;
int index = -1;
for(int k=0;k<arr.length;k++){
if(arr[k] > maxVisits){
index = k;
maxVisits = arr[k];
}
}
long emillis = System.currentTimeMillis() % 1000;
System.out.println("Time taken---"+ (emillis-smillis));
System.out.println("Here is the highest---"+(index+1));
}
}
Here's the code based on how I explained to solve the problem in my comment:
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
public class LowestMaxVisits {
public static void main(String[] args) {
List<Integer> milestones = new ArrayList<Integer>(){{
add(1);
add(50000);
add(10000);
add(3);
add(10001);
add(10003);
}};
System.out.println(getHighestMilestone(milestones));
}
public static int getHighestMilestone(List<Integer> milestones) {
// Make a sorted milestones array
List<Integer> sortedMilestones = new ArrayList<>(milestones);
int t;
for (int i = 0; i < sortedMilestones.size() - 1; i++) {
for (int j = sortedMilestones.size() - 1; j > i; j--) {
if (sortedMilestones.get(i) > sortedMilestones.get(j)) {
t = sortedMilestones.get(i);
sortedMilestones.set(i, sortedMilestones.get(j));
sortedMilestones.set(j, t);
}
}
}
// Count the amount of times passed for each milestone
t = 0;
int[] mPassed = new int[sortedMilestones.size()];
for (int i = 0; i < milestones.size() - 1; i++) {
if (i == 0 || (milestones.get(i) > sortedMilestones.get(t))) {
for (int j = t + 1; j < sortedMilestones.size(); j++) {
mPassed[j]++;
if (Objects.equals(sortedMilestones.get(j), milestones.get(i))) {
t = j;
break;
}
}
} else {
for (int j = t - 1; j >= 0; j--) {
mPassed[j]++;
if (Objects.equals(sortedMilestones.get(j), milestones.get(i))) {
t = j;
break;
}
}
}
}
// Get the highest count and set it to t
t = 0;
for (int i = 0; i < mPassed.length; i++) {
if (t < mPassed[i]) {
t = mPassed[i];
}
}
// Get lowest milestone with the highest count
for (int i = 0; i < mPassed.length; i++) {
if (mPassed[i] == t) {
t = sortedMilestones.get(i);
break;
}
}
return t;
}
}
You need to sort the copied array in order to iterate through them as if they were "places", back and forth according to the order of stops. You also need it for finding the lowest number milestone with the most visits.
Edit: Instead of making a list as big as the largest number in our array (in your case 50,000), this only needs to make a copy of the milestones array and another integer array of the same size. Note: an int t variable is also declared and reused for multiple purposes throughout getHighestMilestone().
The key observation here is that only the relative order of the milestones matters, not their absolute positions. If we changed the numbers in your example to 0, 5, 2, 1, 3, 4, the problem is still the same, as long as we convert the final output back. If N is the number of milestones in your input, there are at most N distinct positions, so we can map them to the integers from 0 to N - 1. One way we can do this quickly is to insert all the positions as keys into an ordered map structure, which I believe is called TreeMap in Java, and then iterate through the keys in increasing order, assigning the next integer as the value for each key.
Now we can count the number of times each milestone was passed using an array, which only needs to be N elements large. When we move from one milestone to the next we can just increment all the array elements in between. However, we might have to increment almost the entire array every time, and we can make this faster. If f(x) is the number of times the milestone x was passed and a is the array, instead of storing f(x) in a[x] we store f(0) in a[0] and f(x) - f(x - 1) in a[x] for x greater than 0. Now to increment the range [l, r], we only need to increment a[l] and decrement a[r + 1].
After we processed all the milestones this way, we can iterate through the array and accumulate the elements to obtain the number of times each milestone is passed. It's not hard to see that f(x) = a[0] + a[1] + ... + a[x].
Here's the pseudocode:
let mp be an ordered map structure such as a TreeMap which maps integers to integers and stores elements in order of their key
for each milestone ms:
    if ms isn't in mp:
        insert ms as a key into mp paired with an arbitrary value
let index = 0
for each key-value pair pos in mp in increasing order of key:
    set the value of pos to index
    increment index
for each milestone ms:
    set ms to mp[ms]
let a be an array of N + 1 zeros
define incrementRange(l, r):
    increment a[l]
    decrement a[r + 1]
for each milestone ms after the first:
    let prev be the previous milestone
    if ms is greater than prev:
        incrementRange(prev + 1, ms)
    else:
        incrementRange(ms, prev - 1)
for i from 1 to N - 1:
    add a[i - 1] to a[i]
let maxVisited be the minimum index in a of an element with maximum value
output the key in mp with value maxVisited
Time complexity: O(N log N) Space: O(N)
I don't think it gets faster than this.
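As a rough illustration of the compression plus difference-array approach, here is a minimal Python sketch. The function name `lowest_most_visited` is hypothetical, a sorted list with a dict stands in for the ordered map, and two assumptions are made that the answer does not state: the input is non-empty, and the starting milestone counts as one visit.

```python
def lowest_most_visited(milestones):
    """Return the lowest milestone among those passed most often."""
    # Coordinate compression: map each distinct position to 0..m-1.
    positions = sorted(set(milestones))
    rank = {pos: i for i, pos in enumerate(positions)}
    m = len(positions)

    diff = [0] * (m + 1)            # difference array with one spare slot

    def increment_range(lo, hi):
        diff[lo] += 1
        diff[hi + 1] -= 1

    # Assumption: the starting milestone counts as one visit.
    prev = rank[milestones[0]]
    increment_range(prev, prev)
    for ms in milestones[1:]:
        cur = rank[ms]
        if cur > prev:
            increment_range(prev + 1, cur)
        elif cur < prev:
            increment_range(cur, prev - 1)
        prev = cur

    # Prefix sums turn the differences back into visit counts.
    best_idx, best_count, running = 0, 0, 0
    for i in range(m):
        running += diff[i]
        if running > best_count:    # strict '>' keeps the lowest index
            best_idx, best_count = i, running
    return positions[best_idx]


print(lowest_most_visited([1, 50000, 10000, 3, 10001, 10003]))
```

For the example in the question, the compressed route is 0, 5, 2, 1, 3, 4, the visit counts come out as 1, 2, 3, 3, 3, 1, and the lowest index holding the maximum maps back to milestone 10000.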

Most frequent element in an array

Given all numbers are in the range 0 to n-1 where n is the length of the array.
How to solve this in linear time and constant space?
You can reuse given array as a counter for numbers. Just iterate through the array and increment corresponding counter. The only trick is to increment each time by n, not by one:
for (int i = 0; i < n; ++i) {
arr[arr[i]%n] += n;
}
After this loop element arr[i] will be changed to arr[i]+n*count[i], where arr[i]<n. So this way the most frequent element is the one with the greatest value. In order to restore the original value, just take arr[i]%n.
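A small Python sketch of this counting trick follows; the function name `most_frequent_in_place` is hypothetical. Python integers do not overflow, so this only illustrates the logic, and ties are resolved in favour of the smallest value, which is a choice made here rather than something the answer specifies.

```python
def most_frequent_in_place(arr):
    """Most frequent value in arr, where every value is in 0..len(arr)-1."""
    n = len(arr)
    for i in range(n):
        arr[arr[i] % n] += n        # % n recovers the original value at i
    # arr[v] is now original_v + n * count(v), so arr[v] // n is count(v).
    best = max(range(n), key=lambda v: arr[v] // n)
    for i in range(n):
        arr[i] %= n                 # restore the original contents
    return best


data = [1, 2, 2, 0, 2]
print(most_frequent_in_place(data))  # the array is restored afterwards
```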
Here is a function to achieve this. But it is not O(n) as you required, it is O(n^2). Hope this helps.
function getPopularElement(array) {
    var count = 0, tempCount;
    var popular = array[0];
    var temp = 0;
    for (var i = 0; i < array.length; i++) {
        temp = array[i];
        tempCount = 0;
        // count every occurrence of temp, including the one at index 0
        for (var j = 0; j < array.length; j++) {
            if (temp == array[j]) {
                tempCount++;
            }
        }
        if (tempCount > count) {
            popular = temp;
            count = tempCount;
        }
    }
    return popular;
}
Heller's solution is actually similar to this: the idea is to go through the array and for each element increase a counter at that number's position in the array. Generally there's another element at that position (where Heller is keeping the information by counting in step sizes of n) but we can resolve those elements recursively and in-place. This is a linear process done at most once per element, since there can be no longer chains (trying to increase a count at a position, finding a new element) than the length of the whole array (i.e. a single permutation cycle) and once an element is processed, it can be skipped in the main loop making it O(n) overall. The trick is to decrease the counter:
//input: arr
//output: most frequent element, number of occurrences
n <- arr.length
for i = 0..n-1
    val <- arr[i]
    if val < 0
        // this element has already been processed
        continue
    // set counter at i to zero (-1)
    arr[i] <- -1
    // resolve the chain
    do
        idx <- val
        val <- arr[idx]
        if val < 0
            // arrived at a counter, end of chain
            // increase the counter by one (-1)
            arr[idx] <- arr[idx] - 1
            break
        // otherwise continue the chain with val
        // and initialise the counter at idx to one (-2)
        arr[idx] <- -2
// find the most common element
idx <- 0
for i = 1..n-1
    // smaller value means larger counter
    if arr[i] < arr[idx]
        idx <- i
// [the most frequent element, number of occurrences]
output [idx, -(arr[idx] + 1)] // arr[i] = -1 - #occurrences
This solution also deals nicely with very large arrays where the biggest possible counter in Heller's solution (n*n-1) overflows the underlying integer (for 32bit integers that's arrays longer than 65535 elements!)
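The negative-counter pseudocode translates fairly directly into Python. The sketch below uses a hypothetical function name, `most_frequent_negative_counters`, and assumes every value lies in 0..n-1; note that it overwrites the input with the counters.

```python
def most_frequent_negative_counters(arr):
    """Return (most frequent value, occurrences); arr is overwritten."""
    n = len(arr)
    for i in range(n):
        val = arr[i]
        if val < 0:
            continue                # slot i already holds a counter
        arr[i] = -1                 # counter for value i starts at zero
        while True:                 # resolve the chain of displaced values
            idx = val
            val = arr[idx]
            if val < 0:
                arr[idx] -= 1       # reached a counter: count one more
                break
            arr[idx] = -2           # new counter at idx, already at one
    # A smaller stored value means a larger count: arr[v] = -1 - count(v).
    idx = min(range(n), key=lambda v: arr[v])
    return idx, -(arr[idx] + 1)


print(most_frequent_negative_counters([2, 0, 2, 2]))
```

For the input [2, 0, 2, 2] the array ends up as [-2, -1, -4, -1], that is, counts of 1, 0, 3 and 0 for the values 0 to 3.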
Lets suppose the array is as follows:
int arr[] = {10, 20, 10, 20, 30, 20, 20,40,40,50,15,15,15};
int max =0;
int result = 0;
Map<Integer,Integer> map = new HashMap<>();
for (int i = 0; i < arr.length; i++) {
if ( map.containsKey(arr[i]))
map.put(arr[i], map.get(arr[i]) +1);
else
map.put(arr[i], 1);
int key = arr[i]; // the key just updated; keySet().iterator().next() would return an arbitrary key
if (map.get(key) > max) {
max = map.get(key) ;
result = key;
}
}
System.out.println(result);
Explanation:
In the above code I use a HashMap that stores the elements as keys and their occurrence counts as values.
The variable max is initialized to 0 (max is the highest occurrence count seen so far).
While iterating over the elements, we also keep track of the highest count among the keys.
The result variable holds the most frequently repeated key.

Find the number of remove-then-append operations needed to sort a given array

This is an interview question. A swap means removing any element from the array and appending it to the back of the same array. Given an array of integers, find the minimum number of swaps needed to sort the array.
Is there a solution better than O(n^2)?
For example:
Input array: [3124].
The number of swaps: 2 ([3124] -> [1243] -> [1234]).
The problem boils down to finding the longest prefix of the sorted array that appears as a subsequence in the input array. This determines the elements that do not need to be sorted. The remaining elements will need to be deleted one by one, from the smallest to the largest, and appended at the back.
In your example, [3, 1, 2, 4], the already-sorted subsequence is [1, 2]. The optimal solution is to delete the remaining two elements, 3 and 4, and append them at the back. Thus the optimal solution is two "swaps".
Finding the subsequence can be done in O(n log n) time using O(n) extra memory. The following pseudo-code will do it (the code also happens to be valid Python):
l = [1, 2, 4, 3, 99, 98, 7]
s = sorted(l)
si = 0
for item in l:
    if item == s[si]:
        si += 1
print(len(l) - si)
If, as in your example, the array contains a permutation of integers from 1 to n, the problem can be solved in O(n) time using O(1) memory:
l = [1, 2, 3, 5, 4, 6]
s = 1
for item in l:
    if item == s:
        s += 1
print(len(l) - s + 1)
More generally, the second method can be used whenever we know the output array a priori and thus don't need to find it through sorting.
This might work in O(n log n) even if we don't assume an array of consecutive values.
If we do, it can be done in O(n).
One way of doing it is with O(n) space and O(n log n) time.
Given array A, sort it (O(n log n)) into a second array B.
Now... (arrays are indexed from 1)
swaps = 0
b = 1
for a = 1 to len(A)
    if A[a] == B[b]
        b = b + 1
    else
        swaps = swaps + 1
Observation: If an element is swapped to the back, its previous position does not matter. No element needs to be swapped more than once.
Observation: The last swap (if any) must move the largest element.
Observation: Before the swap, the array (excluding the last element) must be sorted (by former swaps, or initially)
Sorting algorithm, assuming the values are consecutive: find the longest sorted subsequence of consecutive (by value) elements starting at 1:
3 1 5 2 4
swap all higher elements in turn:
1 5 2 4 3
1 5 2 3 4
1 2 3 4 5
To find the number of swaps in O(n), find the length of the longest sorted subsequence of consecutive elements starting at 1:
expected = 1
for each element in sequence
    if element == expected
        expected += 1
return expected-1
then the number of swaps = the length of the input - its longest sorted subsequence.
An alternative solution ( O(n^2) ) if the input is not a permutation of 1..n:
swaps = 0
loop
    find the first instance of the largest element and detect if the array is sorted
    if the array is sorted, return swaps.
    else remove the found element from the array and increment swaps.
Yet another solution ( O(n log n) ), assuming unique elements:
wrap each element in {oldPos, newPos, value}
make a shallow copy of the array
sort the array by value
store the new position of each element
run the algorithm for permutations on the newPos' in the (unsorted) copy
If you don't want to copy the input array, sort by oldPos before the last step instead.
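The O(n log n) variant for unique elements might look roughly like this in Python. It is a sketch under the stated assumption of distinct values; the name `min_moves_to_back` is hypothetical, and a plain list of ranks stands in for the {oldPos, newPos, value} wrappers.

```python
def min_moves_to_back(values):
    """Minimum remove-and-append moves to sort values (assumed distinct)."""
    # newPos: rank of each element in sorted order, starting at 1.
    order = sorted(range(len(values)), key=lambda i: values[i])
    new_pos = [0] * len(values)
    for rank, old_pos in enumerate(order, start=1):
        new_pos[old_pos] = rank
    # Permutation scan: how far does the subsequence 1, 2, 3, ... reach?
    expected = 1
    for p in new_pos:
        if p == expected:
            expected += 1
    return len(values) - (expected - 1)


print(min_moves_to_back([3, 1, 2, 4]))
print(min_moves_to_back([7, 8, 9, 1, 2, 5, 11, 18]))
```

The first call reproduces the two moves from the question's example; the input list itself is left unmodified.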
This can be done in O(n log n).
First find the minimum element in the array. Now, find the max element that occurs before this element. Call this max_left. You have to call swap() for all the elements before the min element of the array.
Now, find the longest increasing subsequence to the right of the min element, along with the constraint that you should skip elements whose values are greater than max_left.
The required number of swaps is size(array) - size(LIS).
For example consider the array,
7 8 9 1 2 5 11 18
Minimum element in the array is 1. So we find the max before the minimum element.
7 8 9 | 1 2 5 11 18
max_left = 9
Now, find the LIS to the right of min with elements < 9
LIS = 1,2,5
No of swaps = 8 - 3 = 5
In cases where max element is null, ie., min is the first element, find the LIS of the array and required answer is size(array)-size(LIS)
For Example
2 5 4 3
max_left is null. LIS is 2 3
No of swaps = size(array) - size(LIS) = 4 - 2 = 2
Here is the code in Python for the minimum number of pairwise swaps (a different operation from remove-and-append); it assumes the array is a permutation of 0..n-1:
def find_cycles(array):
    cycles = []
    remaining = set(array)
    while remaining:
        j = i = remaining.pop()
        cycle = [i]
        while True:
            j = array[j]
            if j == i:
                break
            cycle.append(j)
            remaining.remove(j)
        cycles.append(cycle)
    return cycles
def minimum_swaps(seq):
    return sum(len(cycle) - 1 for cycle in find_cycles(seq))
O(1) space and O(N) (~ 2*N) solution, assuming the minimum element is 1 and the array contains all numbers from 1 to N without any duplicate value, where N is the array length.
int minimumSwaps(int[] a) {
int swaps = 0;
int i = 0;
while(i < a.length) {
int position = a[i] - 1;
if(position != i) {
int temp = a[position];
a[position] = a[i];
a[i] = temp;
swaps++;
} else {
i++;
}
}
return swaps;
}
int numSwaps(int arr[], int length) {
bool sorted = false;
int swaps = 0;
while(!sorted) {
int inversions = 0;
int t1pos,t2pos,t3pos,t4pos = 0;
for (int i = 1;i < length; ++i)
{
if(arr[i] < arr[i-1]){
if(inversions){
tie(t3pos,t4pos) = make_tuple(i-1, i);
}
else tie(t1pos, t2pos) = make_tuple(i-1, i);
inversions++;
}
if(inversions == 2)
break;
}
if(!inversions){
sorted = true;
}
else if(inversions == 1) {
swaps++;
int temp = arr[t2pos];
arr[t2pos] = arr[t1pos];
arr[t1pos] = temp;
}
else{
swaps++;
if(arr[t4pos] < arr[t2pos]){
int temp = arr[t1pos];
arr[t1pos] = arr[t4pos];
arr[t4pos] = temp;
}
else{
int temp = arr[t2pos];
arr[t2pos] = arr[t1pos];
arr[t1pos] = temp;
}
}
}
return swaps;
}
This code returns the minimal number of swaps required to sort an array inplace.
For example, A[] = [7,3,4,1] By swapping 1 and 7, we get [1,3,4,7].
similarly B[] = [1,2,6,4,8,7,9]. We first swap 6 with 4, so, B[] -> [1,2,4,6,8,7,9]. Then 7 with 8. So -> [1,2,4,6,7,8,9]
The algorithm runs in O(number of pairs where value at index i < value at index i-1) ~ O(N) .
Writing a very simple JavaScript program to sort an array and find number of swaps:
function findSwaps(){
let arr = [4, 3, 1, 2];
let swap = 0
var n = arr.length
for (let i = 0; i < n; i++) {
for (let j = i + 1; j < n; j++) {
if (arr[i] > arr[j]) {
arr[i] = arr[i] + arr[j];
arr[j] = arr[i] - arr[j];
arr[i] = arr[i] - arr[j]
swap = swap + 1
}
}
}
console.log(arr);
console.log(swap)
}
for(int count = 1; count<=length; count++)
{
tempSwap=0; //it will count swaps per iteration
for(int i=0; i<length-1; i++)
if(a[i]>a[i+1])
{
swap(a[i],a[i+1]);
tempSwap++;
}
if(tempSwap!=0) //check if array is already sorted!
swaps += tempSwap;
else
break;
}
System.out.println(swaps);
This is an O(n) solution that works for any array that is a permutation of 1..n:
static int minimumSwaps(int[] arr) {
int swap=0;
boolean visited[]=new boolean[arr.length];
for(int i=0;i<arr.length;i++){
int j=i,cycle=0;
while(!visited[j]){
visited[j]=true;
j=arr[j]-1;
cycle++;
}
if(cycle!=0)
swap+=cycle-1;
}
return swap;
}
}
def minimumSwaps(arr):
    swaps = 0
    # first sort the given array to determine the correct indexes
    # of its elements
    temp = sorted(arr)
    # compare unsorted array with the sorted one
    for i in range(len(arr)):
        # if ith element in the given array is not at the correct index
        # then swap it with the correct index, since we know the correct
        # index because of sorting.
        if arr[i] != temp[i]:
            swaps += 1
            a = arr[i]
            arr[arr.index(temp[i])] = a
            arr[i] = temp[i]
    return swaps
I think this problem can be solved in O(N) if you notice that an element in the array needs to be removed and appended if:
There is a smaller element to its right, or...
There is a smaller element to its left that needs to be removed and appended.
Then it's just about identifying elements that will need to be removed and appended. Here is the code:
static int minMoves(int arr[], int n) {
if (arr.length == 0) return 0;
boolean[] willBeMoved = new boolean[n]; // keep track of elements to be removed and appended
int min = arr[n - 1]; // keep track of the minimum
for (int i = n - 1; i >= 0; i--) { // traverse the array from the right
if (arr[i] < min) min = arr[i]; // found a new min
else if (arr[i] > min) { // arr[i] has a smaller element to the right, so it will need to be moved at some point
willBeMoved[i] = true;
}
}
int minToBeMoved = -1; // keep track of the minimum element to be removed and appended
int result = 0; // the answer
for (int i = 0; i < n; i++) { // traverse the array from the left
if (minToBeMoved == -1 && !willBeMoved[i]) continue; // find the first element to be moved
if (minToBeMoved == -1) minToBeMoved = i;
if (arr[i] > arr[minToBeMoved]) { // because a smaller value will be moved to the end, arr[i] will also have to be moved at some point
willBeMoved[i] = true;
} else if (arr[i] < arr[minToBeMoved] && willBeMoved[i]) { // keep track of the min value to be moved
minToBeMoved = i;
}
if (willBeMoved[i]) result++; // increment
}
return result;
}
It uses O(N) space.
@all, the accepted solution provided by @Itay karo and @NPE is totally wrong because it doesn't consider future ordering of swapped elements...
It fails for many testcases like:
3 1 2 5 4
correct output: 4
but their codes give output as 3...
explanation: 3 1 2 5 4--->1 2 5 4 3--->1 2 4 3 5--->1 2 3 5 4--->1 2 3 4 5
PS: I can't comment there because of low reputation
Here is my solution in C# for the minimum number of swaps required to sort an array.
At a time we can swap only 2 elements (at any index position).
public class MinimumSwaps2
{
public static void minimumSwapsMain(int[] arr)
{
Dictionary<int, int> dic = new Dictionary<int, int>();
Dictionary<int, int> reverseDIc = new Dictionary<int, int>();
int temp = 0;
int indx = 0;
//find the maximum number from the array
int maxno = FindMaxNo(arr);
if (maxno == arr.Length)
{
for (int i = 1; i <= arr.Length; i++)
{
dic[i] = arr[indx];
reverseDIc.Add(arr[indx], i);
indx++;
}
}
else
{
for (int i = 1; i <= arr.Length; i++)
{
if (arr.Contains(i))
{
dic[i] = arr[indx];
reverseDIc.Add(arr[indx], i);
indx++;
}
}
}
int counter = FindMinSwaps(dic, reverseDIc, maxno);
}
static int FindMaxNo(int[] arr)
{
int maxNO = 0;
for (int i = 0; i < arr.Length; i++)
{
if (maxNO < arr[i])
{
maxNO = arr[i];
}
}
return maxNO;
}
static int FindMinSwaps(Dictionary<int, int> dic, Dictionary<int, int> reverseDIc, int maxno)
{
int counter = 0;
int temp = 0;
for (int i = 1; i <= maxno; i++)
{
if (dic.ContainsKey(i))
{
if (dic[i] != i)
{
counter++;
var myKey1 = reverseDIc[i];
temp = dic[i];
dic[i] = dic[myKey1];
dic[myKey1] = temp;
reverseDIc[temp] = reverseDIc[i];
reverseDIc[i] = i;
}
}
}
return counter;
}
}
int temp = 0, swaps = 0;
for (int i = 0; i < arr.length;) {
if (arr[i] != i + 1){
// System.out.println("Swapping --"+arr[arr[i] - 1] +" AND -- "+arr[i]);
temp = arr[arr[i] - 1];
arr[arr[i] - 1] = arr[i];
arr[i] = temp;
++swaps;
} else
++i;
// System.out.println("value at position -- "+ i +" is set to -- "+ arr[i]);
}
return swaps;
This is the most optimized answer I have found. It is so simple that you will probably understand it in one look through the loop. Thanks to Darryl at HackerRank.

How to find a duplicate element in an array of shuffled consecutive integers?

I recently came across a question somewhere:
Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?
What I am interested in to know is the second part, i.e., without using auxiliary storage. Do you have any idea?
Just add them all up, and subtract from that the total you would expect if each of the numbers 1 to 1000 appeared exactly once.
Eg:
Input: 1,2,3,2,4 => 12
Expected: 1,2,3,4 => 10
Input - Expected => 2
Update 2: Some people think that using XOR to find the duplicate number is a hack or trick. To which my official response is: "I am not looking for a duplicate number, I am looking for a duplicate pattern in an array of bit sets. And XOR is definitely suited better than ADD to manipulate bit sets". :-)
Update: Just for fun before I go to bed, here's "one-line" alternative solution that requires zero additional storage (not even a loop counter), touches each array element only once, is non-destructive and does not scale at all :-)
printf("Answer : %d\n",
array[0] ^
array[1] ^
array[2] ^
// continue typing...
array[999] ^
array[1000] ^
1 ^
2 ^
// continue typing...
999^
1000
);
Note that the compiler will actually calculate the second half of that expression at compile time, so the "algorithm" will execute in exactly 1002 operations.
And if the array element values are known at compile time as well, the compiler will optimize the whole statement to a constant. :-)
Original solution: Which does not meet the strict requirements of the questions, even though it works to find the correct answer. It uses one additional integer to keep the loop counter, and it accesses each array element three times - twice to read it and write it at the current iteration and once to read it for the next iteration.
Well, you need at least one additional variable (or a CPU register) to store the index of the current element as you go through the array.
Aside from that one though, here's a destructive algorithm that can safely scale for any N up to MAX_INT.
for (int i = 1; i < 1001; i++)
{
array[i] = array[i] ^ array[i-1] ^ i;
}
printf("Answer : %d\n", array[1000]);
I will leave the exercise of figuring out why this works to you, with a simple hint :-):
a ^ a = 0
0 ^ a = a
A non destructive version of solution by Franci Penov.
This can be done by making use of the XOR operator.
Lets say we have an array of size 5: 4, 3, 1, 2, 2
Which are at the index: 0, 1, 2, 3, 4
Now do an XOR of all the elements and all the indices. We get 2, which is the duplicate element. This happens because, 0 plays no role in the XORing. The remaining n-1 indices pair with same n-1 elements in the array and the only unpaired element in the array will be the duplicate.
int i;
int dupe = 0;
for(i = 0; i < N; i++) {
dupe = dupe ^ arr[i] ^ i;
}
// dupe has the duplicate.
The best feature of this solution is that it does not suffer from the overflow problems seen in the addition-based solution.
Since this is an interview question, it would be best to start with the addition-based solution, identify the overflow limitation and then give the XOR-based solution :)
This makes use of an additional variable so does not meet the requirements in the question completely.
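For comparison, here is a minimal Python sketch of both approaches, assuming the array holds each of 1..n exactly once plus one repeated value (n + 1 items in total). The function names are hypothetical. Python integers never overflow, so the overflow concern applies to fixed-width languages such as C or Java rather than to this sketch.

```python
def find_duplicate_xor(arr):
    """XOR every index with every element; the paired values cancel out."""
    dupe = 0
    for i, x in enumerate(arr):
        dupe ^= i ^ x               # indices 0..n cancel the values 1..n
    return dupe


def find_duplicate_sum(arr):
    """Subtract the expected total 1 + 2 + ... + n from the actual sum."""
    n = len(arr) - 1
    return sum(arr) - n * (n + 1) // 2


sample = [4, 3, 1, 2, 2]
print(find_duplicate_xor(sample), find_duplicate_sum(sample))
```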
Add all the numbers together. The final sum will be the 1+2+...+1000+duplicate number.
To paraphrase Franci Penov's solution.
The (usual) problem is: given an array of integers of arbitrary length in which every value is repeated an even number of times except for one value, which is repeated an odd number of times, find that value.
The solution is:
acc = 0
for i in array: acc = acc ^ i
Your current problem is an adaptation. The trick is that you are to find the element that is repeated twice, so you need to adapt the solution to compensate for this quirk.
acc = 0
for i in range(len(array)): acc = acc ^ i ^ array[i]
Which is what Franci's solution does in the end, although it destroys the whole array (by the way, it could only destroy the first or last element...)
But since you need extra-storage for the index, I think you'll be forgiven if you also use an extra integer... The restriction is most probably because they want to prevent you from using an array.
It would have been phrased more accurately if they had required O(1) space (1000 can be seen as N since it's arbitrary here).
Add all numbers. The sum of integers 1..1000 is (1000*1001)/2. The difference from what you get is your number.
One-line solution in Python 2 (the tuple-unpacking lambda is not valid in Python 3):
arr = [1,3,2,4,2]
print reduce(lambda acc, (i, x): acc ^ i ^ x, enumerate(arr), 0)
# -> 2
Explanation on why it works is in @Matthieu M.'s answer.
If you know that we have the exact numbers 1-1000, you can add up the results and subtract 500500 (sum(1, 1000)) from the total. This will give the repeated number because sum(array) = sum(1, 1000) + repeated number.
Well, there is a very simple way to do this... each of the numbers between 1 and 1000 occurs exactly once except for the number that is repeated.... so, the sum from 1....1000 is 500500. So, the algorithm is:
sum = 0
for each element of the array:
sum += that element of the array
number_that_occurred_twice = sum - 500500
public static void main(String[] args) {
int start = 1;
int end = 10;
int arr[] = {1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10};
System.out.println(findDuplicate(arr, start, end));
}
static int findDuplicate(int arr[], int start, int end) {
int sumAll = 0;
for(int i = start; i <= end; i++) {
sumAll += i;
}
System.out.println(sumAll);
int sumArrElem = 0;
for(int e : arr) {
sumArrElem += e;
}
System.out.println(sumArrElem);
return sumArrElem - sumAll;
}
No extra storage requirement (apart from loop variable).
int length = (sizeof array) / (sizeof array[0]);
for(int i = 1; i < length; i++) {
array[0] += array[i];
}
printf(
"Answer : %d\n",
( array[0] - (length * (length + 1)) / 2 )
);
Do arguments and callstacks count as auxiliary storage?
int sumRemaining(int* remaining, int count) {
if (!count) {
return 0;
}
return remaining[0] + sumRemaining(remaining + 1, count - 1);
}
printf("duplicate is %d", sumRemaining(array, 1001) - 500500);
Edit: tail call version
int sumRemaining(int* remaining, int count, int sumSoFar) {
if (!count) {
return sumSoFar;
}
return sumRemaining(remaining + 1, count - 1, sumSoFar + remaining[0]);
}
printf("duplicate is %d", sumRemaining(array, 1001, 0) - 500500);
public int duplicateNumber(int[] A) {
int count = 0;
for(int k = 0; k < A.Length; k++)
count += A[k];
return count - (A.Length * (A.Length - 1) >> 1);
}
A triangle number T(n) is the sum of the n natural numbers from 1 to n. It can be represented as n(n+1)/2. Thus, knowing that among given 1001 natural numbers, one and only one number is duplicated, you can easily sum all given numbers and subtract T(1000). The result will contain this duplicate.
For a triangular number T(n), if n is any power of 10, there is also a beautiful method for finding this T(n), based on its base-10 representation:
n = 1000
s = sum(GivenList)
r = str(n // 2)
duplicate = s - int( r + r )
I support adding all the elements and then subtracting the sum of all the indices, but this won't work if the number of elements is very large, i.e. it will cause an integer overflow! So I have devised this algorithm, which may greatly reduce the chance of an integer overflow.
dup = 0
for i=0 to n-1
begin:
    diff = a[i]-i;
    dup = dup + diff;
end
// where dup is the duplicate element..
But by this method I won't be able to find out the index at which the duplicate element is present!
For that I need to traverse the array another time which is not desirable.
Improvement of Franci's answer based on the property of XORing consecutive values:
int result = xor_sum(N);
for (i = 0; i < N+1; i++)
{
result = result ^ array[i];
}
Where:
// Compute (((1 xor 2) xor 3) .. xor value)
int xor_sum(int value)
{
    int modulo = value % 4;
    if (modulo == 0)
        return value;
    else if (modulo == 1)
        return 1;
    else if (modulo == 2)
        return value + 1;
    else
        return 0;
}
Or in pseudocode/math lang f(n) defined as (optimized):
if n mod 4 = 0 then X = n
if n mod 4 = 1 then X = 1
if n mod 4 = 2 then X = n+1
if n mod 4 = 3 then X = 0
And in canonical form f(n) is:
f(0) = 0
f(n) = f(n-1) xor n
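The closed form can be checked against the recursive definition with a short Python sketch. The names `xor_upto` and `find_duplicate` are hypothetical, and the input is assumed to hold 1..n once each plus one repeated value.

```python
def xor_upto(n):
    """XOR of all integers 1..n, using the period-4 pattern."""
    return (n, 1, n + 1, 0)[n % 4]


# Check the closed form against f(n) = f(n-1) xor n for small n.
running = 0
for n in range(1, 1001):
    running ^= n
    assert xor_upto(n) == running


def find_duplicate(arr):
    """arr holds 1..n once each plus one repeat (n + 1 items in total)."""
    result = xor_upto(len(arr) - 1)
    for x in arr:
        result ^= x
    return result


print(find_duplicate([1, 3, 2, 4, 2]))
```

With the closed form, only the array elements need to be XORed in the loop, not the indices.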
My answer to question 2:
Find the sum and product of the numbers from 1 to N, say SUM and PROD.
Find the sum and product of the numbers actually present (1 to N without x and y, the two missing numbers), say mySum and myProd.
Thus:
SUM = mySum + x + y;
PROD = myProd* x*y;
Thus:
x*y = PROD/myProd; x+y = SUM - mySum;
We can find x and y by solving these two equations.
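One way to solve the pair of equations: x and y are the roots of t^2 - (x+y)t + x*y = 0, so (x - y)^2 = (x+y)^2 - 4*x*y. A minimal Python sketch follows, with a hypothetical function name and relying on Python's arbitrary-precision integers (the product of 1..N overflows fixed-width types very quickly). `math.prod` and `math.isqrt` require Python 3.8 or later.

```python
from math import isqrt, prod


def find_two_missing(present, n):
    """present holds the numbers 1..n except two; return the missing pair."""
    s = n * (n + 1) // 2 - sum(present)          # x + y
    p = prod(range(1, n + 1)) // prod(present)   # x * y
    root = isqrt(s * s - 4 * p)                  # |x - y|
    return (s - root) // 2, (s + root) // 2


print(find_two_missing([1, 2, 4, 5, 6, 8, 9, 10], 10))
```

For the sample call, x + y = 10 and x * y = 21, so the discriminant is 16 and the missing pair is 3 and 7.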
In the aux version, you first set all the values to -1 and as you iterate check if you have already inserted the value to the aux array. If not (value must be -1 then), insert. If you have a duplicate, here is your solution!
In the one without aux, you retrieve an element from the list and check if the rest of the list contains that value. If it contains, here you've found it.
private static int findDuplicated(int[] array) {
if (array == null || array.length < 2) {
System.out.println("invalid");
return -1;
}
int[] checker = new int[array.length];
Arrays.fill(checker, -1);
for (int i = 0; i < array.length; i++) {
int value = array[i];
int checked = checker[value];
if (checked == -1) {
checker[value] = value;
} else {
return value;
}
}
return -1;
}
private static int findDuplicatedWithoutAux(int[] array) {
if (array == null || array.length < 2) {
System.out.println("invalid");
return -1;
}
for (int i = 0; i < array.length; i++) {
int value = array[i];
for (int j = i + 1; j < array.length; j++) {
int toCompare = array[j];
if (value == toCompare) {
return array[i];
}
}
}
return -1;
}
