The problem asks for the number of subarrays of length at least 3 in which every triple of elements at positions i < j < k satisfies: the sum of any two of the three numbers is greater than or equal to the third.
What I did:
I ran a loop over i from 0 while i < n-2.
The basic logic is: if the sum of the two smallest elements of the subarray is greater than or equal to its maximum, then every pair sums to at least any other element. Each time a subarray is valid, I add the next element to it and update those three variables. I am passing 15/20 test cases; the others give TLE.
Constraints:
1<=n<=10^5
1<=ai<=10^9
for(int i=0;i<n-2;i++)
{
int r=i+2;
vector<int> temp(inp.begin()+i,inp.begin()+r+1);
sort(temp.begin(),temp.end());
max_elem=temp[1];min_elem=temp[0];
int maximum=temp[temp.size()-1];
//cout<<max_elem<<" "<<min_elem<<"\n";
while(r<n && max_elem+min_elem >= maximum)
{
//cout<<max_elem<<" "<<min_elem<<" "<<inp[r]<<"\n";
cnt++;
r++;
if(r==n) break; // stop before reading inp[n]
if(inp[r]<min_elem) {max_elem=min_elem;min_elem=inp[r];}
else if(inp[r]<max_elem) max_elem=inp[r];
else if(inp[r]>maximum) maximum=inp[r];
}
}
cout<<cnt<<"\n";
Sample TC:
I1:
5
7 6 5 3 4
O1:
6
Explanation:
6 subarrays fulfill the conditions: (7,6,5),(7,6,5,3),(7,6,5,3,4),(6,5,3),(6,5,3,4),(5,3,4).
I2:
5
1 2 3 5 6
O2:
3
Explanation:
(1,2,3),(2,3,5),(3,5,6) --(NOTE: 1,2,3,5 isn't the ans coz 1+2 < 5 )
A naive approach is the following. Your logic is correct, and it is what I implemented. I replaced the sort (N log N) with a single pass (N) that finds only the two smallest numbers and the largest. I haven't compiled the code, so I am not sure it works as intended. Its overall complexity is O(N^3).
Execution time can be improved with some extra checks:
The min1 + min2 >= max condition can be checked after each inner (k) loop, breaking out as soon as it is violated.
If the condition is not satisfied for, say, subarray 4-7, there is no need to check any other subarray that contains 4-7. By storing the violating cases and checking against them before each loop, overall execution time can be improved.
int min1;
int min2;
int max;
int count = 0;
for(int i = 2; i < n; i++){
for(int j = 0; j <= i - 2; j++){ // j <= i - 2 so that length-3 subarrays are included
max = -1;
min1 = min2 = 1000000000;
for(int k = j; k <= i; k++){
if(inp[k] > max)
max = inp[k];
if(inp[k] < min1){
min2 = min1; // the old minimum becomes the second smallest
min1 = inp[k];
}
else if(inp[k] < min2){
min2 = inp[k];
}
}
if(min1 + min2 >= max)
count++;
}
}
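The early-exit idea mentioned above can be sketched in Python as follows. This is a minimal sketch, not the answerer's code: the function name `count_valid_subarrays` is made up for illustration, and it relies on the observation that once a subarray starting at a given index fails the check, every longer subarray from that start fails too. It brings the cost down to O(N^2) in the worst case.

```python
def count_valid_subarrays(inp):
    """Count subarrays of length >= 3 whose two smallest values sum to at least the largest."""
    n = len(inp)
    count = 0
    for start in range(n - 2):
        min1 = min2 = float("inf")   # smallest and second-smallest value so far
        largest = float("-inf")      # largest value so far
        for end in range(start, n):
            x = inp[end]
            if x < min1:
                min1, min2 = x, min1
            elif x < min2:
                min2 = x
            largest = max(largest, x)
            if end - start >= 2:
                if min1 + min2 < largest:
                    break            # longer subarrays from `start` fail as well
                count += 1
    return count


print(count_valid_subarrays([7, 6, 5, 3, 4]))
print(count_valid_subarrays([1, 2, 3, 5, 6]))
```

On the two sample inputs from the question this returns 6 and 3. It is still quadratic, so it would likely not pass n = 10^5 either.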
There might be some bugs, but here is the general idea for an O(n log n) solution:
We keep a window of elements from startIdx to endIdx. If it's a valid subarray, we can expand it by adding another element, so we increase endIdx. If it's not valid, it wouldn't be valid no matter how much we expand it, so we need to shrink it by increasing startIdx.
code (C++, needs <set> and <iterator>):
multiset<int> nums;                  // values in the window [startIdx, endIdx)
size_t startIdx = 0, endIdx = 0;
long long sol = 0;                   // the count can exceed the range of int
while (endIdx != inp.size()) {
    nums.insert(inp[endIdx]);        // extend the window by one element
    endIdx++;
    // shrink from the left while the window is invalid
    while (nums.size() >= 3 &&
           *nums.begin() + *next(nums.begin()) < *nums.rbegin()) {
        nums.erase(nums.find(inp[startIdx]));
        startIdx++;
    }
    if (nums.size() >= 3)
        sol += endIdx - startIdx - 2; // amount of valid subarrays ending in inp[endIdx - 1]
}
Let's say we have an array of positive numbers and we are given a value M. Our goal is to find out whether there is a contiguous subsequence in the array whose sum is exactly M. If A[1],A[2],....A[n] is the array, then we have to find whether there exist i and j such that A[i]+...+A[j] = M.
I am trying to get an O(n) solution using a greedy approach.
I believe you can solve this in linear time with a pointer chasing algorithm.
Here's the intuition. Start off a pointer at the left side of the array. Keep moving it to the right, tracking the sum of the elements you've seen so far, until you either hit exactly M (done!), your total exceeds M (stop for now, adding in more elements only makes it worse), or you hit the end of the array without reaching at least M (all the elements combined are too small). If you do end up in a case where the sum exceeds M, you can be guaranteed that no subarray starting at the beginning of the array adds up to exactly M, since you tried all of them and they were either too small or too big.
Now, start a second pointer at the first element and keep advancing it forward, subtracting out the current element, until you either get to exactly M (done!), you reach the first pointer (stop for now), or the total drops below M (stop for now). All the elements you skipped over with this pointer can't be the starting point of the subarray you're looking for. At this point, start marching the first pointer forward again.
Overall, each pointer advances at most n times and you do O(1) work per step, so this runs in time O(n). Plus, it uses only O(1) space, which is as good as it's going to get!
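A minimal Python sketch of this pointer-chasing idea, assuming all entries are positive and the target is positive. The name `find_subarray_with_sum` and the half-open `(i, j)` return convention are illustrative choices, not part of the answer.

```python
def find_subarray_with_sum(arr, target):
    """Return (i, j) such that sum(arr[i:j]) == target, or None if no such subarray exists.

    Assumes every entry of arr is positive and target > 0.
    """
    left = 0      # second pointer: start of the current window
    total = 0     # sum of arr[left:right + 1]
    for right, value in enumerate(arr):   # first pointer marches right
        total += value
        # Too big: drop elements from the left until the sum fits again.
        while total > target and left <= right:
            total -= arr[left]
            left += 1
        if total == target:
            return left, right + 1
    return None


print(find_subarray_with_sum([1, 4, 20, 3, 10, 5], 33))
```

Each index is added once and removed at most once, which is where the O(n) time and O(1) extra space come from.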
This is a standard two pointer problem. First of all, create an array, prefix that will store the prefix sum of the given array, say arr.
So
prefix[i] = arr[1] + .. + arr[i]
Start with two pointers, lower and upper. Initialize them as
lower = 0
upper = 1
(Note: Initialize prefix[0] to 0)
Now, try to understand this code:
lower = 0, upper = 1;
while(upper <= n) { // n is the number of elements
if(prefix[upper] - prefix[lower] == m) {
return true;
} else if(prefix[upper] - prefix[lower] > m) {
lower++;
} else {
upper++;
}
}
return false;
Here we are using the fact that the array consists of positive integers, hence prefix is increasing.
Assume that the subarray with indices X ≤ i < Y might be the solution.
You start with X = 1, Y= 1, sum of elements = 0.
As long as the sum is less than M, and Y <= n, increase the sum by array [Y] and replace Y with Y + 1.
If the sum is equal to M, you found a solution.
If the sum is greater than M, you remove array elements at the start: as long as the sum is greater than M, subtract array [X] from the sum and replace X with X + 1. If the sum becomes equal to M, you have a solution. Otherwise you start again with the first loop.
(edited: see templatetypedef's comment)
Use the two indices approach: increase the higher index if the subsequence sum is too small, otherwise increase the lower index.
Example:
void solve(int *a, int n, int M) {
if (n <= 0) return;
int i, j, s;
i = 0, j = 0, s = a[j];
while (j < n) {
if (s == M) {
printf("%dth through %dth elements\n", i + 1, j + 1);
return;
} else if (s < M) {
j++;
if (j < n) s += a[j]; // do not read past the end of the array
} else {
s -= a[i];
i++;
}
}
}
public class FindSumEquals {
public static void main(String[] args) {
int n = 15;
System.out.println("Count is "+ findPossible(n));
}
private static int findPossible(int n) {
int temp = n;
int arrayLength = n / 2 + 2;
System.out.println("arrayLength : " + arrayLength) ;
int a [] = new int[arrayLength];
int count = 0;
for(int i = 1; i < arrayLength; i++){
a[i] = i + a[i - 1];
}
int lower = 0, upper = 1;
while(upper <= arrayLength - 1) {
if(a[upper] - a[lower] == temp) {
System.out.println("hello - > " + ++lower + " to "+ upper);
upper++;
count++;
} else if(a[upper] - a[lower] > temp) {
lower++;
} else {
upper++;
}
}
return count;
}
}
Given a set of n integers, divide the set in two subsets of n/2 sizes each such that the difference of the sum of two subsets is as minimum as possible. If n is even, then sizes of two subsets must be strictly n/2 and if n is odd, then size of one subset must be (n-1)/2 and size of other subset must be (n+1)/2.
For example, let given set be {3, 4, 5, -3, 100, 1, 89, 54, 23, 20}, the size of set is 10. Output for this set should be {4, 100, 1, 23, 20} and {3, 5, -3, 89, 54}. Both output subsets are of size 5 and sum of elements in both subsets is same (148 and 148).
Another example where n is odd. Let given set be {23, 45, -34, 12, 0, 98, -99, 4, 189, -1, 4}. The output subsets should be {45, -34, 12, 98, -1} and {23, 0, -99, 4, 189, 4}. The sums of elements in two subsets are 120 and 121 respectively.
After much searching, I found out this problem is NP-Hard. Therefore, no polynomial time solution is known (and none exists unless P = NP).
However, I was thinking something in lines of this:
Initialise first subset as the first element.
Initialise second subset as second element.
Then depending upon which subset is smaller in size and the sum is lacking in which subset, I will insert the next elements.
The above might achieve a linear time, I guess.
However, the solution given here is way too complicated: http://www.geeksforgeeks.org/tug-of-war/. I couldn't understand it. Therefore, I just want to ask, is my solution correct? Considering the fact that this is a NP-Hard problem, I think it should do? And if not, can someone please explain in like really brief, how exactly the code on the link attached works? Thanks!
Your solution is wrong.
It's a greedy approach to solve the Subset-Sum problem/ Partition Problem, that fails.
Here is a simple counter example:
arr = [1,2,3]
Your solution will assign A={1}, B={2}, and then choose to assign 3 to A, giving A={1,3}, B={2}, which is not optimal, since the optimal solution is A={1,2}, B={3}.
The correct way to do it is using Dynamic Programming, by following the recursive formulas (for non-negative elements):
D(0,i) = true
D(x,i) = false if x < 0, or if x > 0 and i < 0
D(x,i) = D(x,i-1) OR D(x-arr[i],i-1)
It can be done efficiently by building a table that follows the recurrence bottom-up.
The table will be of size (SUM/2 + 1)*(n+1) (where SUM is the sum of all elements); then find the maximal x <= SUM/2 such that D(x,n) = true.
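Note that the recurrence as written does not enforce the equal-size requirement from the question. A minimal sketch of a variant that does, in Python: instead of a boolean table it keeps, for each subset size c, the set of sums reachable with exactly c elements, which also copes with negative values. The function name `min_balanced_partition_diff` is hypothetical, and the number of stored sums can grow quickly, so this is only practical for small inputs or a small range of sums.

```python
def min_balanced_partition_diff(nums):
    """Smallest |sum(A) - sum(B)| over splits with len(A) == len(nums) // 2."""
    n = len(nums)
    total = sum(nums)
    half = n // 2
    # reachable[c] holds every sum obtainable by choosing exactly c elements
    reachable = [set() for _ in range(half + 1)]
    reachable[0].add(0)
    for x in nums:
        # Visit larger counts first so each element is used at most once.
        for c in range(half, 0, -1):
            reachable[c].update(s + x for s in reachable[c - 1])
    # If one side sums to s, the other sums to total - s.
    return min(abs(total - 2 * s) for s in reachable[half])


print(min_balanced_partition_diff([1, 2, 3]))
print(min_balanced_partition_diff([3, 4, 5, -3, 100, 1, 89, 54, 23, 20]))
```

For the counterexample [1, 2, 3] and for the 10-element set from the question, the minimum difference is 0.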
The problem is a specialization of the Subset Sum problem, which decides whether a set can be split into two partitions with equal sums. This is an NP-complete problem.
But the problem in question asks for two partitions that satisfy the following two conditions:
The sizes of the partitions differ by at most 1
The difference between the sums of the elements in the partitions is minimum
Here we settle for a suboptimal solution to the more general NP-complete problem.
For example, for A=[1, 2, 3, 4, 5, 6, 7, 8, 9], we can have two partitions {[1, 3, 2, 7, 9], [5, 4, 6, 8]} with sum diff = abs(22-23) = 1.
Our goal is to find a suboptimal solution with the best approximation ratio. The idea is to split the array into pairs of elements that distribute the sum as uniformly as possible across the partitions. So each time we take 2 pairs and put one pair in one partition and the other pair in the other partition.
Sort the array.
If the number of elements is less than 4, create the partitions directly, handling the cases of 1, 2, or 3 elements separately.
Otherwise take 2 pairs each time and put them into the two partitions such that the sum diff is minimized.
Pick the pair (largest, smallest) of elements in the sorted array and put it into the smaller (with respect to sum) partition.
Then pick the second largest element and find its buddy, and put them in the 'other' partition, choosing the buddy so that the sum of the second largest and its buddy minimizes the sum difference of the partitions.
The above approach gives a suboptimal solution. The problem is NP-complete, so no efficient exact algorithm is known, but we can improve the approximation as follows.
If we have a suboptimal solution (i.e. sum diff != 0), we try to improve it by swapping a large element in the larger partition with a small element in the smaller partition such that the swap reduces the sum diff.
The O(n^2) time and O(n) space implementation of the above approach is as follows –
//overall O(n^2) time and O(n) space solution using a greedy approach
public static ArrayList<Integer>[] findEqualPartitionMinSumDif(int A[]){
//first sort the array - O(nlgn)
Arrays.sort(A);
ArrayList<Integer> partition1 = new ArrayList<Integer>();
ArrayList<Integer> partition2 = new ArrayList<Integer>();
//create index table to manage largest unused and smallest unused items
//O(n) space and O(nlgn) time to build and query the set
TreeSet<Integer> unused = new TreeSet<>();
for(int i = 0; i<A.length; i++){
unused.add(i);
}
int i = 0;
int j = A.length-1;
int part1Sum = 0;
int part2Sum = 0;
int diffSum = 0;
//O(n^2) processing time
while(unused.size() > 0){
i = unused.first();
j = unused.last();
diffSum = part1Sum-part2Sum;
//in case of size of the array is not multiple of 4 then we need to process last 3(or 2 or 1)
//element to assign partition. This is special case handling
if(unused.size() < 4){
switch(unused.size()){
case 1:
//put the 1 remaining item into smaller partition
if(diffSum > 0){
partition2.add(A[i]);
part2Sum += A[i];
}
else{
partition1.add(A[i]);
part1Sum += A[i];
}
break;
case 2:
//among the remaining 2 put the max in smaller and min in larger bucket
int max = Math.max(A[i], A[j]);
int min = Math.min(A[i], A[j]);
if(diffSum > 0){
partition2.add(max);
partition1.add(min);
part2Sum += max;
part1Sum += min;
}
else{
partition1.add(max);
partition2.add(min);
part1Sum += max;
part2Sum += min;
}
break;
case 3:
//among the remaining 3 put the two having total value greater then the third one into smaller partition
//and the 3rd one to larger bucket
unused.remove(i);
unused.remove(j);
int middle = unused.first();
if(diffSum > 0){
if(A[i]+A[middle] > A[j]){
partition2.add(A[i]);
partition2.add(A[middle]);
partition1.add(A[j]);
part2Sum += A[i]+A[middle];
part1Sum += A[j];
}
else{
partition2.add(A[j]);
partition1.add(A[i]);
partition1.add(A[middle]);
part1Sum += A[i]+A[middle];
part2Sum += A[j];
}
}
else{
if(A[i]+A[middle] > A[j]){
partition1.add(A[i]);
partition1.add(A[middle]);
partition2.add(A[j]);
part1Sum += A[i]+A[middle];
part2Sum += A[j];
}
else{
partition1.add(A[j]);
partition2.add(A[i]);
partition2.add(A[middle]);
part2Sum += A[i]+A[middle];
part1Sum += A[j];
}
}
break;
default:
}
diffSum = part1Sum-part2Sum;
break;
}
//first take the largest and the smallest element to create a pair to be inserted into a partition
//we do this for having a balanced distribute of the numbers in the partitions
//add pair (i, j) to the smaller partition
int pairSum = A[i]+A[j];
int partition = diffSum > 0 ? 2 : 1;
if(partition == 1){
partition1.add(A[i]);
partition1.add(A[j]);
part1Sum += pairSum;
}
else{
partition2.add(A[i]);
partition2.add(A[j]);
part2Sum += pairSum;
}
//update diff
diffSum = part1Sum-part2Sum;
//we have used pair (i, j)
unused.remove(i);
unused.remove(j);
//move j to next big element to the left
j = unused.last();
//now find the buddy for j to be paired with such that sum of them is as close as to pairSum
//so we will find such buddy A[k], i<=k<j such that value of ((A[j]+A[k])-pairSum) is minimized.
int buddyIndex = unused.first();
int minPairSumDiff = Integer.MAX_VALUE;
for(int k = buddyIndex; k<j; k++){
if(!unused.contains(k))
continue;
int compPairSum = A[j]+A[k];
int pairSumDiff = Math.abs(pairSum-compPairSum);
if(pairSumDiff < minPairSumDiff){
minPairSumDiff = pairSumDiff;
buddyIndex = k;
}
}
//we now find buddy for j. So we add pair (j,buddyIndex) to the other partition
if(j != buddyIndex){
pairSum = A[j]+A[buddyIndex];
if(partition == 2){
partition1.add(A[j]);
partition1.add(A[buddyIndex]);
part1Sum += pairSum;
}
else{
partition2.add(A[j]);
partition2.add(A[buddyIndex]);
part2Sum += pairSum;
}
//we have used pair (j, buddyIndex)
unused.remove(j);
unused.remove(buddyIndex);
}
}
//if diffsum is greater than zero then we can further try to optimize by swapping
//a larger elements in large partition with an small element in smaller partition
//O(n^2) operation with O(n) space
if(diffSum != 0){
Collections.sort(partition1);
Collections.sort(partition2);
diffSum = part1Sum-part2Sum;
ArrayList<Integer> largerPartition = (diffSum > 0) ? partition1 : partition2;
ArrayList<Integer> smallerPartition = (diffSum > 0) ? partition2 : partition1;
int prevDiff = Math.abs(diffSum);
int largePartitonSwapCandidate = -1;
int smallPartitonSwapCandidate = -1;
//find one of the largest element from large partition and smallest from the smaller partition to swap
//such that it overall sum difference in the partitions are minimized
for(i = 0; i < smallerPartition.size(); i++){
for(j = largerPartition.size()-1; j>=0; j--){
int largerVal = largerPartition.get(j);
int smallerVal = smallerPartition.get(i);
//no point of swapping larger value from smaller partition
if(largerVal <= smallerVal){
continue;
}
//new difference if we had swapped these elements
int diff = Math.abs(prevDiff - 2*Math.abs(largerVal - smallerVal));
if(diff == 0){
largerPartition.set(j, smallerVal);
smallerPartition.set(i, largerVal);
return new ArrayList[]{largerPartition, smallerPartition};
}
//find the pair to swap that minimizes the sum diff
else if (diff < prevDiff){
prevDiff = diff;
largePartitonSwapCandidate = j;
smallPartitonSwapCandidate = i;
}
}
}
//if we indeed found one such a pair then swap it.
if(largePartitonSwapCandidate >=0 && smallPartitonSwapCandidate >=0){
int largerVal = largerPartition.get(largePartitonSwapCandidate);
int smallerVal = smallerPartition.get(smallPartitonSwapCandidate);
largerPartition.set(largePartitonSwapCandidate, smallerVal);
smallerPartition.set(smallPartitonSwapCandidate, largerVal);
return new ArrayList[]{largerPartition, smallerPartition};
}
}
return new ArrayList[]{partition1, partition2};
}
Given an array of size n, for each k from 1 to n, find the maximum sum of contiguous subarray of size k.
This problem has an obvious solution with time complexity O(N^2) and O(1) space. Lua code:
array = {7, 1, 3, 1, 4, 5, 1, 3, 6}
n = #array
function maxArray(k)
ksum = 0
for i = 1, k do
ksum = ksum + array[i]
end
max_ksum = ksum
for i = k + 1, n do
add_index = i
sub_index = i - k
ksum = ksum + array[add_index] - array[sub_index]
max_ksum = math.max(ksum, max_ksum)
end
return max_ksum
end
for k = 1, n do
print(k, maxArray(k))
end
Is there any algorithm with lower time complexity? For example, O(N log N) + additional memory.
Related topics:
Kadane's algorithm
An Efficient Solution is based on the fact that sum of a subarray (or window) of size k can be obtained in O(1) time using the sum of previous subarray (or window) of size k. Except the first subarray of size k, for other subarrays, we compute sum by removing the first element of the last window and adding the last element of the current window.
Here is an implementation of this idea:
int maxSum(int arr[], int n, int k)
{
// k must not be larger than n
if (n < k)
{
cout << "Invalid";
return -1;
}
// Compute sum of first window of size k
int res = 0;
for (int i=0; i<k; i++)
res += arr[i];
// Compute sums of remaining windows by
// removing first element of previous
// window and adding last element of
// current window.
int curr_sum = res;
for (int i=k; i<n; i++)
{
curr_sum += arr[i] - arr[i-k];
res = max(res, curr_sum);
}
return res;
}
Time Complexity : O(n) for a single k (O(n^2) if repeated for every k from 1 to n)
Auxiliary Space : O(1)
Source
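To answer the original question (every k from 1 to n), the same window sums can be read off a prefix-sum array. The following Python sketch is an illustration under that assumption, with a made-up function name; it is still O(n^2) overall, matching the Lua code in the question.

```python
from itertools import accumulate


def max_sums_for_every_k(arr):
    """Return a list whose entry k-1 is the maximum sum of a contiguous window of size k."""
    n = len(arr)
    prefix = [0] + list(accumulate(arr))   # prefix[i] = arr[0] + ... + arr[i-1]
    return [
        max(prefix[i + k] - prefix[i] for i in range(n - k + 1))
        for k in range(1, n + 1)
    ]


for k, best in enumerate(max_sums_for_every_k([7, 1, 3, 1, 4, 5, 1, 3, 6]), start=1):
    print(k, best)
```

The prefix array costs O(n) extra memory but makes each window sum a single subtraction.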
The problem can be reduced to min-sum convolution, see section 2.4 (MCSP) in https://core.ac.uk/download/pdf/84869149.pdf. Therefore, currently the best complexity you can expect is probably O(n^2/polylog(n)).
I don't think there is a more efficient solution than O(N²) if you don't add any other constraint. In other words, there is no other way to decide you have found the maximum-sum subarray but to explore all the other subarrays.
Thus the least-complex solution comprises O(N²/2) which is the overall number of contiguous subarrays of an array of given length N.
Personally I would implement this with the dynamic programming approach. The idea is having a wedge of partial results, and use them to build the current sums of the subarrays (in place of computing the whole sum through). Anyhow this gives "only" a constant speedup, thus the complexity is O(N²/2)~O(N²).
The following is pseudocode - sorry for not speaking Lua
// here we place temporary results, row by row alternating in 0 or 1
int[2][N] sum_array_buffer
// stores the start of the max subarray for each length k (index 0 unused)
int[N+1] max_subarray_start
// stores the value for each length k (index 0 unused)
int[N+1] max_subarray_value
array = {7, 1, 3, 1, 4, 5, 1, 3, 6}
// we initialize the buffer row for length 0 with zeros (empty subarrays)
sum_array_buffer[0] = {0, ..., 0}
// the length of subarrays
for k = 1 ; k <= (N); ++k:
    // the starting position of the sub-array
    for j = 0; j < (N-k+1); ++j:
        sum_array_buffer[k%2][j] = sum_array_buffer[(k+1)%2][j] + array[j+k-1]
        if j == 0 || sum_array_buffer[k%2][j] > max_subarray_value[k]:
            max_subarray_value[k] = sum_array_buffer[k%2][j]
            max_subarray_start[k] = j
for k = 1 ; k <= (N); ++k:
    print(k, max_subarray_value[k])
We create a deque, Qi, of capacity k that stores only the useful elements of the current window of k elements. An element is useful if it is in the current window and is greater than all other elements to its right in the current window. We process all array elements one by one and maintain Qi so that it contains the useful elements of the current window in sorted order. The element at the front of Qi is the largest element of the current window, and the element at the rear of Qi is the smallest of the useful elements.
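Note that this deque technique yields the maximum element of each window of size k, not the maximum window sum. A minimal Python sketch of it, storing indices rather than values so that expired elements can be detected (the function name is illustrative):

```python
from collections import deque


def sliding_window_max(arr, k):
    """Return the maximum element of every contiguous window of size k."""
    window = deque()   # indices of useful elements; their values decrease front to back
    result = []
    for i, x in enumerate(arr):
        # Elements not larger than x can never be a window maximum again.
        while window and arr[window[-1]] <= x:
            window.pop()
        window.append(i)
        # Drop the front if it has slid out of the window ending at i.
        if window[0] <= i - k:
            window.popleft()
        if i >= k - 1:
            result.append(arr[window[0]])
    return result


print(sliding_window_max([1, 3, -1, -3, 5, 3, 6, 7], 3))
```

Every index is pushed and popped at most once, so the whole pass is O(n) regardless of k.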
int maxCrossingSum(int arr[], int l, int m, int h)
{
// Include elements on left of mid.
int sum = 0;
int left_sum = INT_MIN;
for (int i = m; i >= l; i--)
{
sum = sum + arr[i];
if (sum > left_sum)
left_sum = sum;
}
// Include elements on right of mid
sum = 0;
int right_sum = INT_MIN;
for (int i = m+1; i <= h; i++)
{
sum = sum + arr[i];
if (sum > right_sum)
right_sum = sum;
}
// Return sum of elements on left and right of mid
return left_sum + right_sum;
}
// Returns sum of maximum sum subarray in arr[l..h]
int maxSubArraySum(int arr[], int l, int h)
{
// Base Case: Only one element
if (l == h)
return arr[l];
// Find middle point
int m = (l + h)/2;
/* Return maximum of following three possible cases
a) Maximum subarray sum in left half
b) Maximum subarray sum in right half
c) Maximum subarray sum such that the subarray crosses the midpoint */
// std::max takes two values or an initializer list (needs <algorithm>)
return max({maxSubArraySum(arr, l, m),
maxSubArraySum(arr, m+1, h),
maxCrossingSum(arr, l, m, h)});
}
Explanation
Using Divide and Conquer approach, we can find the maximum subarray sum in O(nLogn) time. Following is the Divide and Conquer algorithm.
1) Divide the given array in two halves
2) Return the maximum of following three
….a) Maximum subarray sum in left half (Make a recursive call)
….b) Maximum subarray sum in right half (Make a recursive call)
….c) Maximum subarray sum such that the subarray crosses the midpoint
source
For a fixed k, the above question can be solved in O(n).
Please try this algorithm.
Let's say k=3.
array = {7, 1, 3, 1, 4, 5, 1, 3, 6}
maxsum=0.
1) We start by adding 7+1+3 and store sum=11. Since sum > maxsum, maxsum=11.
2) Since k=3, the next contiguous window is 1+3+1. How do we get this sum? Remove 7 from sum and add 1 to it, so sum is now 5. Check whether sum > maxsum.
3) Do the same for the remaining elements. The loop runs until index n-1.
Please find the code here
class Program
{
static void Main(string[] args)
{
int sum=0;
int max=0;
int size=3; // window size k, as in the walkthrough
string input="7, 1, 3, 1, 4, 5, 1, 3, 6";
string[] values=input.Split(',');
int length=values.Length;
int k=size-1;
for(int i=0;i<=k;i++)
{
sum=sum+int.Parse(values[i]);
}
max=sum; // sum of the first window
for(int j=0;k<length-1;j++)
{
++k;
sum=(sum-int.Parse(values[j]))+int.Parse(values[k]);
if(sum>max)
max=sum;
}
Console.WriteLine(max);
}
}
The process below might help you:
1) Pick first k elements and create a Self-Balancing Binary Search Tree (BST) of size k.
2) Run a loop for i = 0 to n – k
…..a) Get the maximum element from the BST, and print it.
…..b) Search for arr[i] in the BST and delete it from the BST.
…..c) Insert arr[i+k] into the BST.
Time Complexity:
Time Complexity of step 1 is O(kLogk). Time Complexity of steps 2(a), 2(b) and 2(c) is O(Logk). Since steps 2(a), 2(b) and 2(c) are in a loop that runs n-k+1 times, time complexity of the complete algorithm is O(kLogk + (n-k+1)*Logk) which can also be written as O(nLogk).
I recently came across a question somewhere:
Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?
What I am interested in is the second part, i.e., without using auxiliary storage. Do you have any idea?
Just add them all up, and subtract from that the total you would expect if each of the numbers 1 to 1000 appeared exactly once.
Eg:
Input: 1,2,3,2,4 => 12
Expected: 1,2,3,4 => 10
Input - Expected => 2
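A minimal Python sketch of the same calculation, using the closed form n(n+1)/2 for the expected total. The function name and the parameter `n` (the largest value, 1000 in the question) are illustrative.

```python
def find_duplicate_by_sum(values, n):
    """values holds 1..n exactly once each, plus one repeated number."""
    expected = n * (n + 1) // 2      # 1 + 2 + ... + n
    return sum(values) - expected


print(find_duplicate_by_sum([1, 2, 3, 2, 4], 4))
```

Python integers do not overflow; in a language with fixed-width integers the running sum needs a wide enough type.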
Update 2: Some people think that using XOR to find the duplicate number is a hack or trick. To which my official response is: "I am not looking for a duplicate number, I am looking for a duplicate pattern in an array of bit sets. And XOR is definitely suited better than ADD to manipulate bit sets". :-)
Update: Just for fun before I go to bed, here's "one-line" alternative solution that requires zero additional storage (not even a loop counter), touches each array element only once, is non-destructive and does not scale at all :-)
printf("Answer : %d\n",
array[0] ^
array[1] ^
array[2] ^
// continue typing...
array[999] ^
array[1000] ^
1 ^
2 ^
// continue typing...
999^
1000
);
Note that the compiler will actually calculate the second half of that expression at compile time, so the "algorithm" will execute in exactly 1002 operations.
And if the array element values are known at compile time as well, the compiler will optimize the whole statement to a constant. :-)
Original solution: Which does not meet the strict requirements of the questions, even though it works to find the correct answer. It uses one additional integer to keep the loop counter, and it accesses each array element three times - twice to read it and write it at the current iteration and once to read it for the next iteration.
Well, you need at least one additional variable (or a CPU register) to store the index of the current element as you go through the array.
Aside from that one though, here's a destructive algorithm that can safely scale for any N up to MAX_INT.
for (int i = 1; i < 1001; i++)
{
array[i] = array[i] ^ array[i-1] ^ i;
}
printf("Answer : %d\n", array[1000]);
I will leave the exercise of figuring out why this works to you, with a simple hint :-):
a ^ a = 0
0 ^ a = a
A non destructive version of solution by Franci Penov.
This can be done by making use of the XOR operator.
Let's say we have an array of size 5: 4, 3, 1, 2, 2
Which are at the index: 0, 1, 2, 3, 4
Now do an XOR of all the elements and all the indices. We get 2, which is the duplicate element. This happens because, 0 plays no role in the XORing. The remaining n-1 indices pair with same n-1 elements in the array and the only unpaired element in the array will be the duplicate.
int i;
int dupe = 0;
for(i = 0; i < N; i++) {
dupe = dupe ^ arr[i] ^ i;
}
// dupe has the duplicate.
The best feature of this solution is that it does not suffer from the overflow problems that are seen in the addition based solution.
Since this is an interview question, it would be best to start with the addition based solution, identify the overflow limitation and then give the XOR based solution :)
This makes use of an additional variable so does not meet the requirements in the question completely.
Add all the numbers together. The final sum will be the 1+2+...+1000+duplicate number.
To paraphrase Franci Penov's solution.
The (usual) problem is: given an array of integers of arbitrary length that contains only elements repeated an even number of times, except for one value which is repeated an odd number of times, find this value.
The solution is:
acc = 0
for i in array: acc = acc ^ i
Your current problem is an adaptation. The trick is that you are to find the element that is repeated twice, so you need to adapt the solution to compensate for this quirk.
acc = 0
for i in range(len(array)): acc = acc ^ i ^ array[i]
Which is what Franci's solution does in the end, although it destroys the whole array (by the way, it could only destroy the first or last element...)
But since you need extra-storage for the index, I think you'll be forgiven if you also use an extra integer... The restriction is most probably because they want to prevent you from using an array.
It would have been phrased more accurately if they had required O(1) space (1000 can be seen as N since it's arbitrary here).
Add all numbers. The sum of integers 1..1000 is (1000*1001)/2. The difference from what you get is your number.
One line solution in Python
from functools import reduce  # needed on Python 3
arr = [1,3,2,4,2]
print(reduce(lambda acc, ix: acc ^ ix[0] ^ ix[1], enumerate(arr), 0))
# -> 2
Explanation on why it works is in @Matthieu M.'s answer.
If you know that we have the exact numbers 1-1000, you can add up the results and subtract 500500 (sum(1, 1000)) from the total. This will give the repeated number because sum(array) = sum(1, 1000) + repeated number.
Well, there is a very simple way to do this... each of the numbers between 1 and 1000 occurs exactly once except for the number that is repeated.... so, the sum from 1....1000 is 500500. So, the algorithm is:
sum = 0
for each element of the array:
sum += that element of the array
number_that_occurred_twice = sum - 500500
n = 1000
s = sum(GivenList)
r = str(n // 2)
duplicate = s - int(r + r)
public static void main(String[] args) {
int start = 1;
int end = 10;
int arr[] = {1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10};
System.out.println(findDuplicate(arr, start, end));
}
static int findDuplicate(int arr[], int start, int end) {
int sumAll = 0;
for(int i = start; i <= end; i++) {
sumAll += i;
}
System.out.println(sumAll);
int sumArrElem = 0;
for(int e : arr) {
sumArrElem += e;
}
System.out.println(sumArrElem);
return sumArrElem - sumAll;
}
No extra storage requirement (apart from loop variable).
int length = (sizeof array) / (sizeof array[0]);
for(int i = 1; i < length; i++) {
array[0] += array[i];
}
printf(
"Answer : %d\n",
( array[0] - ((length - 1) * length) / 2 )
);
Do arguments and callstacks count as auxiliary storage?
int sumRemaining(int* remaining, int count) {
if (!count) {
return 0;
}
return remaining[0] + sumRemaining(remaining + 1, count - 1);
}
printf("duplicate is %d", sumRemaining(array, 1001) - 500500);
Edit: tail call version
int sumRemaining(int* remaining, int count, int sumSoFar) {
if (!count) {
return sumSoFar;
}
return sumRemaining(remaining + 1, count - 1, sumSoFar + remaining[0]);
}
printf("duplicate is %d", sumRemaining(array, 1001, 0) - 500500);
public int duplicateNumber(int[] A) {
int count = 0;
for(int k = 0; k < A.Length; k++)
count += A[k];
return count - (A.Length * (A.Length - 1) >> 1);
}
A triangle number T(n) is the sum of the n natural numbers from 1 to n. It can be represented as n(n+1)/2. Thus, knowing that among given 1001 natural numbers, one and only one number is duplicated, you can easily sum all given numbers and subtract T(1000). The result will contain this duplicate.
For a triangular number T(n), if n is any power of 10, there is also a beautiful method of finding T(n), based on its base-10 representation:
n = 1000
s = sum(GivenList)
r = str(n // 2)
duplicate = s - int(r + r)
I support adding all the elements and then subtracting the sum of all the indices, but this won't work if the number of elements is very large, i.e. it will cause an integer overflow! So I have devised this algorithm, which may reduce the chance of an integer overflow to a large extent.
dup = 0
for i=0 to n-1
begin:
diff = a[i]-i;
dup = dup + diff;
end
// where dup is the duplicate element..
But with this method I won't be able to find the index at which the duplicate element is present!
For that I need to traverse the array a second time, which is not desirable.
Improvement of Franci's answer based on the property of XORing consecutive values:
int result = xor_sum(N);
for (i = 0; i < N+1; i++)
{
result = result ^ array[i];
}
Where:
// Compute (((1 xor 2) xor 3) .. xor value)
int xor_sum(int value)
{
int modulo = value % 4;
if (modulo == 0)
return value;
else if (modulo == 1)
return 1;
else if (modulo == 2)
return value + 1;
else
return 0;
}
Or in pseudocode/math lang f(n) defined as (optimized):
if n mod 4 = 0 then X = n
if n mod 4 = 1 then X = 1
if n mod 4 = 2 then X = n+1
if n mod 4 = 3 then X = 0
And in canonical form f(n) is:
f(0) = 0
f(n) = f(n-1) xor n
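The closed form can be checked against the canonical definition with a short Python sketch. The helper names `xor_upto` and `find_duplicate_by_xor` are made up for illustration; `n` is the largest value in the array (1000 in the question).

```python
from functools import reduce
from operator import xor


def xor_upto(n):
    """Closed form for 1 xor 2 xor ... xor n, chosen by n mod 4."""
    return (n, 1, n + 1, 0)[n % 4]


def find_duplicate_by_xor(values, n):
    """values holds 1..n exactly once each, plus one repeated number."""
    acc = xor_upto(n)
    for v in values:
        acc ^= v          # every value that appears once cancels out
    return acc


# The closed form agrees with f(n) = f(n-1) xor n for small n.
assert all(xor_upto(n) == reduce(xor, range(1, n + 1), 0) for n in range(50))
print(find_duplicate_by_xor([4, 3, 1, 2, 2], 4))
```

This avoids XORing the indices inside the loop, at the cost of knowing n up front.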
My answer to question 2:
Find the sum and product of the numbers from 1 to N, say SUM and PROD.
Find the sum and product of the given numbers, i.e. 1 to N with x and y missing, say mySum and myProd.
Thus:
SUM = mySum + x + y;
PROD = myProd * x * y;
Thus:
x*y = PROD/myProd; x+y = SUM - mySum;
We can find x and y by solving these two equations.
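One way to solve the two equations is to treat x and y as the roots of t^2 - (x+y)t + xy = 0. The Python sketch below does that. The function name is hypothetical, and it relies on Python's arbitrary-precision integers, since the product of 1..N overflows fixed-width types very quickly.

```python
from math import isqrt, prod


def find_two_missing(values, n):
    """values holds 1..n with exactly two distinct numbers x < y removed."""
    s = n * (n + 1) // 2 - sum(values)            # x + y
    p = prod(range(1, n + 1)) // prod(values)     # x * y
    # x and y are the roots of t^2 - s*t + p = 0.
    root = isqrt(s * s - 4 * p)                   # equals y - x
    return (s - root) // 2, (s + root) // 2


print(find_two_missing([1, 2, 4, 6], 6))
```

For [1, 2, 4, 6] with n = 6 this gives x + y = 8 and x * y = 15, so the missing numbers are 3 and 5.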
In the aux version, you first set all the values to -1 and as you iterate check if you have already inserted the value to the aux array. If not (value must be -1 then), insert. If you have a duplicate, here is your solution!
In the one without aux, you retrieve an element from the list and check if the rest of the list contains that value. If it contains, here you've found it.
private static int findDuplicated(int[] array) {
if (array == null || array.length < 2) {
System.out.println("invalid");
return -1;
}
int[] checker = new int[array.length];
Arrays.fill(checker, -1);
for (int i = 0; i < array.length; i++) {
int value = array[i];
int checked = checker[value];
if (checked == -1) {
checker[value] = value;
} else {
return value;
}
}
return -1;
}
private static int findDuplicatedWithoutAux(int[] array) {
if (array == null || array.length < 2) {
System.out.println("invalid");
return -1;
}
for (int i = 0; i < array.length; i++) {
int value = array[i];
for (int j = i + 1; j < array.length; j++) {
int toCompare = array[j];
if (value == toCompare) {
return array[i];
}
}
}
return -1;
}