Algorithm for finding a combination of integers greater than a specified value - arrays

I've been trying to develop an algorithm that would take an input array and return an array such that the integers contained within are the combination of integers with the smallest sum greater than a specified value (limited to a combination of size k).
For instance, if I have the array [1,4,5,10,17,34] and I specified a minimum sum of 31, the function would return [1,4,10,17]. Or, if I wanted it limited to a max array size of 2, it would just return [34].
Is there an efficient way to do this? Any help would be appreciated!

Something like this? It returns the value, but could easily be adapted to return the sequence.
Algorithm: assuming sorted input, test the k-length combinations for the smallest sum greater than min, stop after the first array element greater than min.
JavaScript:
var roses = [1,4,5,10,17,34]
function f(index,current,k,best,min,K)
{
if (roses.length == index)
return best
for (var i = index; i < roses.length; i++)
{
var candidate = current + roses[i]
if (candidate == min + 1)
return candidate
if (candidate > min)
best = best < 0 ? candidate : Math.min(best,candidate)
if (roses[i] > min)
break
if (k + 1 < K)
{
var nextCandidate = f(i + 1,candidate,k + 1,best,min,K)
if (nextCandidate > min)
best = best < 0 ? nextCandidate : Math.min(best,nextCandidate)
if (best == min + 1)
return best
}
}
return best
}
Output:
console.log(f(0,0,0,-1,31,3))
32
console.log(f(0,0,0,-1,31,2))
34

This is more of a hybrid solution, with Dynamic Programming and Back Tracking. We can use Back Tracking alone to solve this problem, but then we have to do exhaustive searching (2^N) to find the solution. The DP part optimizes the search space in Back Tracking.
import sys
from collections import OrderedDict
MinimumSum = 31
MaxArraySize = 4
InputData = sorted([1,4,5,10,17,34])
# Input part is over
Target = MinimumSum + 1
Previous, Current = OrderedDict({0:0}), OrderedDict({0:0})
for Number in InputData:
for CurrentNumber, Count in Previous.items():
if Number + CurrentNumber in Current:
Current[Number + CurrentNumber] = min(Current[Number + CurrentNumber], Count + 1)
else:
Current[Number + CurrentNumber] = Count + 1
Previous = Current.copy()
FoundSolution = False
for Number, Count in Previous.items():
if (Number >= Target and Count < MaxArraySize):
MaxArraySize = Count
Target = Number
FoundSolution = True
break
if not FoundSolution:
print "Not possible"
sys.exit(0)
else:
print Target, MaxArraySize
FoundSolution = False
Solution = []
def Backtrack(CurrentIndex, Sum, MaxArraySizeUsed):
global FoundSolution
if (MaxArraySizeUsed <= MaxArraySize and Sum == Target):
FoundSolution = True
return
if (CurrentIndex == len(InputData) or MaxArraySizeUsed > MaxArraySize or Sum > Target):
return
for i in range(CurrentIndex, len(InputData)):
Backtrack(i + 1, Sum, MaxArraySizeUsed)
if (FoundSolution): return
Backtrack(i + 1, Sum + InputData[i], MaxArraySizeUsed + 1)
if (FoundSolution):
Solution.append(InputData[i])
return
Backtrack(0, 0, 0)
print sorted(Solution)
Note: As per the examples given by you in the question, Minimum sum and Maximum Array Size are strictly greater and lesser than the values specified, respectively.
For this input
MinimumSum = 31
MaxArraySize = 4
InputData = sorted([1,4,5,10,17,34])
Output is
[5, 10, 17]
where as, for this input
MinimumSum = 31
MaxArraySize = 3
InputData = sorted([1,4,5,10,17,34])
Output is
[34]
Explanation
Target = MinimumSum + 1
Previous, Current = OrderedDict({0:0}), OrderedDict({0:0})
for Number in InputData:
for CurrentNumber, Count in Previous.items():
if Number + CurrentNumber in Current:
Current[Number + CurrentNumber] = min(Current[Number + CurrentNumber], Count + 1)
else:
Current[Number + CurrentNumber] = Count + 1
Previous = Current.copy()
This part of the program finds the minimum number of numbers from the input data, required to make the sum of numbers from 1 to the maximum possible number (which is the sum of all the input data). Its a dynamic programming solution, for knapsack problem. You can read about that in the internet.
FoundSolution = False
for Number, Count in Previous.items():
if (Number >= Target and Count < MaxArraySize):
MaxArraySize = Count
Target = Number
FoundSolution = True
break
if not FoundSolution:
print "Not possible"
sys.exit(0)
else:
print Target, MaxArraySize
This part of the program, finds the Target value which matches the MaxArraySize criteria.
def Backtrack(CurrentIndex, Sum, MaxArraySizeUsed):
global FoundSolution
if (MaxArraySizeUsed <= MaxArraySize and Sum == Target):
FoundSolution = True
return
if (CurrentIndex == len(InputData) or MaxArraySizeUsed > MaxArraySize or Sum > Target):
return
for i in range(CurrentIndex, len(InputData)):
Backtrack(i + 1, Sum, MaxArraySizeUsed)
if (FoundSolution): return
Backtrack(i + 1, Sum + InputData[i], MaxArraySizeUsed + 1)
if (FoundSolution):
Solution.append(InputData[i])
return
Backtrack(0, 0, 0)
Now that we know that the solution exists, we want to recreate the solution. We use backtracking technique here. You can easily find lot of good tutorials about this also in the internet.

Related

Find the kth lowest sum of unique pair in an array

Given an array of numbers, each number represents the difficulty of a problem. People standing in a line should choose any two problems to solve. The two chosen problems should be different and the pair of problems should not be picked by anyone previously. Since they know the difficulties they'll choose the pair whose sum of difficulties is minimum.
Find the minimum sum of difficulties for the person whose standing in kth position in the line. i.e kth minimum sum of unique pair from an array.
Approach 1: Brute force approach(O(n2)) to calculate all possible unique sum and store that in an array and sorted the unique sum array to get the kth element.
Approach 2: Sort the difficulties array and choose the minimal elements(for first 4 elements we can have 6 unique pairs. so if k is less than or equal to 6 we can use first 4 elements in the sorted array to find the minimum sum) and did the approach 1 with the minimal array.
These 2 approaches did not solve the timeout cases. Need a solution with improved time efficiency.
Note: Different problem can have same difficulty level(i.e. array can contain duplicate numbers) also and not in sorted order by default.
difficulties = [1,4,3,2,4]
Person comes first chooses: 1+2 = 3
2nd person: 1+3 = 4
3rd person: 1+4 (or) 1+4(since difficulty of two problems are 4) (or) 2+3 = 5
4th person: 2+3 (or) 1+4(based on the previous selection) = 5
Final answer needed is only the minimum sum not the actual elements.
Assume the constraints to be:
2 <= N <= 105
1 <= k <= N*(N-1)/2
1 <= difficulties[i] <= 109
where,
N is the length of the array
k is the position in which the person has to choose the problems
Assuming k <= n*(n-1)/2. If not, then no answer possible.
We can use binary search to solve the problem. We binary search on the possible sum of pairs.
Here, low = minimum sum possible i.e. low = difficulties[0] + difficulties[1], and high = maximum sum possible i.e. high = difficulties[n-1] + difficulties[n-2].
So, mid = low + (high - low)/2
Now, in 1 iteration of binary search we would count the pairs of indices (i, j), i < j such that difficulties[i] + difficulties[j] <= mid. If the count is less than k, low = mid + 1 else if count >= k, high = mid. Now, this one iteration can be done in O(NlogN).
You can do this till (high - low) > 1. So, each time you reduce your search space by half. So, total time complexity would be O(N*logN*logMaxsum) which for N <= 1e6 and difficulties[i] <= 1e18 would run in less than 1s.
Now high can be equal to low or high can be equal to low +1. The answer can be equal to low or high. Now, you just need to solve the problem whether low is a possible sum(can be solved easily in O(N) using Hashing) and no. of pairs of indices (i, j), i < j such that difficulties[i] + difficulties[j] <= low. If both conditions satify then this is your answer. If not then high is the answer.
Running an example testcase:
Lets' consider the initial array, difficulties = [1, 4, 3, 2, 4] and k = 6.
You first sort the array costing us O(NlogN). After sorting difficulties = [1, 2, 3, 4, 4]
All the pairs n*(n-1)/2 = 10 would be:
(1 + 2) => 3
(1 + 3) => 4
(1 + 4) => 5
(1 + 4) => 5
(2 + 3) => 5
(2 + 4) => 6
(2 + 4) => 6
(3 + 4) => 7
(3 + 4) => 7
(4 + 4) => 8
This is more of a pseudocode to understand the running of the logic.
sort(difficulties)
low = difficulties[0] + difficulties[1] // Minimum possible sum
high = difficulties[n-1] + difficulties[n-2] // Maximum possible sum
while(high - low > 1){
mid = low + (high - low)/2
count = all pairs (i, j) and i < j such that difficulties[i] + difficulties[j] <= mid.
if(count < k){
low = mid +1
}else{
high = mid
}
}
Iteration 1:
low = 3
high = 8
mid = 5
count = 5 [(1 + 2), (1 + 3), (1 + 4), (1 + 4), (2 + 3)]
count < k, so low = mid + 1 = 6
----------
Iteration 2:
low = 6
high = 8
mid = 7
count = 9 [(1 + 2), (1 + 3), (1 + 4), (1 + 4), (2 + 3), (2 + 4), (2 + 4), (3 + 4), (3 + 4)]
count >= k, so high= mid = 7
Now, while loop stops since high(7) - low(6) = 1.
Now, you need to check if sum 6 is possible and count of all (i, j) >= k. if it is then low is the answer and in this case it is true. So, answer = 6 for k = 6.
To implement the count thing, you can again do a binary search. Choose first index as i then you just need to find the upper bound of mid - difficulties[i] in the array [i+1, n-1]. Then increment i by 1 and repeat the same. So, you go over every index 0 <= i <= n-1 and find its upper bound in the array search space of [i+1, n-1] and this each iteration takes O(NlogN).
To see why is the last step of checking if low or high is a possible sum or not, try running the algorithm for the array difficulties = [10, 40, 30, 20, 40].
UPDATE:
Below is the complete working code with the time complexity of O(N*logN*logMaxsum) including comments for clear understanding of the logic.
#include<bits/stdc++.h>
#define ll long long int
using namespace std;
void solve();
int main(){
solve();
return 0;
}
map<int, int> m;
vector<ll> difficulties;
ll countFunction(ll sum){
/*
Function to count all the pairs of indices (i, j) such that
i < j and (difficulties[i] + difficulties[j]) <= sum
*/
ll count = 0;
int n = (int)difficulties.size();
for(int i=0;i<n-1;i++){
/*
Here the outer for loop means that if I choose difficulties[i]
as the first element of the pair, then the remaining sum is
m - difficulties[i], so we just need to find the upper_bound of this value
to find the count of all pairs with sum <= m.
upper_bound is an in-built function in C++ STL.
*/
int x= upper_bound(difficulties.begin(), difficulties.end(), sum-difficulties[i]) - (difficulties.begin() + i + 1);
if(x<=0){
/*
We break here because the condition of i < j is violated
and it will be violated for remaining values of i as well.
*/
break;
}
//cout<<"x = "<<x<<endl;
count += x;
}
return count;
}
bool isPossible(ll sum){
/*
Hashing based solution to check if atleast 1 pair with
a particular exists in the difficultiesay.
*/
int n = (int) difficulties.size();
for(int i=0;i<n;i++){
/*
Choosing the ith element as first element of pair
and checking if there exists an element with value = sum - difficulties[i]
*/
if(difficulties[i] == (sum - difficulties[i])){
// If the elements are equal then the frequency must be > 1
if(m[difficulties[i]] > 1){
return true;
}
}else{
if(m[sum - difficulties[i]] > 0){
return true;
}
}
}
return false;
}
void solve(){
ll i, j, n, k;
cin>>n>>k;
difficulties.resize(n);
m.clear(); // to run multiple test-cases
for(i=0;i<n;i++){
cin>>difficulties[i];
m[difficulties[i]]++;
}
sort(difficulties.begin(), difficulties.end());
// Using binary search on the possible values of sum.
ll low = difficulties[0] + difficulties[1]; // Lowest possible sum after sorting
ll high = difficulties[n-1] + difficulties[n-2]; // Highest possible sum after sorting
while((high-low)>1){
ll mid = low + (high - low)/2;
ll count = countFunction(mid);
//cout<<"Low = "<<low<<" high = "<<high<<" mid = "<<mid<<" count = "<<count<<endl;
if (k > count){
low = mid + 1;
}else{
high = mid;
}
}
/*
Now the answer can be low or high and we need to check
if low or high is a possible sum and does it satisfy the constraints of k.
For low to be the answer, we need to count the number of pairs with sum <=low.
If this count is >=k, then low is the answer.
But we also need to check whether low is a feasible sum.
*/
if(isPossible(low) && countFunction(low)>=k){
cout<<low<<endl;
}else{
cout<<high<<endl;
}
}

Minimum number of operations to make pair sums of array equal

You are given a list of integers nums of even length. Consider an operation where you pick any number in nums and update it with a value between [1, max(nums)]. Return the number of operations required such that for every i, nums[i] + nums[n - 1 - i] equals to the same number. The problem can be solved greedily.
Note: n is the size of the array and max(nums) is the maximum element in nums.
For example: nums = [1,5,4,5,9,3] the expected operations are 2.
Explanation: The maxnums is 9, so I can change any element of nums to any number between [1, 9] which costs one operation.
Choose 1 at index 0 and change it to 6
Choose 9 at index 4 and change it to 4.
Now this makes the nums[0] + nums[5] = nums[1] + nums[4] = nums[2] + nums[3] = 9. We had changed 2 numbers and it cost us 2 operations which is the minimum for this input.
The approach that I've used is to find the median of the sums and use that to find the number of operations greedily.
Let us find the all the sums of the array based on the given condition.
Sums can be calculated by nums[i] + nums[n-1-i].
Let i = 0, nums[0] + nums[6-1-0] = 4.
i = 1, nums[1] + nums[6-1-1] = 14.
i = 2, nums[2] + nums[6-1-2] = 9.
Store these sums in an array and sort it.
sums = [4,9,14] after sorting. Now find the median from sums which is 9 as it is the middle element.
Now I use this median to equalize the sums and we can find the number of operations. I've also added the code that I use to calculate the number of operations.
int operations = 0;
for(int i=0; i<nums.size()/2; i++) {
if(nums[i] + nums[nums.size()-1-i] == mid)
continue;
if(nums[i] + nums[nums.size()-1-i] > mid) {
if(nums[i] + 1 <= mid || 1 + nums[nums.size()-1-i] <= mid) {
operations++;
} else {
operations += 2;
}
} else if (maxnums + nums[nums.size()-1-i] >= mid || nums[i] + maxnums >= mid) {
operations++;
} else {
operations += 2;
}
}
The total operations for this example is 2 which is correct.
The problem here is that, for some cases choosing the median gives the wrong result. For example, the nums = [10, 7, 2, 9, 4, 1, 7, 3, 10, 8] expects 5 operations but my code gives 6 if the median (16) was chosen.
Is choosing the median not the most optimal approach? Can anyone help provide a better approach?
I think the following should work:
iterate pairs of numbers
for each pair, calculate the sum of that pair, as well as the min and max sum that can be achieved by changing just one of the values
update a dictionary/map with -1 when starting a new "region" requiring one fewer change, and +1 when that region is over
iterate the boundaries in that dictionary and update the total changes needed to find the sum that requires the fewest updates
Example code in Python, giving 9 as the best sum for your example, requiring 5 changes.
from collections import defaultdict
nums = [10, 7, 2, 9, 4, 1, 7, 3, 10, 8]
m = max(nums)
pairs = [(nums[i], nums[-1-i]) for i in range(len(nums)//2)]
print(pairs)
score = defaultdict(int)
for a, b in map(sorted, pairs):
low = a + 1
high = m + b
score[low] -= 1
score[a+b] -= 1
score[a+b+1] += 1
score[high+1] += 1
print(sorted(score.items()))
cur = best = len(nums)
num = None
for i in sorted(score):
cur += score[i]
print(i, cur)
if cur < best:
best, num = cur, i
print(best, num)
The total complexity of this should be O(nlogn), needing O(n) to create the dictionary, O(nlogn) for sorting, and O(n) for iterating the sorted values in that dictionary. (Do not use an array or the complexity could be much higher if max(nums) >> len(nums))
(UPDATED receiving additional information)
The optimal sum must be one of the following:
a sum of a pair -> because you can keep both numbers of that pair
the min value of a pair + 1 -> because it is the smallest possible sum you only need to change 1 of the numbers for that pair
the max value of a pair + the max overall value -> because it is the largest possible sum you only need to change 1 of the numbers for that pair
Hence, there are order N possible sums.
The total number of operations for this optimal sum can be calculated in various ways.
The O(N²) is quite trivial. And you can implement it quite easily if you want to confirm other solutions work.
Making it O(N log N)
getting all possible optimal sums O(N)
for each possible sum you can calculate occ the number of pairs having that exact sum and thus don't require any manipulation. O(N)
For all other pairs you just need to know if it requires 1 or 2 operations to get to that sum. Which is 2 when it is either impossible if the smallest of the pair is too big to reach sum with the smallest possible number or when the largest of the pair is too small to reach the sum with the largest possible number. Many data structures could be used for that (BIT, Tree, ..). I just used a sorted list and applied binary search (not exhaustively tested though). O(N log N)
Example solution in java:
int[] nums = new int[] {10, 7, 2, 9, 4, 1, 7, 3, 10, 8};
// preprocess pairs: O(N)
int min = 1
, max = nums[0];
List<Integer> minList = new ArrayList<>();
List<Integer> maxList = new ArrayList<>();
Map<Integer, Integer> occ = new HashMap<>();
for (int i=0;i<nums.length/2;i++) {
int curMin = Math.min(nums[i], nums[nums.length-1-i]);
int curMax = Math.max(nums[i], nums[nums.length-1-i]);
min = Math.min(min, curMin);
max = Math.max(max, curMax);
minList.add(curMin);
maxList.add(curMax);
// create all pair sums
int pairSum = nums[i] + nums[nums.length-1-i];
int currentOccurences = occ.getOrDefault(pairSum, 0);
occ.put(pairSum, currentOccurences + 1);
}
// sorting 0(N log N)
Collections.sort(minList);
Collections.sort(maxList);
// border cases
for (int a : minList) {
occ.putIfAbsent(a + max, 0);
}
for (int a : maxList) {
occ.putIfAbsent(a + min, 0);
}
// loop over all condidates O(N log N)
int best = (nums.length-2);
int med = max + min;
for (Map.Entry<Integer, Integer> entry : occ.entrySet()) {
int sum = entry.getKey();
int count = entry.getValue();
int requiredChanges = (nums.length / 2) - count;
if (sum > med) {
// border case where max of pair is too small to be changed to pair of sum
requiredChanges += countSmaller(maxList, sum - max);
} else if (sum < med) {
// border case where having a min of pair is too big to be changed to pair of sum
requiredChanges += countGreater(minList, sum - min);
}
System.out.println(sum + " -> " + requiredChanges);
best = Math.min(best, requiredChanges);
}
System.out.println("Result: " + best);
}
// O(log N)
private static int countGreater(List<Integer> list, int key) {
int low=0, high=list.size();
while(low < high) {
int mid = (low + high) / 2;
if (list.get(mid) <= key) {
low = mid + 1;
} else {
high = mid;
}
}
return list.size() - low;
}
// O(log N)
private static int countSmaller(List<Integer> list, int key) {
int low=0, high=list.size();
while(low < high) {
int mid = (low + high) / 2;
if (list.get(mid) < key) {
low = mid + 1;
} else {
high = mid;
}
}
return low;
}
Just to offer some theory -- we can easily show that the upper bound for needed changes is n / 2, where n is the number of elements. This is because each pair can be made in one change to anything between 1 + C and max(nums) + C, where C is any of the two elements in a pair. For the smallest C, we can bind max(nums) + 1 at the highest; and for the largest C, we can bind 1 + max(nums) at the lowest.
Since those two bounds at the worst cases are equal, we are guaranteed there is some solution with at most N / 2 changes that leaves at least one C (array element) unchanged.
From that we conclude that an optimal solution either (1) has at least one pair where neither element is changed and the rest require only one change per pair, or (2) our optimal solution has n / 2 changes as discussed above.
We can therefore proceed to test each existing pair's single or zero change possibilities as candidates. We can iterate over a sorted list of two to three possibilities per pair, labeled with each cost and index. (Other authors on this page have offered similar ways and code.)

Calculate the maximal even cost subarray

I am new to Algorithms and Competitive Programming. I am learning about Dynamic programming and I have a problem as below:
Given an array with n numbers. Define a sub-array is a[i, j] = {a[i], a[i + 1], ..., a[j]}, in other words, elements must be contiguous.
The problem is the find the maximum weight of a sub-array such that
that weight is an even number.
The input is 2 <= n <= 1000000; -100 <= a[i] <= 100
Sample test:
5
-2 1 -4 4 9
Output: 10
For this problem, I can do brute force but with a large value of n, I can not do it with the time limit is 1 second. Therefore, I want to change it to Dynamic programming.
I have an idea but I do not know if it works. I think I can divide this problem into two sub-problems. For each element/number, I consider if it is odd/even and then find the largest sum with its corresponding property (odd + odd or even + even to get a even sum). However, that is just what I think and I really need your help.
Here is C++ algorithm with O(n) time complexity:
const int Inf = 1e9;
int main() {
int n = 5;
vector<int> inputArray = {-2, 1, -4, 4, 9};
int minEvenPrefixSum = 0, minOddPrefixSum = Inf;
bool isOddPrefixSumFound = false;
int prefixSum = 0, answer = -Inf;
for(int i = 0; i < n; ++i) {
prefixSum += inputArray[i];
if(abs(prefixSum) % 2 == 0) {
answer = max(answer, prefixSum - minEvenPrefixSum);
minEvenPrefixSum = min(minEvenPrefixSum, prefixSum);
} else {
if(isOddPrefixSumFound) {
answer = max(answer, prefixSum - minOddPrefixSum);
}
isOddPrefixSumFound = true;
minOddPrefixSum = min(minOddPrefixSum, prefixSum);
}
}
if(answer == -Inf) {
cout << "There is no subarray with even sum";
} else {
cout << answer;
}
}
Explanation:
As #nico-schertler mentioned in commentary this task is very similar with more basic problem of the maximum-sum contiguous sub array. How to solve basic task with O(n) time complexity you can read here.
Now let's store not just one value of the minimum prefix sum, but two. One is for minimum even prefix sum, and the other is for minimum odd prefix sum. As a result, when we process the next number, we look at what the value of the prefix sum becomes. If it is even, we try to update the answer using the minimum even value of the prefix sum, in the other case using the minimum odd value of the prefix sum.

Given an array of numbers, find a number to start with such that the sum is never negative

Given an array of integers, find the smallest number X to start with, such that adding elements of array to X, the sum is always greater than 0
If given array is {-2, 3, 1, -5}
For example, in the above array, X should be 4
Explanation:
If, we start with 4, then adding first number -2, array sum becomes 4 + (-2) = 2 (which is >0)
Now adding next element 3 to current sum which is 2, 2+ 3 = 5 (which is >0)
Adding next element 1 to new sum 5 gives, 5 + 1 = 6 (which is >0)
Adding last element -5 to new sum 6 gives 6 + (-5) = 1, which is again greater than zero.
Given an array of integers, How can I find the smallest number X?
Find minimum of cumulative sum for given array.
Needed result is 1 - MinC
A = [-2, 3, 1, -5]
CSum = 0
MinC = 10000000000
for x in A:
CSum += x
MinC = min(MinC, CSum)
Addend = 1 - MinC
print(Addend)
>>> 4
First we should check if element at ith position is greater than zero or not. If element is greater than zero then no problem and we can add it directly to our sum, ans remains unchanged. But if not greater than zero then we first check whether sum is greater than element or not. If sum is greater than abs(arr[I]) then then we decrease our sum by abs(arr[I]), ans remains unchanged but if abs(arr[I]) is greater than sum then we add abs(arr[i]-sum+1) to ans and initialise sum to 1.
Below is code for same:
#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
int main()
{
// considering x>=0(x can't take values less than 0)
ll arr[] = { -2, 3, 1, -5};
ll n = sizeof(arr) / sizeof(ll);
ll ans = 0; // My answer
ll sum = 0; // current sum of elements
for (ll i = 0; i < n; i++)
{
if (arr[i] <= 0)
{
if ((sum + arr[i]) <= 0)
{
ans += (abs(sum + arr[i]) + 1); //add 1 to make sum=1
sum = 1;
}
else
sum += arr[i]; // added because arr[i]<=0,(sum-=abs(arr[i]))
}
else
sum += arr[i];
}
cout << ans;
return 0;
}
// ans=4 for this case
you should find miniminum subarray sum first. lets say it comes equal to k.
your answer should be k+1. you can find minimum sum here :
https://www.geeksforgeeks.org/smallest-sum-contiguous-subarray/
This problem can be solved with the simple logic by precomputing the sum of all the elements present in the array.
Find the Sum of all the elements present in the array.
If sum equals zero return 1 as it will give smallest number on adding which we will get the total sum greater than zero.
If sum greater than zero return a negative number just less than that sum ( handle the special case of sum == 1 seperately)
If sum is negative then return the positive number just one greater than the sum in magnitude.
Step 1 : Find the Sum of all the Elements in the array
int sum = 0;
for(int i = 0; i < arr.size(); i++)
{
sum += arr[i];
}
Step 2 : Check The Sum obtained and Apply the following logic
if(sum == 0)
return 1;
else if( sum < 0)
return abs(sum)+1;
else
{
if( sum == 1)
return 0;
else
return -(sum-1);
}

Space-efficient algorithm for finding the largest balanced subarray?

given an array of 0s and 1s, find maximum subarray such that number of zeros and 1s are equal.
This needs to be done in O(n) time and O(1) space.
I have an algo which does it in O(n) time and O(n) space. It uses a prefix sum array and exploits the fact that if the number of 0s and 1s are same then
sumOfSubarray = lengthOfSubarray/2
#include<iostream>
#define M 15
using namespace std;
void getSum(int arr[],int prefixsum[],int size) {
int i;
prefixsum[0]=arr[0]=0;
prefixsum[1]=arr[1];
for (i=2;i<=size;i++) {
prefixsum[i]=prefixsum[i-1]+arr[i];
}
}
void find(int a[],int &start,int &end) {
while(start < end) {
int mid = (start +end )/2;
if((end-start+1) == 2 * (a[end] - a[start-1]))
break;
if((end-start+1) > 2 * (a[end] - a[start-1])) {
if(a[start]==0 && a[end]==1)
start++; else
end--;
} else {
if(a[start]==1 && a[end]==0)
start++; else
end--;
}
}
}
int main() {
int size,arr[M],ps[M],start=1,end,width;
;
cin>>size;
arr[0]=0;
end=size;
for (int i=1;i<=size;i++)
cin>>arr[i];
getSum(arr,ps,size);
find(ps,start,end);
if(start!=end)
cout<<(start-1)<<" "<<(end-1)<<endl; else cout<<"No soln\n";
return 0;
}
Now my algorithm is O(n) time and O(Dn) space where Dn is the total imblance in the list.
This solution doesn't modify the list.
let D be the difference of 1s and 0s found in the list.
First, let's step linearily through the list and calculate D, just to see how it works:
I'm gonna use this list as an example : l=1100111100001110
Element D
null 0
1 1
1 2 <-
0 1
0 0
1 1
1 2
1 3
1 4
0 3
0 2
0 1
0 0
1 1
1 2
1 3
0 2 <-
Finding the longest balanced subarray is equivalent to finding 2 equal elements in D that are the more far appart. (in this example the 2 2s marked with arrows.)
The longest balanced subarray is between first occurence of element +1 and last occurence of element. (first arrow +1 and last arrow : 00111100001110)
Remark:
The longest subarray will always be between 2 elements of D that are
between [0,Dn] where Dn is the last element of D. (Dn = 2 in the
previous example) Dn is the total imbalance between 1s and 0s in the
list. (or [Dn,0] if Dn is negative)
In this example it means that I don't need to "look" at 3s or 4s
Proof:
Let Dn > 0 .
If there is a subarray delimited by P (P > Dn). Since 0 < Dn < P,
before reaching the first element of D which is equal to P we reach one
element equal to Dn. Thus, since the last element of the list is equal to Dn, there is a longest subarray delimited by Dns than the one delimited by Ps.And therefore we don't need to look at Ps
P cannot be less than 0 for the same reasons
the proof is the same for Dn <0
Now let's work on D, D isn't random, the difference between 2 consecutive element is always 1 or -1. Ans there is an easy bijection between D and the initial list. Therefore I have 2 solutions for this problem:
the first one is to keep track of first and last appearance of each
element in D that are between 0 and Dn (cf remark).
second is to transform the list into D, and then work on D.
FIRST SOLUTION
For the time being I cannot find a better approach than the first one:
First calculate Dn (in O(n)) . Dn=2
Second instead of creating D, create a dictionnary where the keys are the value of D (between [0 and Dn]) and the value of each keys is a couple (a,b) where a is the first occurence of the key and b the last.
Element D DICTIONNARY
null 0 {0:(0,0)}
1 1 {0:(0,0) 1:(1,1)}
1 2 {0:(0,0) 1:(1,1) 2:(2,2)}
0 1 {0:(0,0) 1:(1,3) 2:(2,2)}
0 0 {0:(0,4) 1:(1,3) 2:(2,2)}
1 1 {0:(0,4) 1:(1,5) 2:(2,2)}
1 2 {0:(0,4) 1:(1,5) 2:(2,6)}
1 3 { 0:(0,4) 1:(1,5) 2:(2,6)}
1 4 {0:(0,4) 1:(1,5) 2:(2,6)}
0 3{0:(0,4) 1:(1,5) 2:(2,6) }
0 2 {0:(0,4) 1:(1,5) 2:(2,9) }
0 1 {0:(0,4) 1:(1,10) 2:(2,9) }
0 0 {0:(0,11) 1:(1,10) 2:(2,9) }
1 1 {0:(0,11) 1:(1,12) 2:(2,9) }
1 2 {0:(0,11) 1:(1,12) 2:(2,13)}
1 3 {0:(0,11) 1:(1,12) 2:(2,13)}
0 2 {0:(0,11) 1:(1,12) 2:(2,15)}
and you chose the element with the largest difference : 2:(2,15) and is l[3:15]=00111100001110 (with l=1100111100001110).
Time complexity :
2 passes, the first one to caclulate Dn, the second one to build the
dictionnary.
find the max in the dictionnary.
Total is O(n)
Space complexity:
the current element in D : O(1) the dictionnary O(Dn)
I don't take 3 and 4 in the dictionnary because of the remark
The complexity is O(n) time and O(Dn) space (in average case Dn <<
n).
I guess there is may be a better way than a dictionnary for this approach.
Any suggestion is welcome.
Hope it helps
SECOND SOLUTION (JUST AN IDEA NOT THE REAL SOLUTION)
The second way to proceed would be to transform your list into D. (since it's easy to go back from D to the list it's ok). (O(n) time and O(1) space, since I transform the list in place, even though it might not be a "valid" O(1) )
Then from D you need to find the 2 equal element that are the more far appart.
it looks like finding the longest cycle in a linked list, A modification of Richard Brent algorithm might return the longest cycle but I don't know how to do it, and it would take O(n) time and O(1) space.
Once you find the longest cycle, go back to the first list and print it.
This algorithm would take O(n) time and O(1) space complexity.
Different approach but still O(n) time and memory. Start with Neil's suggestion, treat 0 as -1.
Notation: A[0, …, N-1] - your array of size N, f(0)=0, f(x)=A[x-1]+f(x-1) - a function
If you'd plot f, you'll see, that what you look for are points for which f(m)=f(n), m=n-2k where k-positive natural. More precisely, only for x such that A[x]!=A[x+1] (and the last element in an array) you must check whether f(x) already occurred. Unfortunately, now I see no improvement over having array B[-N+1…N-1] where such information would be stored.
To complete my thought: B[x]=-1 initially, B[x]=p when p = min k: f(k)=x . And the algorithm is (double-check it, as I'm very tired):
fx = 0
B = new array[-N+1, …, N-1]
maxlen = 0
B[0]=0
for i=1…N-1 :
fx = fx + A[i-1]
if B[fx]==-1 :
B[fx]=i
else if ((i==N-1) or (A[i-1]!=A[i])) and (maxlen < i-B[fx]):
We found that A[B[fx], …, i] is best than what we found so far
maxlen = i-B[fx]
Edit: Two bed-thoughts (= figured out while laying in bed :P ):
1) You could binary search the result by the length of subarray, which would give O(n log n) time and O(1) memory algorithm. Let's use function g(x)=x - x mod 2 (because subarrays which sum to 0 are always of even length). Start by checking, if the whole array sums to 0. If yes -- we're done, otherwise continue. We now assume 0 as starting point (we know there's subarray of such length and "summing-to-zero property") and g(N-1) as ending point (we know there's no such subarray). Let's do
a = 0
b = g(N-1)
while a<b :
c = g((a+b)/2)
check if there is such subarray in O(n) time
if yes:
a = c
if no:
b = c
return the result: a (length of maximum subarray)
Checking for subarray with "summing-to-zero property" of some given length L is simple:
a = 0
b = L
fa = fb = 0
for i=0…L-1:
fb = fb + A[i]
while (fa != fb) and (b<N) :
fa = fa + A[a]
fb = fb + A[b]
a = a + 1
b = b + 1
if b==N:
not found
found, starts at a and stops at b
2) …can you modify input array? If yes and if O(1) memory means exactly, that you use no additional space (except for constant number of elements), then just store your prefix table values in your input array. No more space used (except for some variables) :D
And again, double check my algorithms as I'm veeery tired and could've done off-by-one errors.
Like Neil, I find it useful to consider the alphabet {±1} instead of {0, 1}. Assume without loss of generality that there are at least as many +1s as -1s. The following algorithm, which uses O(sqrt(n log n)) bits and runs in time O(n), is due to "A.F."
Note: this solution does not cheat by assuming the input is modifiable and/or has wasted bits. As of this edit, this solution is the only one posted that is both O(n) time and o(n) space.
A easier version, which uses O(n) bits, streams the array of prefix sums and marks the first occurrence of each value. It then scans backward, considering for each height between 0 and sum(arr) the maximal subarray at that height. Some thought reveals that the optimum is among these (remember the assumption). In Python:
sum = 0
min_so_far = 0
max_so_far = 0
is_first = [True] * (1 + len(arr))
for i, x in enumerate(arr):
sum += x
if sum < min_so_far:
min_so_far = sum
elif sum > max_so_far:
max_so_far = sum
else:
is_first[1 + i] = False
sum_i = 0
i = 0
while sum_i != sum:
sum_i += arr[i]
i += 1
sum_j = sum
j = len(arr)
longest = j - i
for h in xrange(sum - 1, -1, -1):
while sum_i != h or not is_first[i]:
i -= 1
sum_i -= arr[i]
while sum_j != h:
j -= 1
sum_j -= arr[j]
longest = max(longest, j - i)
The trick to get the space down comes from noticing that we're scanning is_first sequentially, albeit in reverse order relative to its construction. Since the loop variables fit in O(log n) bits, we'll compute, instead of is_first, a checkpoint of the loop variables after each O(√(n log n)) steps. This is O(n/√(n log n)) = O(√(n/log n)) checkpoints, for a total of O(√(n log n)) bits. By restarting the loop from a checkpoint, we compute on demand each O(√(n log n))-bit section of is_first.
(P.S.: it may or may not be my fault that the problem statement asks for O(1) space. I sincerely apologize if it was I who pulled a Fermat and suggested that I had a solution to a problem much harder than I thought it was.)
If indeed your algorithm is valid in all cases (see my comment to your question noting some corrections to it), notice that the prefix array is the only obstruction to your constant memory goal.
Examining the find function reveals that this array can be replaced with two integers, thereby eliminating the dependence on the length of the input and solving your problem. Consider the following:
You only depend on two values in the prefix array in the find function. These are a[start - 1] and a[end]. Yes, start and end change, but does this merit the array?
Look at the progression of your loop. At the end, start is incremented or end is decremented only by one.
Considering the previous statement, if you were to replace the value of a[start - 1] by an integer, how would you update its value? Put another way, for each transition in the loop that changes the value of start, what could you do to update the integer accordingly to reflect the new value of a[start - 1]?
Can this process can be repeated with a[end]?
If, in fact, the values of a[start - 1] and a[end] can be reflected with two integers, doesn't the whole prefix array no longer serve a purpose? Can't it therefore be removed?
With no need for the prefix array and all storage dependencies on the length of the input removed, your algorithm will use a constant amount of memory to achieve its goal, thereby making it O(n) time and O(1) space.
I would prefer you solve this yourself based on the insights above, as this is homework. Nevertheless, I have included a solution below for reference:
#include <iostream>
using namespace std;
void find( int *data, int &start, int &end )
{
// reflects the prefix sum until start - 1
int sumStart = 0;
// reflects the prefix sum until end
int sumEnd = 0;
for( int i = start; i <= end; i++ )
sumEnd += data[i];
while( start < end )
{
int length = end - start + 1;
int sum = 2 * ( sumEnd - sumStart );
if( sum == length )
break;
else if( sum < length )
{
// sum needs to increase; get rid of the lower endpoint
if( data[ start ] == 0 && data[ end ] == 1 )
{
// sumStart must be updated to reflect the new prefix sum
sumStart += data[ start ];
start++;
}
else
{
// sumEnd must be updated to reflect the new prefix sum
sumEnd -= data[ end ];
end--;
}
}
else
{
// sum needs to decrease; get rid of the higher endpoint
if( data[ start ] == 1 && data[ end ] == 0 )
{
// sumStart must be updated to reflect the new prefix sum
sumStart += data[ start ];
start++;
}
else
{
// sumEnd must be updated to reflect the new prefix sum
sumEnd -= data[ end ];
end--;
}
}
}
}
int main() {
int length;
cin >> length;
// get the data
int data[length];
for( int i = 0; i < length; i++ )
cin >> data[i];
// solve and print the solution
int start = 0, end = length - 1;
find( data, start, end );
if( start == end )
puts( "No soln" );
else
printf( "%d %d\n", start, end );
return 0;
}
This algorithm is O(n) time and O(1) space. It may modify the source array, but it restores all the information back. So it is not working with const arrays. If this puzzle has several solutions, this algorithm picks the solution nearest to the array beginning. Or it might be modified to provide all solutions.
Algorithm
Variables:
p1 - subarray start
p2 - subarray end
d - difference of 1s and 0s in the subarray
Calculate d, if d==0, stop. If d<0, invert the array and after balanced subarray is found invert it back.
While d > 0 advance p2: if the array element is 1, just decrement both p2 and d. Otherwise p2 should pass subarray of the form 11*0, where * is some balanced subarray. To make backtracking possible, 11*0? is changed to 0?*00 (where ? is the value next to the subarray). Then d is decremented.
Store p1 and p2.
Backtrack p2: if the array element is 1, just increment p2. Otherwise we found element, changed on step 2. Revert the changes and pass subarray of the form 11*0.
Advance p1: if the array element is 1, just increment p1. Otherwise p1 should pass subarray of the form 0*11.
Store p1 and p2, if p2 - p1 improved.
If p2 is at the end of the array, stop. Otherwise continue with step 4.
How does it work
Algorithm iterates through all possible positions of the balanced subarray in the input array. For each subarray position p1 and p2 are kept as far from each other as possible, providing locally longest subarray. Subarray with maximum length is chosen between all these subarrays.
To determine the next best position for p1, it is advanced to the first position where the balance between 1s and 0s is changed by one. (Step 5).
To determine the next best position for p2, it is advanced to the last position where the balance between 1s and 0s is changed by one. To make it possible, step 2 detects all such positions (starting from the array's end) and modifies the array in such a way, that it is possible to iterate through these positions with linear search. (Step 4).
While performing step 2, two possible conditions may be met. Simple one: when value '1' is found; pointer p2 is just advanced to the next value, no special treatment needed. But when value '0' is found, balance is going in wrong direction, it is necessary to pass through several bits until correct balance is found. All these bits are of no interest to the algorithm, stopping p2 there will give either a balanced subarray, which is too short, or a disbalanced subarray. As a result, p2 should pass subarray of the form 11*0 (from right to left, * means any balanced subarray). There is no chance to go the same way in other direction. But it is possible to temporary use some bits from the pattern 11*0 to allow backtracking. If we change first '1' to '0', second '1' to the value next to the rightmost '0', and clear the value next to the rightmost '0': 11*0? -> 0?*00, then we get the possibility to (first) notice the pattern on the way back, since it starts with '0', and (second) find the next good position for p2.
C++ code:
#include <cstddef>
#include <bitset>
static const size_t N = 270;
void findLargestBalanced(std::bitset<N>& a, size_t& p1s, size_t& p2s)
{
// Step 1
size_t p1 = 0;
size_t p2 = N;
int d = 2 * a.count() - N;
bool flip = false;
if (d == 0) {
p1s = 0;
p2s = N;
return;
}
if (d < 0) {
flip = true;
d = -d;
a.flip();
}
// Step 2
bool next = true;
while (d > 0) {
if (p2 < N) {
next = a[p2];
}
--d;
--p2;
if (a[p2] == false) {
if (p2+1 < N) {
a[p2+1] = false;
}
int dd = 2;
while (dd > 0) {
dd += (a[--p2]? -1: 1);
}
a[p2+1] = next;
a[p2] = false;
}
}
// Step 3
p2s = p2;
p1s = p1;
do {
// Step 4
if (a[p2] == false) {
a[p2++] = true;
bool nextToRestore = a[p2];
a[p2++] = true;
int dd = 2;
while (dd > 0 && p2 < N) {
dd += (a[p2++]? 1: -1);
}
if (dd == 0) {
a[--p2] = nextToRestore;
}
}
else {
++p2;
}
// Step 5
if (a[p1++] == false) {
int dd = 2;
while (dd > 0) {
dd += (a[p1++]? -1: 1);
}
}
// Step 6
if (p2 - p1 > p2s - p1s) {
p2s = p2;
p1s = p1;
}
} while (p2 < N);
if (flip) {
a.flip();
}
}
Sum all elements in the array, then diff = (array.length - sum) will be the difference in number of 0s and 1s.
If diff is equal to array.length/2, then the maximum subarray = array.
If diff is less than array.length/2 then there are more 1s than 0s.
If diff is greater than array.length/2 then there are more 0s than 1s.
For cases 2 & 3, initialize two pointers, start & end pointing to beginning and end of array. If we have more 1s, then move the pointers inward (start++ or end--) based on whether array[start] = 1 or array[end] = 1, and update sum accordingly. At each step check if sum = (end - start) / 2. If this condition is true, then start and end represent the bounds of your maximum subarray.
Here we end up doing two passes of the array, once to calculate sum, and once which moving the pointers inward. And we are using constant space as we just need to store sum and two index values.
If anyone wants to knock up some pseudocode, you're more than welcome :)
Here's an actionscript solution that looked like it was scaling O(n). Though it might be more like O(n log n). It definitely uses only O(1) memory.
Warning I haven't checked how complete it is. I could be missing some cases.
protected function findLongest(array:Array, start:int = 0, end:int = -1):int {
if (end < start) {
end = array.length-1;
}
var startDiff:int = 0;
var endDiff:int = 0;
var diff:int = 0;
var length:int = end-start;
for (var i:int = 0; i <= length; i++) {
if (array[i+start] == '1') {
startDiff++;
} else {
startDiff--;
}
if (array[end-i] == '1') {
endDiff++;
} else {
endDiff--;
}
//We can stop when there's no chance of equalizing anymore.
if (Math.abs(startDiff) > length - i) {
diff = endDiff;
start = end - i;
break;
} else if (Math.abs(endDiff) > length - i) {
diff = startDiff;
end = i+start;
break;
}
}
var bit:String = diff > 0 ? '1': '0';
var diffAdjustment:int = diff > 0 ? -1: 1;
//Strip off the bad vars off the ends.
while (diff != 0 && array[start] == bit) {
start++;
diff += diffAdjustment;
}
while(diff != 0 && array[end] == bit) {
end--;
diff += diffAdjustment;
}
//If we have equalized end. Otherwise recurse within the sub-array.
if (diff == 0)
return end-start+1;
else
return findLongest(array, start, end);
}
I would argue that it is impossible, that an algorithm with O(1) exists, in the following way. Assume you iterate ONCE over every bit. This requires a counter which needs the space of O(log n). Possibly one could argue that n itself is part of the problem instance, then you have as input length for a binary string of the length k: k + 2-log k. Regardless how you look over them you need an additional variable, on case you need an index into that array, that already makes it non O(1).
Usually you dont have this problem, because you have for an problem of the size n, an input of n numbers of the size log k, which adds up to nlog k. Here a variable of length log k is just O(1). But here our log k is just 1. So we can only introduce a help variable that has constant length (and I mean really constant, it must be limited regardless how big the n is).
Here one problem is the description of the problem comes visible. In computer theory you have to be very careful about your encoding. E.g. you can make NP problems polynomial if you switch to unary encoding (because then input size is exponential bigger than in a n-ary (n>1) encoding.
As for n the input has just the size 2-log n, one must be careful. When you speak in this case of O(n) - this is really an algorithm that is O(2^n) (This is no point we need to discuss about - because one can argue whether the n itself is part of the description or not).
I have this algorithm running in O(n) time and O(1) space.
It makes use of simple "shrink-then-expand" trick. Comments in codes.
public static void longestSubArrayWithSameZerosAndOnes() {
// You are given an array of 1's and 0's only.
// Find the longest subarray which contains equal number of 1's and 0's
int[] A = new int[] {1, 0, 1, 1, 1, 0, 0,0,1};
int num0 = 0, num1 = 0;
// First, calculate how many 0s and 1s in the array
for(int i = 0; i < A.length; i++) {
if(A[i] == 0) {
num0++;
}
else {
num1++;
}
}
if(num0 == 0 || num1 == 0) {
System.out.println("The length of the sub-array is 0");
return;
}
// Second, check the array to find a continuous "block" that has
// the same number of 0s and 1s, starting from the HEAD and the
// TAIL of the array, and moving the 2 "pointer" (HEAD and TAIL)
// towards the CENTER of the array
int start = 0, end = A.length - 1;
while(num0 != num1 && start < end) {
if(num1 > num0) {
if(A[start] == 1) {
num1--; start++;
}
else if(A[end] == 1) {
num1--; end--;
}
else {
num0--; start++;
num0--; end--;
}
}
else if(num1 < num0) {
if(A[start] == 0) {
num0--; start++;
}
else if(A[end] == 0) {
num0--; end--;
}
else {
num1--; start++;
num1--; end--;
}
}
}
if(num0 == 0 || num1 == 0) {
start = end;
end++;
}
// Third, expand the continuous "block" just found at step #2 by
// moving "HEAD" to head of the array and "TAIL" to the end of
// the array, while still keeping the "block" balanced(containing
// the same number of 0s and 1s
while(0 < start && end < A.length - 1) {
if(A[start - 1] == 0 && A[end + 1] == 0 || A[start - 1] == 1 && A[end + 1] == 1) {
break;
}
start--;
end++;
}
System.out.println("The length of the sub-array is " + (end - start + 1) + ", starting from #" + start + " to #" + end);
}
linear time, constant space. Let me know if there is any bug I missed.
tested in python3.
def longestBalancedSubarray(A):
lo,hi = 0,len(A)-1
ones = sum(A);zeros = len(A) - ones
while lo < hi:
if ones == zeros: break
else:
if ones > zeros:
if A[lo] == 1: lo+=1; ones-=1
elif A[hi] == 1: hi+=1; ones-=1
else: lo+=1; zeros -=1
else:
if A[lo] == 0: lo+=1; zeros-=1
elif A[hi] == 0: hi+=1; zeros-=1
else: lo+=1; ones -=1
return(A[lo:hi+1])

Resources