Original Problem: Problem 1 (INOI 2015)
There are two arrays A[1..N] and B[1..N]
An operation SSum is defined on them as
SSum[i,j] = A[i] + A[j] + (B[i+1] + B[i+2] + ... + B[j-1]) when i < j
SSum[i,j] = A[i] + A[j] + (B[1] + B[2] + ... + B[j-1] + B[i+1] + B[i+2] + ... + B[N]) when i > j
SSum[i,i] = A[i]
The challenge is to find the largest possible value of SSum.
I had an O(n^2) solution based on computing the Prefix Sums of B
#include <iostream>
#include <algorithm>   // std::max

int main(){
    int N;
    std::cin >> N;
    int *a = new int[N+1];
    long long int *bPrefixSums = new long long int[N+1];

    // 1-based arrays to prevent confusion
    for (int iii=1; iii<=N; iii++)
        std::cin >> a[iii];

    bPrefixSums[0] = 0;
    for (int b,iii=1; iii<=N; iii++){
        std::cin >> b;
        bPrefixSums[iii] = bPrefixSums[iii-1] + b;
    }

    long long int SSum, SSumMax = -(1LL<<60);  // start from a safely small sentinel
    for (int i=1; i <= N; i++)
        for (int j=1; j <= N; j++){
            if (i<j)
                SSum = a[i] + a[j] + (bPrefixSums[j-1] - bPrefixSums[i]);
            else if (i==j)
                SSum = a[i];
            else
                SSum = a[i] + a[j] + ((bPrefixSums[N] - bPrefixSums[i]) + bPrefixSums[j-1]);
            SSumMax = std::max(SSum, SSumMax);
        }

    std::cout << SSumMax;

    delete[] a;
    delete[] bPrefixSums;
    return 0;
}
For larger values of N around 10^6, the program fails to complete the task in 3 seconds.
Since I didn't get enough rep to add a comment, I shall just write the ideas here in this answer.
This problem is really nice, and I was actually inspired by this link. Thanks to #superty.
We can consider this problem in three separate cases: i == j, i < j, and i > j, and take the maximum result over the three.
Consider i == j: the result is just a[i], so the best value for this case is the largest a[i], found in O(n) time.
Consider i < j: it's quite similar to the classical maximum subarray sum problem; for each j we only need to find the i to its left that maximizes the result.
Think about the classical problem first: if we are asked to find the maximum partial sum of an array a, we compute the prefix sums of a to get an O(n) solution. This problem is almost the same.
You can see that here (i < j) we have SSum[i,j] = A[i] + A[j] + (B[i+1] + ... + B[j-1]) = (B[1] + B[2] + ... + B[j-1] + A[j]) - (B[1] + B[2] + ... + B[i] - A[i]). The first term depends only on j and the second term depends only on i. So the solution is clear: build these two 'prefix sums' and, for each j, find the smallest prefix_sum_2[i] among all i < j.
Consider i > j: it's quite similar to this discussion on SO (though that discussion doesn't help much here).
Similarly, we get SSum[i,j] = A[i] + A[j] + (B[1] + ... + B[j-1] + B[i+1] + ... + B[N]) = (B[1] + B[2] + ... + B[j-1] + A[j]) + (A[i] + B[i+1] + ... + B[N]). Now you need a prefix-style array and a suffix-style array of this mixed form (the prefix form satisfies prefix_sum[j] = prefix_sum[j-1] - a[j-1] + b[j-1] + a[j], and the suffix form is built similarly from the right). Then compute two more arrays: ans_left[t], the maximum of the first term over all j <= t, and ans_right[t], the maximum of the second term over all i >= t. The answer for this case is the maximum of ans_left[t] + ans_right[t+1] over all t.
Finally, the maximum result required for the original problem is the maximum of the answers for these three sub-cases.
It's easy to see that the total complexity is O(n).
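For reference, here is a minimal sketch of this O(n) approach (my own code and variable names, untested against the judge; prefB holds the prefix sums of B):

#include <iostream>
#include <vector>
#include <algorithm>
#include <climits>

int main() {
    int N;
    std::cin >> N;
    std::vector<long long> a(N + 1), b(N + 1), prefB(N + 1, 0);
    for (int i = 1; i <= N; i++) std::cin >> a[i];
    for (int i = 1; i <= N; i++) { std::cin >> b[i]; prefB[i] = prefB[i - 1] + b[i]; }

    const long long NEG = LLONG_MIN / 4;
    long long best = NEG;

    // Case i == j: the answer is simply the largest a[i].
    for (int i = 1; i <= N; i++) best = std::max(best, a[i]);

    // Case i < j: SSum = (prefB[j-1] + a[j]) - (prefB[i] - a[i]);
    // keep the minimum of (prefB[i] - a[i]) over all i already seen.
    long long minLeft = LLONG_MAX / 4;
    for (int j = 1; j <= N; j++) {
        if (j > 1) best = std::max(best, prefB[j - 1] + a[j] - minLeft);
        minLeft = std::min(minLeft, prefB[j] - a[j]);
    }

    // Case i > j: SSum = (prefB[j-1] + a[j]) + (a[i] + prefB[N] - prefB[i]);
    // best "left" term over indices <= t, best "right" term over indices >= t+1.
    std::vector<long long> bestRight(N + 2, NEG);
    for (int i = N; i >= 1; i--)
        bestRight[i] = std::max(bestRight[i + 1], a[i] + prefB[N] - prefB[i]);
    long long bestLeft = NEG;
    for (int j = 1; j < N; j++) {
        bestLeft = std::max(bestLeft, prefB[j - 1] + a[j]);
        best = std::max(best, bestLeft + bestRight[j + 1]);
    }

    std::cout << best << "\n";
    return 0;
}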
Related
Recently I came across this problem on https://leetcode.com/problems/palindrome-partitioning-ii/:
Given a string s, partition s such that every substring of the partition is a palindrome.
Return the minimum cuts needed for a palindrome partitioning of s.
Example:
Input: "aab"
Output: 1
Explanation: The palindrome partitioning ["aa","b"] could be produced using 1 cut.
Here is a C solution I discovered on the internet. I've been trying to understand what the DP array is tracking, and I've figured out that dp[j] stores the number of cuts needed for the first j characters of the string. So dp[1] stores the number of cuts needed for a one-letter prefix, which is always 0, and dp[2] stores that for the first two letters of the string.
What I don't understand is: why do we initialize dp[0] = -1? This seems somewhat unintuitive, and I cannot figure out a reason for it.
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen */

int _min(int a, int b) {
    return a < b ? a : b;
}

int minCut(char* s) {
    int *dp, n, i, k;

    n = strlen(s);
    dp = malloc((n + 1) * sizeof(int)); /* number of cuts on length */
    //assert(dp);

    dp[0] = -1;
    for (i = 0; i < n; i ++) {
        dp[i + 1] = dp[i] + 1;          /* worst case: cut after every character */
    }

    for (i = 0; i < n; i ++) {
        dp[i + 1] = _min(dp[i + 1], dp[i] + 1);
        for (k = 1;                     /* odd-length palindromes centred at i, e.g. "aba" */
             i - k >= 0 &&
             i + k < n &&
             s[i - k] == s[i + k];
             k ++) {
            dp[i + k + 1] = _min(dp[i + k + 1], dp[i - k] + 1);
        }
        for (k = 1;                     /* even-length palindromes centred between i and i+1, e.g. "aaaa" */
             i - k + 1 >= 0 &&
             i + k < n &&
             s[i - k + 1] == s[i + k];
             k ++) {
            dp[i + k + 1] = _min(dp[i + k + 1], dp[i - k + 1] + 1);
        }
    }

    i = dp[n];
    free(dp);
    return i;
}
I've done some tracing with this function and still can't find an answer. Here is a trace of minCut("aba"), printing i and dp at the beginning of each iteration of the second outer for loop, and also k when it appears in the first nested for loop.
i = 0
dp = [-1, 0, 1, 2]
i = 1
dp = [-1, 0, 1, 2]
k = 1
i = 2
dp = [-1, 0, 1, 0]
When we come to element 'b', we find out, by expanding forwards and back that "aba" is a palindrome. Then, with this: dp[i + k + 1] = _min(dp[i + k + 1], dp[i - k] + 1);, we get that dp[3] = _min(dp[3], dp[1 - 1] + 1) = _min(2, -1 + 1) = 0.
It is confusing why the base case is dp[0] = -1, and how it factors into _min(dp[3], dp[0] + 1). Basically we are going back to where we didn't have the palindrome detected and taking that value + 1. But why is minCut("") = -1?
I've been trying to figure this out for 2.5 hours, but I still cannot figure it out.
This is a guard (sentinel) value. We use such values when we don't want to write extra ifs for boundary cases: we append guard elements to the data. For example, instead of an n*n matrix we might use an (n+2)*(n+2) matrix with convenient values (often zeroes) in the guard positions.
Observe that every further palindrome you discover costs one more cut; that is the + 1 in the dp update. But when the palindrome you discover starts at the very beginning of the string, you don't need a cut for it at all, so dp[0] + 1 must evaluate to 0. This is the same as with rod cutting: to cut a rod into one piece you don't need to cut it at all.
BTW, if s has zero length, the program returns -1, which is wrong.
BTW2, this program takes a lot of time to run if the input string looks like aaa...aaa. Basically, it is O(n^2).
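To make the sentinel concrete: without dp[0] = -1, the odd-palindrome update in the code above would need its own branch for palindromes that reach the start of the string (a sketch of my own, just for illustration):

/* equivalent update written without the -1 sentinel */
void update_odd(int *dp, int i, int k) {
    if (i - k == 0)                                          /* palindrome starts at s[0]:    */
        dp[i + k + 1] = _min(dp[i + k + 1], 0);              /* the whole prefix needs 0 cuts */
    else
        dp[i + k + 1] = _min(dp[i + k + 1], dp[i - k] + 1);
}

With dp[0] = -1 both branches collapse into the single line dp[i + k + 1] = _min(dp[i + k + 1], dp[i - k] + 1), because dp[0] + 1 == 0.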
How to find the time complexity of this function:
Code
#include <stdio.h>

void f(int n)
{
    for(int i=0; i<n; ++i)
        for(int j=0; j<i; ++j)
            for(int k=i*j; k>0; k/=2)
                printf("~");
}
I took an educated guess of (n^2)*log(n) based on intuition and it turned out to be correct.
But I can't seem to find an accurate explanation for it.
For every value of i > 0, the middle loop runs for j = 1, 2, ..., i-1 (the j = 0 iteration starts the innermost loop at k = 0, so it does nothing). The innermost loop therefore starts with k equal to, respectively:
i*1, i*2, ..., i*(i-1)
Since k is halved until it reaches 0, each of these innermost loops takes about lg(k) steps. Hence
lg(i*1) + lg(i*2) + ... + lg(i*(i-1)) = (i-1)*lg(i) + lg(1) + lg(2) + ... + lg(i-1)
Therefore the total is
f(n) ::= sum_{i=1}^{n-1} [ (i-1)*lg(i) + lg(1) + lg(2) + ... + lg(i-1) ]
Let's now bound f(n+1) from above, using lg(1) + ... + lg(i-1) <= (i-1)*lg(i-1) and (i-1)*lg(i) <= i*lg(i):
f(n+1) <= sum_{i=1}^{n} [ i*lg(i) + (i-1)*lg(i-1) ]
       <= 2 * sum_{i=1}^{n} i*lg(i)
       <= C * integral_1^{n+1} x*ln(x) dx   ; integral bound, some constant C
       = O(n^2*lg(n))                       ; integral of x*ln(x) = x^2/2*ln(x) - x^2/4
If we now bound f(n+1) from below:
f(n+1) >= sum_{i=1}^{n} i*lg(i)
       >= C * integral_0^n x*ln(x) dx       ; integral bound
       = C * (n^2*ln(n)/2 - n^2/4)          ; integral of x*ln(x) = x^2/2*ln(x) - x^2/4
       >= (C/4) * n^2*ln(n)                 ; for n >= 3
       = Omega(n^2*lg(n))
So f(n) = Theta(n^2*lg(n)).
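If you want to sanity-check the bound empirically, here is a quick counting sketch (my own, not part of the question); if the count really grows like n^2*lg(n), the printed ratio should settle near a constant:

#include <cstdio>
#include <cmath>

// Count how many times the innermost statement would run, instead of printing.
long long count_ops(int n) {
    long long ops = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < i; ++j)
            for (long long k = (long long)i * j; k > 0; k /= 2)
                ++ops;
    return ops;
}

int main() {
    for (int n = 256; n <= 4096; n *= 2) {
        long long ops = count_ops(n);
        // Ratio should approach a constant if ops ~ n^2 * lg(n).
        printf("n = %5d  ops = %12lld  ops / (n^2 lg n) = %.3f\n",
               n, ops, ops / ((double)n * n * std::log2(n)));
    }
    return 0;
}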
I need help with this dynamic programming problem.
Given a positive integer k, find the maximum number of distinct positive integers that sum to k. For example, 6 = 1 + 2 + 3 so the answer would be 3, as opposed to 5 + 1 or 4 + 2 which would be 2.
The first thing I think of is that I have to find a subproblem. So to find the answer for k, we need the answers for the values less than k, which means iterating through the values 1 -> k and finding the maximum count for each of those values.
What confuses me is how to make a formula. We can define M(j) as the maximum number of distinct values that sum to j, but how do I actually write the formula for it?
Is my logic for what I have so far correct, and can someone explain how to work through this step by step?
No dynamic programming is needed. Let's start with an example:
50 = 50
50 = 1 + 49
50 = 1 + 2 + 47 (three numbers)
50 = 1 + 2 + 3 + 44 (four numbers)
50 = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 14 (nine numbers)
Nine numbers is as far as we can go. If we use ten numbers, the sum would be at least 1 + 2 + 3 + ... + 10 = 55, which is greater than 50 - thus it is impossible.
Indeed, if we use exactly n distinct positive integers, then the smallest achievable sum is 1 + 2 + ... + n = n(n+1)/2. By solving the quadratic, we find that M(k) is approximately sqrt(2k).
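(Spelling out the quadratic, since the same closed form appears again below: we want the largest n with n(n+1)/2 <= k, i.e. n^2 + n - 2k <= 0, whose positive root is n = (-1 + sqrt(1 + 8k)) / 2; hence M(k) = floor((-1 + sqrt(1 + 8k)) / 2), which is roughly sqrt(2k).)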
Thus the algorithm is to take the number k and subtract 1, 2, 3, etc. until we can't any more; the answer is one less than the first i that no longer fits. Algorithm in C:
int M(int k) {
    int i;
    for (i = 1; ; i++) {           /* greedily subtract 1, 2, 3, ... */
        if (k < i) return i - 1;   /* i no longer fits */
        else k -= i;
    }
}
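A tiny test harness (mine, not part of the original answer) reproduces the worked examples above:

#include <stdio.h>

int M(int k);   /* the function above */

int main(void) {
    printf("%d %d\n", M(50), M(6));   /* expected output: 9 3 */
    return 0;
}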
The other answers correctly deduce that the problem essentially reduces to finding the largest n such that the summation 1 + 2 + ... + n = n(n+1)/2 is at most k.
However, this can actually be simplified to the closed form n = floor(sqrt(2k + 1/4) - 1/2).
In code this looks like: floor(sqrt(2.0 * k + 1.0/4) - 1.0/2)
The disadvantage of this answer is that it requires you to deal with floating point numbers.
Brian M. Scott (https://math.stackexchange.com/users/12042/brian-m-scott), Given a positive integer, find the maximum distinct positive integers that can form its sum, URL (version: 2012-03-22): https://math.stackexchange.com/q/123128
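If the floating point worries you, a common workaround (my own sketch, not from the cited answer) is to evaluate the closed form and then correct the result against the exact triangular numbers:

#include <cmath>

// Largest n with n*(n+1)/2 <= k, computed via the closed form and then
// corrected with exact integer arithmetic in case sqrt rounded the wrong way.
int max_distinct(long long k) {
    long long n = (long long)std::floor(std::sqrt(2.0 * k + 0.25) - 0.5);
    while ((n + 1) * (n + 2) / 2 <= k) ++n;    // sqrt rounded low
    while (n > 0 && n * (n + 1) / 2 > k) --n;  // sqrt rounded high
    return (int)n;
}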
The smallest number that can be represented as the sum of i distinct positive integers is 1 + 2 + 3 + ... + i = i(i+1)/2, otherwise known as the i'th triangular number, T[i].
Let i be such that T[i] is the largest triangular number less than or equal to your k.
Then we can represent k as the sum of i different positive integers:
1 + 2 + 3 + ... + (i-1) + (i + k - T[i])
Note that the last term is greater than or equal to i (and therefore different from the other integers), since k >= T[i].
Also, it's not possible to represent k as the sum of i+1 different positive integers, since the smallest number that's the sum of i+1 different positive integers is T[i+1] > k because of how we chose i.
So your question is equivalent to finding the largest i such that T[i] <= k.
That's solved by this:
i = floor((-1 + sqrt(1 + 8k)) / 2)
[derivation here: https://math.stackexchange.com/questions/1417579/largest-triangular-number-less-than-a-given-natural-number ]
You could also write a simple program to iterate through triangular numbers until you find the first larger than k:
def uniq_sum_count(k):
    i = 1
    while i * (i+1) <= k * 2:
        i += 1
    return i - 1

for k in xrange(20):
    print k, uniq_sum_count(k)
I think you just check if 1 + ... + n > k. If so, print n-1.
Because if you find the smallest n such that 1 + ... + n > k, then 1 + ... + (n-1) <= k. So add the excess, say E = k - (1 + ... + (n-1)), to the largest term: 1 + ... + (n-2) + (n-1+E) = k.
Hence n-1 is the maximum.
Note that : 1 + ... + n = n(n+1) / 2
#include <stdio.h>

int main()
{
    int k, n;

    printf(">> ");
    scanf("%d", &k);

    for (n = 1; ; n++)
        if (n * (n + 1) / 2 > k)
            break;

    printf("the maximum: %d\n", n-1);
}
Or you can make M(j).
int M(int j)
{
    int n;

    for (n = 1; ; n++)
        if (n * (n + 1) / 2 > j)
            return n-1; // return the maximum.
}
Well, the problem can be solved without dynamic programming; however, I tried to look at it in a dynamic programming way.
Tip: when you want to solve a dynamic programming problem, look for the "repetitive" situations. Here, from the viewpoint of the number k it does not matter whether I subtract 1 first and then 3, or 3 first and then 1, so let's subtract in ascending order.
Now, what is repeated? The idea is that I start with the number k and subtract distinct numbers from it until I reach zero. If two different subtraction orders lead to the same pair (remaining number, last distinct number used), the situation is "repeated", so that pair is the state we memoize:
#include <iostream>
using namespace std;

const int MAXK = 101;          // assuming k <= 100 for this sketch
bool marked[MAXK][MAXK];
int memo[MAXK][MAXK];

// Maximum count of distinct numbers, all larger than last_distinct,
// that sum exactly to rem (a very negative value if it is impossible).
int rec(int rem, int last_distinct){
    if(rem == 0) return 0;                          // success
    if(last_distinct > rem - 1) return -100000000;  // failure (minus infinity)
    if(marked[rem][last_distinct]) return memo[rem][last_distinct]; // don't compute it again

    int ans = -100000000;
    for(int i = last_distinct + 1; i <= rem; i++){
        int res = 1 + rec(rem - i, i);  // I've just used one more distinct number
        if(res > ans) ans = res;
    }
    marked[rem][last_distinct] = true;
    memo[rem][last_distinct] = ans;
    return ans;
}

int main(){
    int k;
    cin >> k;
    cout << rec(k, 0) << endl;
    return 0;
}
The time complexity is O(k^3): there are O(k^2) states (rem, last_distinct), and each is computed with a loop of at most k transitions.
It isn't entirely clear what constraints there may be on how you arrive at your largest discrete series of numbers, but if you are able, passing a simple array to hold the discrete numbers and keeping a running sum in your function can simplify the process. For example, passing the array along with your current j to the function and returning the number of elements that make up the sum within the array can be done with something like this:
int largest_discrete_sum (int *a, int j)
{
    int n, sum = 0;

    for (n = 1;; n++) {
        a[n-1] = n, sum += n;
        if (n * (n + 1) / 2 > j)
            break;
    }
    a[sum - j - 1] = 0;  /* zero the index holding the excess */

    return n;
}
Putting it together in a short test program would look like:
#include <stdio.h>

int largest_discrete_sum(int *a, int j);

int main (void) {

    int i, idx = 0, v = 50;
    int a[v];

    idx = largest_discrete_sum (a, v);

    printf ("\n largest_discrete_sum '%d'\n\n", v);
    for (i = 0; i < idx; i++)
        if (a[i])
            printf (!i ? " %2d" : " +%2d", a[i]);
    printf (" = %d\n\n", v);

    return 0;
}

int largest_discrete_sum (int *a, int j)
{
    int n, sum = 0;

    for (n = 1;; n++) {
        a[n-1] = n, sum += n;
        if (n * (n + 1) / 2 > j)
            break;
    }
    a[sum - j - 1] = 0;  /* zero the index holding the excess */

    return n;
}
Example Use/Output
$ ./bin/largest_discrete_sum
largest_discrete_sum '50'
1 + 2 + 3 + 4 + 6 + 7 + 8 + 9 +10 = 50
I apologize if I missed a constraint on the selection of discrete values somewhere, but approaching it in this manner you are guaranteed to obtain the largest number of discrete values that will equal your sum. Let me know if you have any questions.
The above (finding the kth smallest element of two sorted arrays) is a well-known interview question. There is an article about an O(log n) algorithm explaining the invariant (i + j = k - 1). I'm having much difficulty understanding this algorithm. Could anyone explain this algorithm in a simple way, and also why do they calculate i as (int)((double)m / (m+n) * (k-1))? I appreciate your help. Thanks.
protected static int kthSmallestEasy(int[] A, int aLow, int aLength, int[] B, int bLow, int bLength, int k)
{
    // Error handling
    assert(aLow >= 0); assert(bLow >= 0);
    assert(aLength >= 0); assert(bLength >= 0); assert(aLength + bLength >= k);

    int i = (int)((double)((k - 1) * aLength / (aLength + bLength)));
    int j = k - 1 - i;   // invariant: i + j == k - 1

    int Ai_1 = aLow + i == 0        ? Int32.MinValue : A[aLow + i - 1];
    int Ai   = aLow + i == A.Length ? Int32.MaxValue : A[aLow + i];
    int Bj_1 = bLow + j == 0        ? Int32.MinValue : B[bLow + j - 1];
    int Bj   = bLow + j == B.Length ? Int32.MaxValue : B[bLow + j];

    if (Bj_1 < Ai && Ai < Bj)
        return Ai;
    else if (Ai_1 < Bj && Bj < Ai)
        return Bj;

    assert(Ai < Bj_1 || Bj < Ai_1);
    if (Ai < Bj_1) // exclude A[aLow .. i] and B[j .. bHigh], k was replaced by k - i - 1
        return kthSmallestEasy(A, aLow + i + 1, aLength - i - 1, B, bLow, j, k - i - 1);
    else           // exclude A[i .. aHigh] and B[bLow .. j], k was replaced by k - j - 1
        return kthSmallestEasy(A, aLow, i, B, bLow + j + 1, bLength - j - 1, k - j - 1);
}
Could anyone explain this algorithm in simple way.
Yes, it is essentially a bisection algorithm.
In successive passes, it moves the probe index in one array upward and the probe index in the other array downward, keeping the sum of the two indices equal to k - 1, until one of the probed values falls between the two neighbouring values in the other array; that value is the kth smallest.
and also why do they calculate i as (int)((double)m / (m+n) * (k-1)).
This gives an estimate of the new half-way point assuming an equidistribution of values between the known points.
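A concrete example (my own, with made-up arrays): take A = [1, 3, 5, 7], B = [2, 4, 6, 8] and k = 5. Then i = (k - 1) * 4 / 8 = 2 and j = k - 1 - i = 2, so Ai_1 = 3, Ai = 5, Bj_1 = 4, Bj = 6. Since Bj_1 < Ai < Bj (4 < 5 < 6), the function returns Ai = 5, which is indeed the 5th smallest element of the merged sequence 1, 2, 3, 4, 5, 6, 7, 8. When neither test succeeds, the assert guarantees that a whole block of one array can be discarded, and the recursion repeats on the smaller ranges.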
The problem:
Larry is very bad at math - he usually uses a calculator, which worked well throughout college. Unfortunately, he is now stuck on a deserted island with his good buddy Ryan after a snowboarding accident. They're now trying to spend some time figuring out some good problems, and Ryan will eat Larry if he cannot answer, so his fate is up to you!
It's a very simple problem - given a number N, how many ways can K numbers less than N add up to N?
For example, for N = 20 and K = 2, there are 21 ways:
0+20
1+19
2+18
3+17
4+16
5+15
...
18+2
19+1
20+0
Input
Each line will contain a pair of numbers N and K. N and K will both be an integer from 1 to 100, inclusive. The input will terminate on 2 0's.
Output
Since Larry is only interested in the last few digits of the answer, for each pair of numbers N and K, print a single number mod 1,000,000 on a single line.
Sample Input
20 2
20 2
0 0
Sample Output
21
21
The solution code:
#include<iostream>
#include<stdlib.h>
#include<stdio.h>
using namespace std;

#define maxn 100
typedef long ss;

ss T[maxn+2][maxn+2];

void Gen() {
    ss i, j;

    for(i = 0; i <= maxn; i++)
        T[1][i] = 1;

    for(i = 2; i <= 100; i++) {
        T[i][0] = 1;
        for(j = 1; j <= 100; j++)
            T[i][j] = (T[i][j-1] + T[i-1][j]) % 1000000;
    }
}

int main() {
    //freopen("in.txt", "r", stdin);
    ss n, m;

    Gen();
    while(cin >> n >> m) {
        if(!n && !m) break;
        cout << T[m][n] << endl;
    }
    return 0;
}
How has this calculation been derived?
How does one arrive at T[i][j] = (T[i][j-1] + T[i-1][j])?
Note: I only use n and k (lower case) to refer to some anonymous variable. I will always use N and K (upper case) to refer to N and K as defined in the question (sum and the number of portions).
Let C(n, k) be the result of n choose k; then the solution to the problem is C(N + K - 1, K - 1), under the assumption that the K numbers are non-negative (otherwise there would be infinitely many solutions, even for N = 0 and K = 2).
Since the K numbers are non-negative and the sum N is fixed, we can think of the problem as: how many ways are there to divide N candies among K people? We can divide the candies by laying them out in a line and putting (K - 1) separators between them; the (K - 1) separators split the candies into K portions. Looked at from another perspective, this is the same as choosing (K - 1) positions out of (N + K - 1) positions for the separators, with the remaining positions being candies. This explains why the number of ways is C(N + K - 1, K - 1).
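As a sanity check against the sample above: for N = 20 and K = 2 this gives C(20 + 2 - 1, 2 - 1) = C(21, 1) = 21, matching the expected output.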
Then the problem reduces to finding the least significant digits of C(n, k). (Since the maximum of N and K is 100, as defined in maxn, we don't have to worry even if the algorithm is as slow as O(n^3).)
The calculation uses the combinatorial identity C(n, k) = C(n - 1, k) + C(n - 1, k - 1) (Pascal's rule). The clever thing about the implementation is that it doesn't store a full table of binomial coefficients C(n, k) (which would be a jagged, triangular table); it stores the answers indexed directly by K and N instead. The identity is what is hiding in T[i][j] = (T[i][j-1] + T[i-1][j]):
The first dimension is actually K, the number of portions, and the second dimension is the sum N. T[K][N] directly stores the result which, according to the mathematical derivation above, is (the least significant digits of) C(N + K - 1, K - 1).
Re-writing the T[i][j] = (T[i][j-1] + T[i-1][j]) back to equivalent mathematical result:
C(i + j - 1, i - 1) = C(i + j - 2, i - 1) + C(i + j - 2, i - 2), which is correct according to the identity.
The program will fill the array row by row:
The row K = 0 is already all zeroes, relying on the fact that a static array is zero-initialized.
It fills the row K = 1 with 1 (there is only 1 way to divide N into 1 portion).
For the rest of the rows, it sets the case N = 0 to 1 (there is only 1 way to divide 0 into K parts - all parts are 0).
Then the rest are filled with the expression T[i][j] = (T[i][j-1] + T[i-1][j]), which refers to the previous row and to the previous element of the same row, both of which have been filled in earlier iterations.
Let C(x, y) be the result of x choose y; then the value of T[i][j] equals C(i - 1 + j, j).
You can prove this by induction.
Base cases:
T[1][j] = C(1 - 1 + j, j) = C(j, j) = 1
T[i][0] = C(i - 1, 0) = 1
For the induction step, use the formula (for 0<=y<=x):
C(x,y) = C(x - 1, y - 1) + C(x - 1, y)
Therefore:
C(i - 1 + j, j) = C(i-1+j - 1, j - 1) + C(i-1+j - 1, j) = C(i-1+(j-1), (j-1)) + C((i-1)-1+j, j)
Or in other words:
T[i][j] = T[i][j-1] + T[i-1][j]
Now, as nhahtdh mentioned before, the value you are looking for is C(N + K - 1, K - 1)
which equals:
T[N+1][K-1] = C(N+1-1+K-1, K-1)
(modulo 1000000)
This is a famous problem - you can check the solution here:
How many ways are there to drop N identical balls into K boxes?
The following algorithm is a dynamic-programming solution to your problem:
Define D[i,j] to be the number of ways to write i as an ordered sum of j non-negative integers, for
0 <= i <= N
1 <= j <= K
where D[i,1] = 1 for every i.
And for j > 1 you get:
D[i,j] = D[i,j-1] + D[i-1,j-1] + ... + D[0,j-1]
The answer you want is D[N,K].
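Here is a short sketch of that recurrence (my own code, just to make the indices concrete); for the sample input N = 20, K = 2 it prints 21:

#include <iostream>

int main() {
    const int MOD = 1000000;
    int N = 20, K = 2;                  // the sample input from the question
    long long D[101][101] = {};         // D[i][j]: ways to write i as an ordered sum of j terms >= 0

    for (int i = 0; i <= N; i++) D[i][1] = 1;
    for (int j = 2; j <= K; j++)
        for (int i = 0; i <= N; i++) {
            long long s = 0;
            for (int v = 0; v <= i; v++)   // D[i][j] = D[i][j-1] + D[i-1][j-1] + ... + D[0][j-1]
                s += D[i - v][j - 1];
            D[i][j] = s % MOD;
        }

    std::cout << D[N][K] << std::endl;  // prints 21 for N = 20, K = 2
    return 0;
}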
The problem is closely related to the integer partition problem. Basically there exists a recursive computation of the k-partition of n, and your solution is just the dynamic programming version of it (non-recursive and computed bottom-up).