Algorithm: Given an array, find the maximum sum after rearrangement - arrays

You are given an array A of size N, containing numbers from 0 to N. For each sub-array starting at index 0 (that is, each prefix), call it Si, we say Bi is the smallest non-negative number that is not present in Si.
We need to find the maximum possible sum of all Bi of this array.
We can rearrange the array to obtain the maximum sum.
For example:
A = 1, 2, 0 , N = 3
then let's say we rearrange it as A = 0, 1, 2
S1 = 0, B1= 1
S2 = 0,1 B2= 2
S3 = 0,1,2 B3= 3
Hence the sum is 6
In every example I have tried, the sorted array gives the maximum sum. Am I correct, or am I missing something here?
Please help me find the correct logic for this problem. I am not looking for an optimal solution, just the correct logic.

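Yes, sorting the array maximizes the sum of 𝐵𝑖, with one caveat: when the array contains duplicates, the repeated values should be moved behind the first occurrences of 0, 1, 2, ... (for example, 0, 0, 1 gives 1 + 1 + 2 = 4 as sorted, but 0, 1, 0 gives 1 + 2 + 2 = 5).
As the input size is 𝑛, it cannot include every number in the range {0, ..., 𝑛}, as that is a set of 𝑛 + 1 numbers. Let 𝑘 be the smallest value it lacks. Once 0, ..., 𝑘 − 1 are placed first, 𝐵𝑖 is 𝑖 for 𝑖 <= 𝑘 and 𝑘 for all 𝑖 >= 𝑘. If there are other numbers that are missing, but greater than 𝑘, there is no impact on any 𝐵𝑖.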
Thus we need to find out the minimum missing value 𝑘 in the range {0, ..., 𝑛}. And then the maximised sum is 1 + 2 + ... + 𝑘 + (𝑛−𝑘)𝑘. This is 𝑘(𝑘+1)/2 + (𝑛−𝑘)𝑘 = 𝑘(1 + 2𝑛 − 𝑘)/2
To find the value of 𝑘, create a boolean array of size 𝑛 + 1, and set the entry at index 𝑣 to true when 𝑣 is encountered in the input. 𝑘 is then the first index at which that boolean array still has a false value.
Here is a little implementation in a JavaScript snippet:
function maxSum(arr) {
    const n = arr.length;
    const isUsed = Array(n + 1).fill(false);
    for (const value of arr) {
        isUsed[value] = true;
    }
    const k = isUsed.indexOf(false);
    return k * (1 + 2*n - k) / 2;
}
console.log(maxSum([0, 1, 2])); // 6
console.log(maxSum([0, 2, 2])); // 3
console.log(maxSum([1, 0, 1])); // 5
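The closed formula can be cross-checked against a brute force over all rearrangements. The following Python sketch is only an illustration for small inputs; the function names are made up for this example and are not part of the answer above.

```python
from itertools import permutations
from typing import List


def mex_prefix_sum(order: List[int]) -> int:
    """Sum of B_i (smallest missing non-negative number) over all prefixes."""
    seen = set()
    mex = 0
    total = 0
    for value in order:
        seen.add(value)
        while mex in seen:  # the mex never decreases as the prefix grows
            mex += 1
        total += mex
    return total


def max_sum_formula(arr: List[int]) -> int:
    """k * (1 + 2n - k) / 2, where k is the smallest missing value."""
    n = len(arr)
    present = set(arr)
    k = 0
    while k in present:
        k += 1
    return k * (1 + 2 * n - k) // 2


def max_sum_brute(arr: List[int]) -> int:
    """Try every rearrangement; only feasible for very small arrays."""
    return max(mex_prefix_sum(list(p)) for p in permutations(arr))


for example in ([0, 1, 2], [0, 2, 2], [1, 0, 1], [0, 0, 1]):
    print(example, max_sum_formula(example), max_sum_brute(example))
```

For [0, 0, 1] the formula and the brute force agree on 5, while the plainly sorted order only reaches 4, which is why duplicates belong after the distinct values.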

Related

Find Minimum Score Possible

Problem statement:
We are given three arrays A1, A2, A3 of lengths n1, n2, n3. Each array contains some (or no) natural numbers (i.e. > 0). These numbers denote program execution times.
At each step, we choose the first element of any non-empty array, execute that program, and remove it from that array.
For example:
if A1=[3,2] (n1=2),
A2=[7] (n2=1),
A3=[1] (n3=1)
then we can execute programs in various orders like [1,7,3,2] or [7,1,3,2] or [3,7,1,2] or [3,1,7,2] or [3,2,1,7] etc.
Now if we take S=[1,3,2,7] as the order of execution the waiting time of various programs would be
for S[0] waiting time = 0, since executed immediately,
for S[1] waiting time = 0+1 = 1, taking previous time into account, similarly,
for S[2] waiting time = 0+1+3 = 4
for S[3] waiting time = 0+1+3+2 = 6
Now the score of the order is defined as the sum of all waiting times = 0 + 1 + 4 + 6 = 11. This is the minimum score we can get from any order of execution.
Our task is to find this minimum score.
How can we solve this problem? I tried an approach that picks the minimum of the three first elements each time, but it is not correct, because it gets stuck when two or three equal elements are encountered.
One more example:
if A1=[23,10,18,43], A2=[7], A3=[13,42] minimum score would be 307.
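Both examples can be checked by exhaustively trying every valid interleaving. The Python sketch below is a brute force meant only for tiny inputs; `min_score_brute` is a name chosen for this illustration.

```python
from typing import List, Optional, Tuple


def min_score_brute(arrays: List[List[int]]) -> int:
    """Minimum total waiting time over all orders that respect array order."""

    def explore(pos: Tuple[int, ...], elapsed: int) -> int:
        # pos[i] is the index of the next unexecuted program in arrays[i];
        # elapsed is the total run time of everything executed so far.
        best: Optional[int] = None
        for i, arr in enumerate(arrays):
            if pos[i] == len(arr):
                continue  # this array is exhausted
            nxt = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
            # The chosen program waits for `elapsed` before it starts.
            cand = elapsed + explore(nxt, elapsed + arr[pos[i]])
            if best is None or cand < best:
                best = cand
        return 0 if best is None else best

    return explore((0,) * len(arrays), 0)


print(min_score_brute([[3, 2], [7], [1]]))                 # 11
print(min_score_brute([[23, 10, 18, 43], [7], [13, 42]]))  # 307
```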
The simplest way to solve this is with dynamic programming (which runs in cubic time).
For each array A: Suppose you take the first element from array A, i.e. A[0], as the next process. Your total cost is the wait-time contribution of A[0] (i.e., A[0] * (total_remaining_elements - 1)), plus the minimal wait time sum from A[1:] and the rest of the arrays.
Take the minimum cost over each possible first array A, and you'll get the minimum score.
Here's a Python implementation of that idea. It works with any number of arrays, not just three.
import functools
from typing import List, Tuple

def dp_solve(arrays: List[List[int]]) -> int:
    """Given list of arrays representing dependent processing times,
    return the smallest sum of wait_time_before_start for all job orders"""
    arrays = [x for x in arrays if len(x) > 0]  # Remove empty

    @functools.lru_cache(100000)
    def dp(remaining_elements: Tuple[int, ...],
           total_remaining: int) -> int:
        """Returns minimum wait time sum when suffixes of each array
        have lengths in 'remaining_elements' """
        if total_remaining == 0:
            return 0
        rem_elements_copy = list(remaining_elements)
        best = 10 ** 20
        for i, x in enumerate(remaining_elements):
            if x == 0:
                continue
            # Every job still waiting is delayed by arrays[i][-x]
            cost_here = arrays[i][-x] * (total_remaining - 1)
            if cost_here >= best:
                continue
            rem_elements_copy[i] -= 1
            best = min(best,
                       dp(tuple(rem_elements_copy), total_remaining - 1)
                       + cost_here)
            rem_elements_copy[i] += 1
        return best

    return dp(tuple(map(len, arrays)), sum(map(len, arrays)))
Better solutions
The naive greedy strategy of 'smallest first element' doesn't work, because it can be worth it to do a longer job to get a much shorter job in the same list done, as the example of
A1 = [100, 1, 2, 3], A2 = [38], A3 = [34],
best solution = [100, 1, 2, 3, 34, 38]
by user3386109 in the comments demonstrates.
A more refined greedy strategy does work. Instead of the smallest first element, consider each possible prefix of the array. We want to pick the array with the smallest prefix, where prefixes are compared by average process time, and perform all the processes in that prefix in order.
A1 = [ 100, 1, 2, 3]
Prefix averages = [(100)/1, (100+1)/2, (100+1+2)/3, (100+1+2+3)/4]
= [ 100.0, 50.5, 34.333, 26.5]
A2=[38]
A3=[34]
Smallest prefix average in any array is 26.5, so pick
the prefix [100, 1, 2, 3] to complete first.
Then [34] is the next prefix, and [38] is the final prefix.
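The prefix averages in this walkthrough can be reproduced with a few lines of Python. This is only a sketch of the comparison step; `prefix_averages` and `best_prefix` are names invented for the illustration, and ties are broken toward the shorter prefix here as an arbitrary choice.

```python
from itertools import accumulate
from typing import List, Tuple


def prefix_averages(arr: List[int]) -> List[float]:
    """Average processing time of each non-empty prefix of arr."""
    return [s / n for n, s in enumerate(accumulate(arr), start=1)]


def best_prefix(arr: List[int]) -> Tuple[float, int]:
    """Return (smallest prefix average, length of that prefix)."""
    avgs = prefix_averages(arr)
    idx = min(range(len(avgs)), key=avgs.__getitem__)
    return avgs[idx], idx + 1


print(prefix_averages([100, 1, 2, 3]))
print(best_prefix([100, 1, 2, 3]))  # (26.5, 4)
print(best_prefix([38]), best_prefix([34]))
```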
And here's a rough Python implementation of the greedy algorithm. This code computes subarray averages in a completely naive/brute-force way, so the algorithm is still quadratic (but an improvement over the dynamic programming method). Also, it computes 'maximum suffixes' instead of 'minimum prefixes' for ease of coding, but the two strategies are equivalent.
import math
from typing import List, Tuple

def greedy_solve(arrays: List[List[int]]) -> int:
    """Given list of arrays representing dependent processing times,
    return the smallest sum of wait_time_before_start for all job orders"""

    def max_suffix_avg(arr: List[int]) -> Tuple[float, int]:
        """Given arr, return value and length of max-average suffix"""
        if len(arr) == 0:
            return (-math.inf, 0)
        best_len = 1
        best = -math.inf
        curr_sum = 0.0
        for i, x in enumerate(reversed(arr), 1):
            curr_sum += x
            new_avg = curr_sum / i
            if new_avg >= best:
                best = new_avg
                best_len = i
        return (best, best_len)

    # Remove empty arrays; copy the rest so the caller's lists are not mutated
    arrays = [list(x) for x in arrays if len(x) > 0]
    if not arrays:
        return 0
    total_time_sum = sum(sum(x) for x in arrays)
    my_averages = [max_suffix_avg(arr) for arr in arrays]
    total_cost = 0
    while True:
        largest_avg_idx = max(range(len(arrays)),
                              key=lambda y: my_averages[y][0])
        _, n_to_remove = my_averages[largest_avg_idx]
        if n_to_remove == 0:
            break
        # Jobs are scheduled back to front: each popped job waits for
        # everything that is still unscheduled
        for _ in range(n_to_remove):
            total_time_sum -= arrays[largest_avg_idx].pop()
            total_cost += total_time_sum
        # Recompute the changed array's avg
        my_averages[largest_avg_idx] = max_suffix_avg(arrays[largest_avg_idx])
    return total_cost

Generate a matrix of combinations (permutation) without repetition (array exceeds maximum array size preference)

I am trying to generate a matrix that has all unique combinations of [0 0 1 1]. I wrote this code for this:
v1 = [0 0 1 1];
M1 = unique(perms([0 0 1 1]),'rows');
• This isn't ideal, because perms() is seeing each vector element as unique and doing:
4! = 4 * 3 * 2 * 1 = 24 combinations.
• With unique() I tried to delete all the repetitive entries so I end up with the combination matrix M1 →
only 4! / (2! * (4-2)!) = 6 combinations!
Now, when I try to do something very simple like:
n = 15;
i = 1;
v1 = [zeros(1,n-i) ones(1,i)];
M = unique(perms(v1),'rows');
• Instead of getting 15! / (1! * (15-1)!) = 15 combinations, the perms() function is trying to do
15! = 1.3077e+12 combinations and it's interrupted.
• How would you go about doing this in a much better way? Thanks in advance!
You can use nchoosek to return the indices which should be 1. I think in your heart you knew this must be possible, because you were using the definition of nchoosek to determine the expected final number of permutations! So we can use:
idx = nchoosek( 1:N, k );
Where N is the number of elements in your array v1, and k is the number of elements which have the value 1. Then it's simply a case of creating the zeros array and populating the ones.
v1 = [0, 0, 1, 1];
N = numel(v1); % number of elements in array
k = nnz(v1); % number of non-zero elements in array
colidx = nchoosek( 1:N, k ); % column index for ones
rowidx = repmat( 1:size(colidx,1), k, 1 ).'; % row index for ones
M = zeros( size(colidx,1), N ); % create output
M( rowidx(:) + size(M,1) * (colidx(:)-1) ) = 1;
This works for both of your examples without the need for a huge intermediate matrix.
Aside: since you'd have the indices using this approach, you could instead create a sparse matrix, but whether that's a good idea or not would depend on what you're doing after this point.
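For readers outside MATLAB, the same idea (choose the positions of the ones instead of permuting the whole vector) can be sketched in Python with itertools.combinations. This is an analogue for illustration, not a translation of the MATLAB code above; `binary_rows` is a hypothetical helper name.

```python
from itertools import combinations
from typing import List


def binary_rows(n: int, k: int) -> List[List[int]]:
    """All length-n 0/1 rows containing exactly k ones."""
    rows = []
    for ones in combinations(range(n), k):  # positions that hold a 1
        row = [0] * n
        for j in ones:
            row[j] = 1
        rows.append(row)
    return rows


print(len(binary_rows(4, 2)))   # 6 rows, matching 4! / (2! * 2!)
print(len(binary_rows(15, 1)))  # 15 rows, no 15! intermediate needed
```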

Selecting elements closest to zero

Let's say I have a vector A = [-3 -2 -1 1 2].
I want to pick out the positive element closest to zero and the negative element closest to zero while retaining their index value in the vector A.
So I want another matrix with [-1, 3; 1, 4] (i.e. [-ve closest to 0, index; +ve closest to 0, index]).
I've tried creating separate arrays with the positive and negative components and selecting the min/max of these, but this loses the index in A. E.g.
Ap = A(A>0)
An = A(A<0)
[ap,idxp]=sort(Ap)
[an,idxn]=sort(An)
I'm sure there must be a simple way to do this. Any help greatly appreciated.
Invert the vector
B = 1 ./ A;
and find min
[~, idxn] = min(B);
an = A(idxn);
and max
[~, idxp] = max(B);
ap = A(idxp);
You can keep the index with
Ap = A;
Ap(Ap < 0) = NaN;
[ap,idxp] = min(Ap);
An = A;
An(An > 0) = NaN;
[an,idxn] = max(An);
You can sort the absolute values, then pick out the indices and values of the first value where A is greater than or less than zero:
A = [-3 -2 -1 1 2];
[sA,idx] = sort(abs(A)); % Sort absolute values to get closest to 0 order
idxP = idx( find( A(idx) > 0, 1 ) ); % First index where sorted A is positive
idxN = idx( find( A(idx) < 0, 1 ) ); % First index where sorted A is negative
out = [A(idxP), idxP; A(idxN), idxN]; % Formatted output as requested: [1, 4; -1, 3] for example
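As a cross-check of the expected result, here is a small Python sketch of the same selection. It reports 1-based positions to match MATLAB indexing; `closest_to_zero` is a name made up for this example, and it returns None for a side that has no elements.

```python
from typing import List, Optional, Tuple

Pair = Optional[Tuple[float, int]]


def closest_to_zero(values: List[float]) -> Tuple[Pair, Pair]:
    """Return ((negative value, index), (positive value, index)), 1-based."""
    indexed = list(enumerate(values, start=1))
    negatives = [(v, i) for i, v in indexed if v < 0]
    positives = [(v, i) for i, v in indexed if v > 0]
    closest_neg = max(negatives, default=None)  # largest negative value
    closest_pos = min(positives, default=None)  # smallest positive value
    return closest_neg, closest_pos


print(closest_to_zero([-3, -2, -1, 1, 2]))  # ((-1, 3), (1, 4))
```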

Matlab: find multiple elements in an array

I want to find the multiples of a value in an array in Matlab code.
I found the functions mod and find, but these return the indices of elements and
not the elements. Moreover, I wrote the following code:
x=[1 2 3 4];
if (mod(x,2)==0)
a=x;
end
but this does not work. How can I solve this problem?
Looks like you want to find all multiples of 2 (or any number). You can achieve this using:
a = x( mod(x,2) == 0 ) ;
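When you do a = x, x is still x=[1 2 3 4] regardless of whether (mod(x,2)==0) is true or false. Also note that an if statement with an array condition only runs its body when every element is nonzero, and here mod(x,2)==0 is [0 1 0 1], so a is never assigned.
You can store the logical mask in a variable, e.g. val = (mod(x,2)==0), and then use it to index x, as in a = x(val).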
Given a vector numberList = [ 1, 2, 3, 4, 5, 6]; and a number number = 2; you can find indices (position in a vector) of the numbers in the numberList that are a multiple of number using indices = find(mod(numberList, number) ==0);.
If necessary, you may display a list of these multiples by calling multiples = numberList(indices).
multiples =
2 4 6

Minimize the number of operation to make all elements of array equal

Given an array of n elements you are allowed to perform only 2 kinds of operation to make all elements of array equal.
multiply any element by 2
divide any element by 2 (integer division)
Your task is to minimize the total number of the above operations performed to make all elements of the array equal.
Example
array = [3,6,7] minimum operation is 2 as 6 and 7 can be divided by 2 to obtain 3.
I cannot think of even the brute force solution.
Constraints
1 <= n <= 100000 and
1 <= ai <=100000
where ai is the ith element of array.
View all numbers as strings of 0 and 1, via their binary expansion.
E.g.: 3, 6, 7 are represented as 11, 110, 111, respectively.
Dividing by 2 is equivalent to removing the right most 0 or 1, and multiplying by 2 is equivalent to adding a 0 from the right.
For a string consisting of 0 and 1, let us define its "head" to be a substring that is the left several terms of the string, which ends with 1.
E.g.: 1100101 has heads 1, 11, 11001, 1100101.
The task becomes finding longest common head of all the given strings, and then determining how many 0's to add after this common head.
An example:
Say you have the following strings:
10101001, 101011, 10111, 1010001
find the longest common head of 10101001 and 101011, which is 10101;
find the longest common head of 10101 and 10111, which is 101;
find the longest common head of 101 and 1010001, which is 101.
Then you are sure that all the numbers should become a number of the form 101 00....
To determine how many 0's to add after 101, find the number of consecutive 0's directly following 101 in every string:
For 10101001: 1
For 101011: 1
For 10111: 0
For 1010001: 3
It remains to find an integer k that minimizes |k - 1| + |k - 1| + |k - 0| + |k - 3|. Here we find k = 1. So every number should become 1010 in the end.
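The steps above can be sketched in Python as follows. This is a minimal illustration that assumes all inputs are positive integers; the function names are invented here. Besides the |k - zeros| terms, it also counts the bits that have to be removed from each number after its run of zeros, which is a constant that does not affect the choice of k.

```python
from typing import List, Tuple


def common_head(a: int, b: int) -> int:
    """Longest common binary prefix of a and b that ends in a 1 bit."""
    # Align both numbers to the same bit length by dropping low bits.
    la, lb = a.bit_length(), b.bit_length()
    if la > lb:
        a >>= la - lb
    else:
        b >>= lb - la
    # Drop low bits from both until they agree.
    while a != b:
        a >>= 1
        b >>= 1
    # A head must end with 1, so strip trailing zeros.
    while a > 0 and a % 2 == 0:
        a >>= 1
    return a


def min_operations(nums: List[int]) -> Tuple[int, int]:
    """Return (operation count, common target) for positive integers."""
    head = nums[0]
    for x in nums[1:]:
        head = common_head(head, x)

    base_cost = 0   # bits that must be removed no matter which k is chosen
    zero_runs = []  # number of zeros directly following the head
    for x in nums:
        tail_len = x.bit_length() - head.bit_length()
        tail = x ^ (head << tail_len)    # the bits after the head
        significant = tail.bit_length()  # from the first 1 to the end
        base_cost += significant
        zero_runs.append(tail_len - significant)

    # Try every candidate count k of zeros appended to the head.
    best_k = min(range(max(zero_runs) + 1),
                 key=lambda k: sum(abs(k - z) for z in zero_runs))
    extra = sum(abs(best_k - z) for z in zero_runs)
    return base_cost + extra, head << best_k


print(min_operations([3, 6, 7]))  # (2, 3)
print(min_operations([0b10101001, 0b101011, 0b10111, 0b1010001]))  # (12, 10)
```

In the second call the target 10 is 1010 in binary, matching the worked example.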
As the other answer explains, backtracking is not necessary. For the fun of it, here is a little implementation of that approach (see the link to run it online at the bottom):
First we need a function that determines the number of binary digits in a number:
def getLength(i: Int): Int = {
  @annotation.tailrec
  def rec(i: Int, result: Int): Int =
    if(i > 0)
      rec(i >> 1, result + 1)
    else
      result
  rec(i, 0)
}
Then we need a function that determines the common prefix of two numbers of equal length
@annotation.tailrec
def getPrefix(i: Int, j: Int): Int =
  if(i == j) i
  else getPrefix(i >> 1, j >> 1)
And of a list of arbitrary numbers:
def getPrefix(is: List[Int]): Int = is.reduce((x,y) => {
  val shift = Math.abs(getLength(x) - getLength(y))
  val x2 = Math.max(x,y)
  val y2 = Math.min(x,y)
  getPrefix((x2 >> shift), y2)
})
Then we need the length of the suffix without counting leading zeros of the suffix:
def getSuffixLength(i: Int, prefix: Int) = {
  val suffix = i ^ (prefix << (getLength(i) - getLength(prefix)))
  getLength(suffix)
}
Now we can compute the number of operations we need to synchronize a number i to the prefix with "zeros" zeros appended.
def getOperations(i: Int, prefix: Int, zeros: Int): Int = {
  val length = getLength(i) - getLength(prefix)
  val suffixLength = getSuffixLength(i, prefix)
  suffixLength + Math.abs(zeros - length + suffixLength)
}
Now we can find the minimal numbers of operations and return that together with the value we will sync to:
def getMinOperations(is: List[Int]) = {
  val prefix = getPrefix(is)
  val maxZeros = getLength(is.max) - getLength(prefix)
  (0 to maxZeros).map{zeros => (is.map{getOperations(_, prefix, zeros)}.sum, prefix << zeros)}.minBy(_._1)
}
You can try this solution at:
http://goo.gl/lLr5jl
The last step of finding the right number of zeros can be improved, as only the length of a suffix without leading zeros matters, not what it looks like. So we can compute the number of operations we need for these together by counting how many there are:
def getSuffixLength(i: Int, prefix: Int) = {
  val suffix = i ^ (prefix << (getLength(i) - getLength(prefix)))
  getLength(suffix)
}
def getMinOperations(is: List[Int]) = {
  val prefix = getPrefix(is)
  val maxZeros = getLength(is.max) - getLength(prefix)
  val baseCosts = is.map(getSuffixLength(_,prefix)).sum
  val suffixLengths: List[(Int, Int)] = is.foldLeft(Map[Int, Int]()){
    case (m,i) => {
      val x = getSuffixLength(i,prefix) - getLength(i) + getLength(prefix)
      m.updated(x, 1 + m.getOrElse(x, 0))
    }
  }.toList
  val (minOp, minSol) = (0 to maxZeros).map{zeros => (suffixLengths.map{
    case (x, count) => count * Math.abs(zeros + x)
  }.sum, prefix << zeros)}.minBy(_._1)
  (minOp + baseCosts, minSol)
}
All auxiliary operations only take logarithmic time in the size of the maximal number. We have to go through the whole list to collect the suffix lengths. And then we have to try each candidate number of zeros, of which there are at most logarithmically many in the maximal number. So we should have a complexity of
O(|list|*ld(maxNum) + (ld(maxNum))^2)
So for your bounds this is basically linear in the input size.
This version can be found here:
http://goo.gl/ijzYik
