Minimum number of moves required to get a permutation of a int of array? - arrays

You have a sequence of d[0] , d[1], d[2] , d[3] ,..,d[n]. In each move you are allowed to increase any d[i] by 1 or 2 or 5 i:0 to n .What is the minimum number of moves required to transform the sequence to permutation of [1,2,3,..,n] if it's possible else return -1. 1<=n<=1000
My approach is sort the given array in ascending array than count it by adding 1 or 2 or 5 . But it fails in many cases .Some of my classmates did this in exam using this method but they read question wrong so read question carefully .
e.g. [1,1,3,2,1] than answer is 4 since We can get [1,2,5,4,3 ] by adding 0,1,2,2,2 respectively so answer is 4 .
[1,2,3,4,1] => [1,1,2,3,4] we will get 4 using sorting method [0,1,1,1,1] but answer is 2 since we can add [2+2] in 1 to get [1,2,3,4,5] .
similarly
[1,2,3,1] =>[1,1,2,3] to [1,2,3,4] required 3 transformation but answer is 2 since by adding [1+2] to 1 we can get [1,2,3,4].
Another method can be used as but i don't have any proof for correctness .
Algorithm
input "n" is number of element , array "a" which contains input element
initialize cnt = 0 ;
initialize boolarray[n] ={0};
1. for i=0...n boolarray[a[i]]=1;
2. put all element in sorted order whose boolarray[a[i]]=0 for i=0...n
3. Now make boolarray[a[i]]=1; for i=0..n and count
how many additions are required .
4. return count ;
According to me this question will be result in 0 or more always since any number can be produced using 1 , 2 and 5 except this case when any d[i] i=0..n is greater than number of Inputs .
How to solve this correctly ?
Any answer and suggestions are welcome .

Your problem can be converted in weighted bipartite matching problem :-
first part p1 of graph are the current array numbers as nodes.
second part p2 of graph are numbers 1 to n.
There is edge between node of p1 to node p2 if we can add 1,2,5 to it to make node in p2.
weighted bipartite matching can be solved using the hungarian algorithm
Edit :-
If you are evaluating minimum number of move then you can use unweighted bipartite matching . You can use hopcroft-karp algorithm which runs in O(n^1.5) in your case as number of edges E = O(n) in the graph.

Create an array count which contains the count of how often we have a specific number in our base array
input 1 1 3 2 1
count 3 1 1 0 0
now walk over this array and calculate the steps
sum = 0
for i: 1..n
while count[i] > 1 // as long as we have spare numbers
missing = -1 // find the biggest empty spot which is bigger than the number at i
for x: n..i+1 // look for the biggest missing
if count[x] > 0 continue // this one is not missing
missing = x
break;
if missing == -1 return -1 // no empty spot found
sum += calcCost(i, missing)
count[i]--
count[missing]++
return sum
calcCost must be greedy

Related

Daily Coding Problem 260 : Reconstruct a jumbled array - Intuition?

I'm going through the question below.
The sequence [0, 1, ..., N] has been jumbled, and the only clue you have for its order is an array representing whether each number is larger or smaller than the last. Given this information, reconstruct an array that is consistent with it.
For example, given [None, +, +, -, +], you could return [1, 2, 3, 0, 4].
I went through the solution on this post but still unable to understand it as to why this solution works. I don't think I would be able to come up with the solution if I had this in front of me during an interview. Can anyone explain the intuition behind it? Thanks in advance!
This answer tries to give a general strategy to find an algorithm to tackle this type of problems. It is not trying to prove why the given solution is correct, but lying out a route towards such a solution.
A tried and tested way to tackle this kind of problem (actually a wide range of problems), is to start with small examples and work your way up. This works for puzzles, but even so for problems encountered in reality.
First, note that the question is formulated deliberately to not point you in the right direction too easily. It makes you think there is some magic involved. How can you reconstruct a list of N numbers given only the list of plusses and minuses?
Well, you can't. For 10 numbers, there are 10! = 3628800 possible permutations. And there are only 2⁹ = 512 possible lists of signs. It's a very huge difference. Most original lists will be completely different after reconstruction.
Here's an overview of how to approach the problem:
Start with very simple examples
Try to work your way up, adding a bit of complexity
If you see something that seems a dead end, try increasing complexity in another way; don't spend too much time with situations where you don't see progress
While exploring alternatives, revisit old dead ends, as you might have gained new insights
Try whether recursion could work:
given a solution for N, can we easily construct a solution for N+1?
or even better: given a solution for N, can we easily construct a solution for 2N?
Given a recursive solution, can it be converted to an iterative solution?
Does the algorithm do some repetitive work that can be postponed to the end?
....
So, let's start simple (writing 0 for the None at the start):
very short lists are easy to guess:
'0++' → 0 1 2 → clearly only one solution
'0--' → 2 1 0 → only one solution
'0-+' → 1 0 2 or 2 0 1 → hey, there is no unique outcome, though the question only asks for one of the possible outcomes
lists with only plusses:
'0++++++' → 0 1 2 3 4 5 6 → only possibility
lists with only minuses:
'0-------'→ 7 6 5 4 3 2 1 0 → only possibility
lists with one minus, the rest plusses:
'0-++++' → 1 0 2 3 4 5 or 5 0 1 2 3 4 or ...
'0+-+++' → 0 2 1 3 4 5 or 5 0 1 2 3 4 or ...
→ no very obvious pattern seem to emerge
maybe some recursion could help?
given a solution for N, appending one sign more?
appending a plus is easy: just repeat the solution and append the largest plus 1
appending a minus, after some thought: increase all the numbers by 1 and append a zero
→ hey, we have a working solution, but maybe not the most efficient one
the algorithm just appends to an existing list, no need to really write it recursively (although the idea is expressed recursively)
appending a plus can be improved, by storing the largest number in a variable so it doesn't need to be searched at every step; no further improvements seem necessary
appending a minus is more troublesome: the list needs to be traversed with each append
what if instead of appending a zero, we append -1, and do the adding at the end?
this clearly works when there is only one minus
when two minus signs are encountered, the first time append -1, the second time -2
→ hey, this works for any number of minuses encountered, just store its counter in a variable and sum with it at the end of the algorithm
This is in bird's eye view one possible route towards coming up with a solution. Many routes lead to Rome. Introducing negative numbers might seem tricky, but it is a logical conclusion after contemplating the recursive algorithm for a while.
It works because all changes are sequential, either adding one or subtracting one, starting both the increasing and the decreasing sequences from the same place. That guarantees we have a sequential list overall. For example, given the arbitrary
[None, +, -, +, +, -]
turned vertically for convenience, we can see
None 0
+ 1
- -1
+ 2
+ 3
- -2
Now just shift them up by two (to account for -2):
2 3 1 4 5 0
+ - + + -
Let's look at first to a solution which (I think) is easier to understand, formalize and demonstrate for correctness (but I will only explain it and not demonstrate in a formal way):
We name A[0..N] our input array (where A[k] is None if k = 0 and is + or - otherwise) and B[0..N] our output array (where B[k] is in the range [0, N] and all values are unique)
At first we see that our problem (find B such that B[k] > B[k-1] if A[k] == + and B[k] < B[k-1] if A[k] == -) is only a special case of another problem:
Find B such that B[k] == max(B[0..k]) if A[k] == + and B[k] == min(B[0..k]) if A[k] == -.
Which generalize from "A value must larger or smaller than the last" to "A value must be larger or smaller than everyone before it"
So a solution to this problem is a solution to the original one as well.
Now how do we approach this problem?
A greedy solution will be sufficient, indeed is easy to demonstrate that the value associated with the last + will be the biggest number in absolute (which is N), the one associated with the second last + will be the second biggest number in absolute (which is N-1) ecc...
And in the same time the value associated with the last - will be the smallest number in absolute (which is 0), the one associated with the second last - will be the second smallest (which is 1) ecc...
So we can start filling B from right to left remembering how many + we have seen (let's call this value X), how many - we have seen (let's call this value Y) and looking at what is the current symbol, if it is a + in B we put N-X and we increase X by 1 and if it is a - in B we put 0+Y and we increase Y by 1.
In the end we'll need to fill B[0] with the only remaining value which is equal to Y+1 and to N-X-1.
An interesting property of this solution is that if we look to only the values associated with a - they will be all the values from 0 to Y (where in this case Y is the total number of -) sorted in reverse order; if we look to only the values associated with a + they will be all the values from N-X to N (where in this case X is the total number of +) sorted and if we look at B[0] it will always be Y+1 and N-X-1 (which are equal).
So the - will have all the values strictly smaller than B[0] and reverse sorted and the + will have all the values strictly bigger than B[0] and sorted.
This property is the key to understand why the solution proposed here works:
It consider B[0] equals to 0 and than it fills B following the property, this isn't a solution because the values are not in the range [0, N], but it is possible with a simple translation to move the range and arriving to [0, N]
The idea is to produce a permutation of [0,1...N] which will follow the pattern of [+,-...]. There are many permutations which will be applicable, it isn't a single one. For instance, look the the example provided:
[None, +, +, -, +], you could return [1, 2, 3, 0, 4].
But you also could have returned other solutions, just as valid: [2,3,4,0,1], [0,3,4,1,2] are also solutions. The only concern is that you need to have the first number having at least two numbers above it for positions [1],[2], and leave one number in the end which is lower then the one before and after it.
So the question isn't finding the one and only pattern which is scrambled, but to produce any permutation which will work with these rules.
This algorithm answers two questions for the next member of the list: get a number who’s both higher/lower from previous - and get a number who hasn’t been used yet. It takes a starting point number and essentially create two lists: an ascending list for the ‘+’ and a descending list for the ‘-‘. This way we guarantee that the next member is higher/lower than the previous one (because it’s in fact higher/lower than all previous members, a stricter condition than the one required) and for the same reason we know this number wasn’t used before.
So the intuition of the referenced algorithm is to start with a referenced number and work your way through. Let's assume we start from 0. The first place we put 0+1, which is 1. we keep 0 as our lowest, 1 as the highest.
l[0] h[1] list[1]
the next symbol is '+' so we take the highest number and raise it by one to 2, and update both the list with a new member and the highest number.
l[0] h[2] list [1,2]
The next symbol is '+' again, and so:
l[0] h[3] list [1,2,3]
The next symbol is '-' and so we have to put in our 0. Note that if the next symbol will be - we will have to stop, since we have no lower to produce.
l[0] h[3] list [1,2,3,0]
Luckily for us, we've chosen well and the last symbol is '+', so we can put our 4 and call is a day.
l[0] h[4] list [1,2,3,0,4]
This is not necessarily the smartest solution, as it can never know if the original number will solve the sequence, and always progresses by 1. That means that for some patterns [+,-...] it will not be able to find a solution. But for the pattern provided it works well with 0 as the initial starting point. If we chose the number 1 is would also work and produce [2,3,4,0,1], but for 2 and above it will fail. It will never produce the solution [0,3,4,1,2].
I hope this helps understanding the approach.
This is not an explanation for the question put forward by OP.
Just want to share a possible approach.
Given: N = 7
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Go from 0 to N
[1] fill all '-' starting from right going left.
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Answer: 2 1 0
[2] fill all the vacant places i.e [X & +] starting from left going right.
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Answer: 3 4 5 6 7
Final:
Pattern: X + - + - + - + //X = None
Answer: 3 4 2 5 1 6 0 7
My answer definitely is too late for your problem but if you need a simple proof, you probably would like to read it:
+min_last or min_so_far is a decreasing value starting from 0.
+max_last or max_so_far is an increasing value starting from 0.
In the input, each value is either "+" or "-" and for each increase the value of max_so_far or decrease the value of min_so_far by one respectively, excluding the first one which is None. So, abs(min_so_far, max_so_far) is exactly equal to N, right? But because you need the range [0, n] but max_so_far and min_so_far now are equal to the number of "+"s and "-"s with the intersection part with the range [0, n] being [0, max_so_far], what you need to do is to pad it the value equal to min_so_far for the final solution (because min_so_far <= 0 so you need to take each value of the current answer to subtract by min_so_far or add by abs(min_so_far)).

Number of intervals that contains a given query point

I know a similar question exists here. My question is also same that I have N intervals (some possibly overlapping some even same). Then Q point queries are given and I need to tell how many intervals contains this point.
I tried to develop my algorithm by sorting the end point array then counting the number of overlapped interval by +1, -1 trick as mentioned in an answer. But after performing the binary search what I should do? Because its not always the case that the corresponding index of the prefix sum array is the answer.
e.g.
Intervals are : [1,4] [5,7] [6,10] [7,13]
sorted end point array : [1,4,5,6,7,7,10,13]
+1/-1 array : [1,-1,1,1,1,-1,-1,-1]
prefix sum array : [1,0,1,2,3,2,1,0]
Query : 10
my algorithm gives 1 (corresponding prefix array)
but actual ans should be 2.
How should I fix my algorithm?
There are no good answers in the question you linked, so:
First:
Put the entry and exit positions of each interval into separate arrays. (if you are using closed intervals then exit position is end position + 1, i.e., in [4,6], entry is 4, exit is 7.
Sort the arrays.
Then, for each point p:
Binary search in the entry array to find the number of entry positions <= p.
Binary search in the exit array to find the number of exit positions <= p.
The number of intervals that contain the point is entry_count - exit_count
NOTE that the number of positions <= p is the index of the first element > p. See: Where is the mistake in my code to perform Binary Search? to help you get that search right.
For your example:
Intervals: [1,4], [5,7], [6,10], [7,13]
Entry positions: [1,5,6,7]
Exit positions: [5,8,11,14]
Entry positions <= 6: 3
Exit positions <= 6: 1
Intervals that contains 6: 3-1 = 2
Problem is your intervals are [] instead of [), and the answer probably was made for the latter . First transform your end indendexes to value -1.
After this + "compressing" repeated coordinates you should have:
points = [1,5,6,7,8,11,14]
sums = [1,0,1,1,-1,-1,-1]
accumulated = [1,1,2,3,2,1,0]
Then for a query, if query < points[0] or query > points[max] return 0. If not, binary search over points to get index and the answer lies on accumulated[index].

Longest sub array in rotating array

Is there any way to find the longest subarray of 1's in log(n) time?
example:
110011111000 - then the output is 5 (from pos 5 to 10)
1110011101 - the output here is 3 but after rotation 1 the array becomes 111100111 and the output is now 4.
001111 - the output here is 4 from pos 3 to 6 but after rotation it becomes 3 from pos 4 to 6
Note: I found the longest subarray length in O(n) before rotation. How can I improve the solution after rotation if I have the past results?
You can follow those steps:
find the longest subarray of 1 (in O(n)). During that loop find the first and last instance of 0.
Start rotating and update those parameters.
If the last index is 0 search for the previous 1 to keep updated the parameters (at complexity this will not go over O(n) total.
I will assume you know how to get max subarray index and count in O(n). In addition to that, you will need to find the second largest subarray (can be done in the same loop).
Let extract 2 more params - first and last zeros (Simple code - I can attach it if you need )
When you rotate the array there are 3 option:
Nothing change
Bigger subarray created
Current biggest subarray breaks
In the first and second cases - you only need to update the params - O(1) - you can know this is the case according your params. In the third, you will need to use the second longest subarray you find (notice that only 1 subarray can be break at a time)
For example, consider you have array: 1110011101 (as your example) and you have max = 3 and maxIndex = 5. After running the getZeroIndexs function you also know that firstZeroIndex = 3 and lastZeroIndex = 8.
How our var will look like after rotate?
max = 3
maxIndex = 6
firstZeroIndex = 4 // this will increase as long as the lastZeroIndex different then the array size
lastZeroIndex = 9 //this will go up till array.size - when it those you need to loop again till you find last
In this case, the first index move to 4 whats make him bigger then max -> max = 4 and maxIndex = 0.
Now your array is : 1111001110 so lastZeroIndex = 9 as the array size so next rotation will yield:
max = 4
maxIndex = 1
firstZeroIndex = 0
lastZeroIndex = ? // here you need to loop from the end of your array till you get 0 -> O(k) in total complexity, all the run together will be O(n)
Hope it clear, if not feel free to ask!
No, because you have to know every letter value to count 1s, which is O(n) at least.

Minimum number of adjacent swaps

This is one of the question from online written test.
Books numbered from (1...N) have arrived to a warehouse.
The books are said to be best arranged if a book “i” is present only to the left of book “i+1” (for all i, 1 <= i <= N-1) and that book N is present to the left of book 1. [Yes! Any cyclic sorted sequence is a best arrangement]
Books received are in a random order.Now your task is to find out the minimal number of moves required to achieve the best arrangement described above.
Note that only valid move is to choose a pair of adjacent books and have them switch places.
For Example if the books were initially in the order 3 5 4 2 1
Solution can be
a. First swap the second pair of books: { result : 3 4 5 2 1 }
b. Swap the rightmost pair: { result : 3 4 5 1 2 }
So, in 2 moves we achieve the best arrangement.
I tried but not able to find out solution for this.First I though that i will divide the array in two arrays and then I will apply insertion sort on both the arrays but that is also not working.
Please help me to find out a algo for this question.
N,1 can be anywhere in the sequence. eg 1..5, could be 3,4,5,1,2. So the first digit could be 1..5, ie 5x as complicated as Previous question. So, you'll have to do it 5 times. Use a sort algorithm that has a replaceable compare function.
So for the 3rd test the compare would be:-
// returns <0, 0 or >0
int compare(a,b){
return ((b+N-3)%N) - ((a+N-3)%N);
}

Algorithm to split an array into P subarrays of balanced sum

I have an big array of length N, let's say something like:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
I need to split this array into P subarrays (in this example, P=4 would be reasonable), such that the sum of the elements in each subarray is as close as possible to sigma, being:
sigma=(sum of all elements in original array)/P
In this example, sigma=15.
For the sake of clarity, one possible result would be:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
(sums: 12,19,14,15)
I have written a very naive algorithm based in how I would do the divisions by hand, but I don't know how to impose the condition that a division whose sums are (14,14,14,14,19) is worse than one that is (15,14,16,14,16).
Thank you in advance.
First, let’s formalize your optimization problem by specifying the input, output, and the measure for each possible solution (I hope this is in your interest):
Given an array A of positive integers and a positive integer P, separate the array A into P non-overlapping subarrays such that the difference between the sum of each subarray and the perfect sum of the subarrays (sum(A)/P) is minimal.
Input: Array A of positive integers; P is a positive integer.
Output: Array SA of P non-negative integers representing the length of each subarray of A where the sum of these subarray lengths is equal to the length of A.
Measure: abs(sum(sa)-sum(A)/P) is minimal for each sa ∈ {sa | sa = (Ai, …, Ai+‍SAj) for i = (Σ SAj), j from 0 to P-1}.
The input and output define the set of valid solutions. The measure defines a measure to compare multiple valid solutions. And since we’re looking for a solution with the least difference to the perfect solution (minimization problem), measure should also be minimal.
With this information, it is quite easy to implement the measure function (here in Python):
def measure(a, sa):
sigma = sum(a)/len(sa)
diff = 0
i = 0
for j in xrange(0, len(sa)):
diff += abs(sum(a[i:i+sa[j]])-sigma)
i += sa[j]
return diff
print measure([2,4,6,7,6,3,3,3,4,3,4,4,4,3,3,1], [3,4,4,5]) # prints 8
Now finding an optimal solution is a little harder.
We can use the Backtracking algorithm for finding valid solutions and use the measure function to rate them. We basically try all possible combinations of P non-negative integer numbers that sum up to length(A) to represent all possible valid solutions. Although this ensures not to miss a valid solution, it is basically a brute-force approach with the benefit that we can omit some branches that cannot be any better than our yet best solution. E.g. in the example above, we wouldn’t need to test solutions with [9,…] (measure > 38) if we already have a solution with measure ≤ 38.
Following the pseudocode pattern from Wikipedia, our bt function looks as follows:
def bt(c):
global P, optimum, optimum_diff
if reject(P,c):
return
if accept(P,c):
print "%r with %d" % (c, measure(P,c))
if measure(P,c) < optimum_diff:
optimum = c
optimum_diff = measure(P,c)
return
s = first(P,c)
while s is not None:
bt(list(s))
s = next(P,s)
The global variables P, optimum, and optimum_diff represent the problem instance holding the values for A, P, and sigma, as well as the optimal solution and its measure:
class MinimalSumOfSubArraySumsProblem:
def __init__(self, a, p):
self.a = a
self.p = p
self.sigma = sum(a)/p
Next we specify the reject and accept functions that are quite straight forward:
def reject(P,c):
return optimum_diff < measure(P,c)
def accept(P,c):
return None not in c
This simply rejects any candidate whose measure is already more than our yet optimal solution. And we’re accepting any valid solution.
The measure function is also slightly changed due to the fact that c can now contain None values:
def measure(P, c):
diff = 0
i = 0
for j in xrange(0, P.p):
if c[j] is None:
break;
diff += abs(sum(P.a[i:i+c[j]])-P.sigma)
i += c[j]
return diff
The remaining two function first and next are a little more complicated:
def first(P,c):
t = 0
is_complete = True
for i in xrange(0, len(c)):
if c[i] is None:
if i+1 < len(c):
c[i] = 0
else:
c[i] = len(P.a) - t
is_complete = False
break;
else:
t += c[i]
if is_complete:
return None
return c
def next(P,s):
t = 0
for i in xrange(0, len(s)):
t += s[i]
if i+1 >= len(s) or s[i+1] is None:
if t+1 > len(P.a):
return None
else:
s[i] += 1
return s
Basically, first either replaces the next None value in the list with either 0 if it’s not the last value in the list or with the remainder to represent a valid solution (little optimization here) if it’s the last value in the list, or it return None if there is no None value in the list. next simply increments the rightmost integer by one or returns None if an increment would breach the total limit.
Now all you need is to create a problem instance, initialize the global variables and call bt with the root:
P = MinimalSumOfSubArraySumsProblem([2,4,6,7,6,3,3,3,4,3,4,4,4,3,3,1], 4)
optimum = None
optimum_diff = float("inf")
bt([None]*P.p)
If I am not mistaken here, one more approach is dynamic programming.
You can define P[ pos, n ] as the smallest possible "penalty" accumulated up to position pos if n subarrays were created. Obviously there is some position pos' such that
P[pos', n-1] + penalty(pos', pos) = P[pos, n]
You can just minimize over pos' = 1..pos.
The naive implementation will run in O(N^2 * M), where N - size of the original array and M - number of divisions.
#Gumbo 's answer is clear and actionable, but consumes lots of time when length(A) bigger than 400 and P bigger than 8. This is because that algorithm is kind of brute-forcing with benefits as he said.
In fact, a very fast solution is using dynamic programming.
Given an array A of positive integers and a positive integer P, separate the array A into P non-overlapping subarrays such that the difference between the sum of each subarray and the perfect sum of the subarrays (sum(A)/P) is minimal.
Measure: , where is sum of elements of subarray , is the average of P subarray' sums.
This can make sure the balance of sum, because it use the definition of Standard Deviation.
Persuming that array A has N elements; Q(i,j) means the minimum Measure value when split the last i elements of A into j subarrays. D(i,j) means (sum(B)-sum(A)/P)^2 when array B consists of the i~jth elements of A ( 0<=i<=j<N ).
The minimum measure of the question is to calculate Q(N,P). And we find that:
Q(N,P)=MIN{Q(N-1,P-1)+D(0,0); Q(N-2,P-1)+D(0,1); ...; Q(N-1,P-1)+D(0,N-P)}
So it like can be solved by dynamic programming.
Q(i,1) = D(N-i,N-1)
Q(i,j) = MIN{ Q(i-1,j-1)+D(N-i,N-i);
Q(i-2,j-1)+D(N-i,N-i+1);
...;
Q(j-1,j-1)+D(N-i,N-j)}
So the algorithm step is:
1. Cal j=1:
Q(1,1), Q(2,1)... Q(3,1)
2. Cal j=2:
Q(2,2) = MIN{Q(1,1)+D(N-2,N-2)};
Q(3,2) = MIN{Q(2,1)+D(N-3,N-3); Q(1,1)+D(N-3,N-2)}
Q(4,2) = MIN{Q(3,1)+D(N-4,N-4); Q(2,1)+D(N-4,N-3); Q(1,1)+D(N-4,N-2)}
... Cal j=...
P. Cal j=P:
Q(P,P), Q(P+1,P)...Q(N,P)
The final minimum Measure value is stored as Q(N,P)!
To trace each subarray's length, you can store the
MIN choice when calculate Q(i,j)=MIN{Q+D...}
space for D(i,j);
time for calculate Q(N,P)
compared to the pure brute-forcing algorithm consumes time.
Working code below (I used php language). This code decides part quantity itself;
$main = array(2,4,6,1,6,3,2,3,4,3,4,1,4,7,3,1,2,1,3,4,1,7,2,4,1,2,3,1,1,1,1,4,5,7,8,9,8,0);
$pa=0;
for($i=0;$i < count($main); $i++){
$p[]= $main[$i];
if(abs(15 - array_sum($p)) < abs(15 - (array_sum($p)+$main[$i+1])))
{
$pa=$pa+1;
$pi[] = $i+1;
$pc = count($pi);
$ba = $pi[$pc-2] ;
$part[$pa] = array_slice( $main, $ba, count($p));
unset($p);
}
}
print_r($part);
for($s=1;$s<count($part);$s++){
echo '<br>';
echo array_sum($part[$s]);
}
code will output part sums like as below
13
14
16
14
15
15
17
I'm wondering whether the following would work:
Go from the left, as soon as sum > sigma, branch into two, one including the value that pushes it over, and one that doesn't. Recursively process data to the right with rightSum = totalSum-leftSum and rightP = P-1.
So, at the start, sum = 60
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
Then for 2 4 6 7, sum = 19 > sigma, so split into:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
Then we process 7 6 3 3 3 4 3 4 4 4 3 3 1 and 6 3 3 3 4 3 4 4 4 3 3 1 with P = 4-1 and sum = 60-12 and sum = 60-19 respectively.
This results in, I think, O(P*n).
It might be a problem when 1 or 2 values is by far the largest, but, for any value >= sigma, we can probably just put that in it's own partition (preprocessing the array to find these might be the best idea (and reduce sum appropriately)).
If it works, it should hopefully minimise sum-of-squared-error (or close to that), which seems like the desired measure.
I propose an algorithm based on backtracking. The main function chosen randomly select an element from the original array and adds it to an array partitioned. For each addition will check to obtain a better solution than the original. This will be achieved by using a function that calculates the deviation, distinguishing each adding a new element to the page. Anyway, I thought it would be good to add an original variables in loops that you can not reach desired solution will force the program ends. By desired solution I means to add all elements with respect of condition imposed by condition from if.
sum=CalculateSum(vector)
Read P
sigma=sum/P
initialize P vectors, with names vector_partition[i], i=1..P
list_vector initialize a list what pointed this P vectors
initialize a diferences_vector with dimension of P
//that can easy visualize like a vector of vectors
//construct a non-recursive backtracking algorithm
function Deviation(vector) //function for calculate deviation of elements from a vector
{
dev=0
for i=0 to Size(vector)-1 do
dev+=|vector[i+1]-vector[i]|
return dev
}
iteration=0
//fix some maximum number of iteration for while loop
Read max_iteration
//as the number of iterations will be higher the more it will get
//a more accurate solution
while(!IsEmpty(vector))
{
for i=1 to Size(list_vector) do
{
if(IsEmpty(vector)) break from while loop
initial_deviation=Deviation(list_vector[i])
el=SelectElement(vector) //you can implement that function using a randomized
//choice of element
difference_vector[i]=|sigma-CalculateSum(list_vector[i])|
PutOnBackVector(vector_list[i], el)
if(initial_deviation>Deviation(difference_vector))
ExtractFromBackVectorAndPutOnSecondVector(list_vector, vector)
}
iteration++
//prevent to enter in some infinite loop
if (iteration>max_iteration) break from while loop
}
You can change this by adding in first if some code witch increment with a amount the calculated deviation.
aditional_amount=0
iteration=0
while
{
...
if(initial_deviation>Deviation(difference_vector)+additional_amount)
ExtractFromBackVectorAndPutOnSecondVector(list_vector, vector)
if(iteration>max_iteration)
{
iteration=0
aditional_amout+=1/some_constant
}
iteration++
//delete second if from first version
}
Your problem is very similar to, or the same as, the minimum makespan scheduling problem, depending on how you define your objective. In the case that you want to minimize the maximum |sum_i - sigma|, it is exactly that problem.
As referenced in the Wikipedia article, this problem is NP-complete for p > 2. Graham's list scheduling algorithm is optimal for p <= 3, and provides an approximation ratio of 2 - 1/p. You can check out the Wikipedia article for other algorithms and their approximation.
All the algorithms given on this page are either solving for a different objective, incorrect/suboptimal, or can be used to solve any problem in NP :)
This is very similar to the case of the one-dimensional bin packing problem, see http://www.cs.sunysb.edu/~algorith/files/bin-packing.shtml. In the associated book, The Algorithm Design Manual, Skienna suggests a first-fit decreasing approach. I.e. figure out your bin size (mean = sum / N), and then allocate the largest remaining object into the first bin that has room for it. You either get to a point where you have to start over-filling a bin, or if you're lucky you get a perfect fit. As Skiena states "First-fit decreasing has an intuitive appeal to it, for we pack the bulky objects first and hope that little objects can fill up the cracks."
As a previous poster said, the problem looks like it's NP-complete, so you're not going to solve it perfectly in reasonable time, and you need to look for heuristics.
I recently needed this and did as follows;
create an initial sub-arrays array of length given sub arrays count. sub arrays should have a sum property too. ie [[sum:0],[sum:0]...[sum:0]]
sort the main array descending.
search for the sub-array with the smallest sum and insert one item from main array and increment the sub arrays sum property by the inserted item's value.
repeat item 3 up until the end of main array is reached.
return the initial array.
This is the code in JS.
function groupTasks(tasks,groupCount){
var sum = tasks.reduce((p,c) => p+c),
initial = [...Array(groupCount)].map(sa => (sa = [], sa.sum = 0, sa));
return tasks.sort((a,b) => b-a)
.reduce((groups,task) => { var group = groups.reduce((p,c) => p.sum < c.sum ? p : c);
group.push(task);
group.sum += task;
return groups;
},initial);
}
var tasks = [...Array(50)].map(_ => ~~(Math.random()*10)+1), // create an array of 100 random elements among 1 to 10
result = groupTasks(tasks,7); // distribute them into 10 sub arrays with closest sums
console.log("input array:", JSON.stringify(tasks));
console.log(result.map(r=> [JSON.stringify(r),"sum: " + r.sum]));
You can use Max Flow algorithm.

Resources