Do loop with condition - segmentation fault

I am doing a project on particle dynamics and I started by letting a particle (a sphere) fall from a certain height towards a fixed particle in the ground.
Inside a do loop (a time loop, from the initial time to a certain elapsed time with a certain time-step), I use Euler's method to integrate the positions and velocities, and I also calculate the forces (gravitational and elastic) and the collision conditions.
This model will later be generalized to 3, 4, ..., n particles (on the scale of hundreds of thousands), so I am using arrays to index the particles whose positions and velocities I integrate over time. To that end, I put a do loop over the particles, from 1 to N (the number of particles), inside the time loop, and set N to 2 (since in this case I have only two particles). This is where the segmentation fault comes from: I tell it to calculate three things when I have only specified two.
While trying to fix it, I ran into the following. When I define the parameters for i and i+1, then for i=2, i+1 = 2+1 = 3 will be calculated, but I do not have a third particle. Similarly, if I use i-1 and i instead, then for i = 1 (where the loop starts), i-1 = 0, which doesn't make sense, since I do not have a "0th" particle. In another attempt, I changed the loop from 1,N to 1,N-1, but since N=2, the loop then stops at 1 and nothing is calculated for particle 2. I have also thought about printing my results in twos, that is, for particles 1 and 2, 2 and 3, 3 and 4, and so on (calculating i AND i+1 simultaneously for each integration, which makes the run time longer and will cost me a lot of time later, since these simulations can take weeks for a large number of particles). But if I write that to files, the file creation will be repeated for all particles except the first and the last (even more time wasted). How can I run it considering only the first two particles, while generalizing to any number of particles that I choose?
do t = tmin, tmax, dt
  do i = 1,N
    call contact (xold(i), xold(i+1), r(i), r(i+1))
    call forces (m(i), g, k, r(i), r(i+1), xold(i), xold(i+1))
    call euler(xold(i), xnew(i), vold(i), vnew(i), dt, F(i), m(i))
    write(i, *), "t=", t, "x=", xold(i), "v=", vold(i), "dx=", dx, "force=", F(i)
  end do
end do

I am not quite sure what exactly you would like to achieve. For a particle simulation in which each particle interacts with every other particle, you would need a second loop, wouldn't you?
Somewhat like this:
do i = 1,N
  do j=i+1,N
    call contact (xold(i), xold(j), r(i), r(j))
    call forces (m(i), g, k, r(i), r(j), xold(i), xold(j))
  end do
  call euler(xold(i), xnew(i), vold(i), vnew(i), dt, F(i), m(i))
  write(i, *), "t=", t, "x=", xold(i), "v=", vold(i), "dx=", dx, "force=", F(i)
end do
The inner loop will not be executed if i+1 > N, so everything should be fine. For N=2 you would just get one execution with i=1 and j=2.
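To see why the nested loop never indexes past N, here is a minimal Python sketch (not the asker's Fortran; `particle_pairs` is a hypothetical helper introduced only for illustration) that lists the 1-based (i, j) pairs the two loops above would visit:

```python
def particle_pairs(n):
    """Return the 1-based (i, j) pairs visited by `do i = 1,N` / `do j = i+1,N`."""
    pairs = []
    for i in range(1, n + 1):          # do i = 1, N
        for j in range(i + 1, n + 1):  # do j = i+1, N (empty when i == N)
            pairs.append((i, j))
    return pairs


print(particle_pairs(2))  # [(1, 2)]
print(particle_pairs(3))  # [(1, 2), (1, 3), (2, 3)]
```

Every j stays within 1..N, so no array element beyond the N-th is ever touched, and each unordered pair appears exactly once.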
Edit:
calculating i AND i+1 simultaneously for each integration, making the run time longer - which will cost me a lot of time later, since these simulations for a big number of particles can take weeks
You most likely do not want to do an all-to-all particle simulation for large numbers of particles. Most people use some kind of tree algorithm to speed this up considerably. Consider using an existing framework for that, like PEPC.

when i=2, i+1 = 2+1 = 3 will be calculated - but I do not have a third
particle. In a similar way, if I put i-1 and i instead, for i = 1
(where the loop starts), i-1 = 0, but that doesn't make sense, since I
do not have a "0th" particle
Modulo?
when i = 0, i % 2 = 0, (i % 2) + 1 = 1
when i = 1, i % 2 = 1, (i % 2) + 1 = 2
when i = 2, i % 2 = 0, (i % 2) + 1 = 1
when i = 3, i % 2 = 1, (i % 2) + 1 = 2
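The table above can be generalized to N particles with 1-based indices. The following is only a sketch in Python under that assumption; `next_particle` is a hypothetical name, and in Fortran the same expression would be written with the `mod` intrinsic as `mod(i, N) + 1`:

```python
def next_particle(i, n):
    """1-based index of the particle after i, wrapping from n back to 1."""
    return i % n + 1


for i in (1, 2):
    print(i, next_particle(i, 2))
# 1 2
# 2 1
```

Note that the wrap-around pairs the last particle with the first one, so for N = 2 the pair (1, 2) is visited twice, once from each side.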

How to find contiguous subarray of integers in an array from n arrays such that the sum of elements of such contiguous subarrays is minimum

Input: n arrays of integers of length p.
Output: An array of p integers built by copying contiguous subarrays of the input arrays into matching indices of the output, satisfying the following conditions.
At most one subarray is used from each input array.
Every index of the output array is filled from exactly one subarray.
The output array has the minimum possible sum.
Suppose I have 2 arrays:
[1,7,2]
[2,1,8]
For example, I can choose the subarray [1,7] from array 1 and the subarray [8] from array 2. These 2 subarrays are contiguous and do not overlap at any index, and no array contributes more than one subarray.
The number of elements in the collection is 2 + 1 = 3, which is the same as the length of each individual array (i.e. len(array 1), which is equal to 3). So this collection is valid.
The sum here for [1,7] and [8] is 1 + 7 + 8 = 16.
We have to find a collection of such subarrays such that the total sum of the elements of the subarrays is minimum.
A solution for the above 2 arrays would be the collection [2,1] from array 2 and [2] from array 1.
This is a valid collection and the sum is 2 + 1 + 2 = 5 which is the minimum sum for any such collection in this case.
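For tiny inputs the claimed minimum can be checked by exhaustive search. The sketch below is only a verification aid, not an efficient method; `min_sum_bruteforce` is a hypothetical name, and the function assumes all arrays have the same length. It assigns one source array to every output index and rejects assignments in which an array is used in two separate runs:

```python
from itertools import groupby, product


def min_sum_bruteforce(arrays):
    """Return (best_sum, choice), where choice[i] is the 0-based source array of index i."""
    n, p = len(arrays), len(arrays[0])
    best = None
    for choice in product(range(n), repeat=p):
        runs = [k for k, _ in groupby(choice)]
        if len(runs) != len(set(runs)):
            continue  # some array would contribute two separate subarrays
        total = sum(arrays[k][i] for i, k in enumerate(choice))
        if best is None or total < best[0]:
            best = (total, choice)
    return best


print(min_sum_bruteforce([[1, 7, 2], [2, 1, 8]]))  # (5, (1, 1, 0))
```

The result (1, 1, 0) means indices 0 and 1 come from the second array and index 2 from the first, giving 2 + 1 + 2 = 5.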
I cannot think of any optimal or correct approach, so I need help.
Some Ideas:
I tried a greedy approach: choose the minimum element across all arrays at each index. Since the index always increases after a valid choice (the subarrays do not overlap), I don't have to bother storing the indices of the minimum values for every array. But this approach is clearly not correct, since it can visit the same array twice.
Another method I thought was to start from the 0th index for all arrays and start storing their sum up to k elements for every array since the no. of arrays are finite, I can store the sum upto k elements in an array. Now I tried to take a minimum across these sums and for a "minimum sum", the corresponding subarray giving this sum (i.e. k such elements in that array) can be a candidate for a valid subarray of size k, thus if we take this subarray, we can add a k + 1-th element corresponding to every array into their corresponding sum and if the original minimum still holds, then we can keep on repeating this step. When the minima fail, we can consider the subarray up to the index for which minima holds and this will be a valid starting subarray. However, this approach will also clearly fail because there could exist another subarray of size < k giving minima along with remaining index elements from our subarray of size k.
Sorting is not possible either, since sorting would break the contiguity condition.
Of course, there is a brute force method too.
I suspect that working through a greedy approach might lead to some progress.
I have searched on other Stackoverflow posts, but couldn't find anything which could help my problem.

To get you started, here's a recursive branch-&-bound backtracking - and potentially exhaustive - search. Ordering heuristics can have a huge effect on how efficient these are, but without mounds of "real life" data to test against there's scant basis for picking one over another. This incorporates what may be the single most obvious ordering rule.
Because it's a work in progress, it prints stuff as it goes along: all solutions found, whenever they meet or beat the current best; and the index at which a search is cut off early, when that happens (because it becomes obvious that the partial solution at that point can't be extended to meet or beat the best full solution known so far).
For example,
>>> crunch([[5, 6, 7], [8, 0, 3], [2, 8, 7], [8, 2, 3]])
displays
new best
L2[0:1] = [2] 2
L1[1:2] = [0] 2
L3[2:3] = [3] 5
sum 5
cut at 2
L2[0:1] = [2] 2
L1[1:3] = [0, 3] 5
sum 5
cut at 2
cut at 2
cut at 2
cut at 1
cut at 1
cut at 2
cut at 2
cut at 2
cut at 1
cut at 1
cut at 1
cut at 0
cut at 0
So it found two ways to get a minimal sum 5, and the simple ordering heuristic was effective enough that all other paths to full solutions were cut off early.
def disp(lists, ixs):
    from itertools import groupby
    total = 0
    i = 0
    for k, g in groupby(ixs):
        j = i + len(list(g))
        chunk = lists[k][i:j]
        total += sum(chunk)
        print(f"L{k}[{i}:{j}] = {chunk} {total}")
        i = j

def crunch(lists):
    n = len(lists[0])
    assert all(len(L) == n for L in lists)
    # Start with a sum we know can be beat.
    smallest_sum = sum(lists[0]) + 1
    smallest_ixs = [None] * n
    ixsofar = [None] * n

    def inner(i, sumsofar, freelists):
        nonlocal smallest_sum
        assert sumsofar <= smallest_sum
        if i == n:
            print()
            if sumsofar < smallest_sum:
                smallest_sum = sumsofar
                smallest_ixs[:] = ixsofar
                print("new best")
            disp(lists, ixsofar)
            print("sum", sumsofar)
            return
        # Simple greedy heuristic: try available lists in the order
        # of smallest-to-largest at index i.
        for lix in sorted(freelists, key=lambda lix: lists[lix][i]):
            L = lists[lix]
            newsum = sumsofar
            freelists.remove(lix)
            # Try all slices in L starting at i.
            for j in range(i, n):
                newsum += L[j]
                # ">" to find all smallest answers;
                # ">=" to find just one (potentially faster)
                if newsum > smallest_sum:
                    print("cut at", j)
                    break
                ixsofar[j] = lix
                inner(j + 1, newsum, freelists)
            freelists.add(lix)

    inner(0, 0, set(range(len(lists))))
How bad is brute force?
Bad. A brute force way to compute it: say there are n lists each with p elements. The code's ixsofar vector contains p integers each in range(n). The only constraint is that all occurrences of any integer that appears in it must be consecutive. So a brute force way to compute the total number of such vectors is to generate all p-tuples and count the number that meet the constraints. This is woefully inefficient, taking O(n**p) time, but is really easy, so hard to get wrong:
def countb(n, p):
    from itertools import product, groupby
    result = 0
    seen = set()
    for t in product(range(n), repeat=p):
        seen.clear()
        for k, g in groupby(t):
            if k in seen:
                break
            seen.add(k)
        else:
            #print(t)
            result += 1
    return result
For small arguments, we can use that as a sanity check on the next function, which is efficient. This builds on common "stars and bars" combinatorial arguments to deduce the result:
def count(n, p):
    # n lists of length p
    # for r regions, r from 1 through min(p, n)
    # number of ways to split up: comb((p - r) + r - 1, r - 1)
    # for each, ff(n, r) ways to spray in list indices = comb(n, r) * r!
    from math import comb, prod
    total = 0
    for r in range(1, min(n, p) + 1):
        total += comb(p-1, r-1) * prod(range(n, n-r, -1))
    return total
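The sanity check mentioned above could be run roughly as follows. Both functions are repeated in condensed form so that the snippet stands alone; the ranges of n and p are an arbitrary choice, kept small because the brute-force count grows as n**p. `math.comb` and `math.prod` need Python 3.8 or later.

```python
from itertools import groupby, product
from math import comb, prod


def countb(n, p):
    """Brute force: count p-tuples over range(n) whose equal entries are contiguous."""
    result = 0
    for t in product(range(n), repeat=p):
        runs = [k for k, _ in groupby(t)]
        if len(runs) == len(set(runs)):
            result += 1
    return result


def count(n, p):
    """Closed form: split p columns into r runs, then assign r distinct lists in order."""
    return sum(comb(p - 1, r - 1) * prod(range(n, n - r, -1))
               for r in range(1, min(n, p) + 1))


# Compare the two counts on every small (n, p) combination.
for n in range(1, 5):
    for p in range(1, 6):
        assert countb(n, p) == count(n, p), (n, p)
print(count(2, 3))  # 6
```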
Faster
Following is the best code I have for this so far. It builds in more "smarts" to the code I posted before. In one sense, it's very effective. For example, for randomized p = n = 20 inputs it usually finishes within a second. That's nothing to sneeze at, since:
>>> count(20, 20)
1399496554158060983080
>>> _.bit_length()
71
That is, trying every possible way would effectively take forever. The number of cases to try doesn't even fit in a 64-bit int.
On the other hand, boost n (the number of lists) to 30, and it can take an hour. At 50, I haven't seen a non-contrived case finish yet, even if left to run overnight. The combinatorial explosion eventually becomes overwhelming.
OTOH, I'm looking for the smallest sum, period. If you needed to solve problems like this in real life, you'd either need a much smarter approach, or settle for iterative approximation algorithms.
Note: this is still a work in progress, so isn't polished, and prints some stuff as it goes along. Mostly that's been reduced to running a "watchdog" thread that wakes up every 10 minutes to show the current state of the ixsofar vector.
def crunch(lists):
    import datetime
    now = datetime.datetime.now
    start = now()
    n = len(lists[0])
    assert all(len(L) == n for L in lists)
    # Start with a sum we know can be beat.
    smallest_sum = min(map(sum, lists)) + 1
    smallest_ixs = [None] * n
    ixsofar = [None] * n

    import threading
    def watcher(stop):
        if stop.wait(60):
            return
        lix = ixsofar[:]
        while not stop.wait(timeout=600):
            print("watch", now() - start, smallest_sum)
            nlix = ixsofar[:]
            for i, (a, b) in enumerate(zip(lix, nlix)):
                if a != b:
                    nlix.insert(i,"--- " + str(i) + " -->")
                    print(nlix)
                    del nlix[i]
                    break
            lix = nlix

    stop = threading.Event()
    w = threading.Thread(target=watcher, args=[stop])
    w.start()

    def inner(i, sumsofar, freelists):
        nonlocal smallest_sum
        assert sumsofar <= smallest_sum
        if i == n:
            print()
            if sumsofar < smallest_sum:
                smallest_sum = sumsofar
                smallest_ixs[:] = ixsofar
                print("new best")
            disp(lists, ixsofar)
            print("sum", sumsofar, now() - start)
            return
        # If only one input list is still free, we have to take all
        # of its tail. This code block isn't necessary, but gives a
        # minor speedup (skips layers of do-nothing calls),
        # especially when the length of the lists is greater than
        # the number of lists.
        if len(freelists) == 1:
            lix = freelists.pop()
            L = lists[lix]
            for j in range(i, n):
                ixsofar[j] = lix
                sumsofar += L[j]
                if sumsofar >= smallest_sum:
                    break
            else:
                inner(n, sumsofar, freelists)
            freelists.add(lix)
            return
        # Peek ahead. The smallest completion we could possibly get
        # would come from picking the smallest element in each
        # remaining column (restricted to the lists - rows - still
        # available). This probably isn't achievable, but is an
        # absolute lower bound on what's possible, so can be used to
        # cut off searches early.
        newsum = sumsofar
        for j in range(i, n): # pick smallest from column j
            newsum += min(lists[lix][j] for lix in freelists)
            if newsum >= smallest_sum:
                return
        # Simple greedy heuristic: try available lists in the order
        # of smallest-to-largest at index i.
        sortedlix = sorted(freelists, key=lambda lix: lists[lix][i])
        # What's the next int in the previous slice? As soon as we
        # hit an int at least that large, we can do at least as well
        # by just returning, to let the caller extend the previous
        # slice instead.
        if i:
            prev = lists[ixsofar[i-1]][i]
        else:
            prev = lists[sortedlix[-1]][i] + 1
        for lix in sortedlix:
            L = lists[lix]
            if prev <= L[i]:
                return
            freelists.remove(lix)
            newsum = sumsofar
            # Try all non-empty slices in L starting at i.
            for j in range(i, n):
                newsum += L[j]
                if newsum >= smallest_sum:
                    break
                ixsofar[j] = lix
                inner(j + 1, newsum, freelists)
            freelists.add(lix)

    inner(0, 0, set(range(len(lists))))
    stop.set()
    w.join()
Bounded by DP
I've had a lot of fun with this :-) Here's the approach they were probably looking for, using dynamic programming (DP). I have several programs that run faster in "smallish" cases, but none that can really compete on a non-contrived 20x50 case. The runtime is O(2**n * n**2 * p). Yes, that's more than exponential in n! But it's still a minuscule fraction of what brute force can require (see above), and is a hard upper bound.
Note: this is just a loop nest slinging machine-size integers, and using no "fancy" Python features. It would be easy to recode in C, where it would run much faster. As is, this code runs over 10x faster under PyPy (as opposed to the standard CPython interpreter).
Key insight: suppose we're going left to right, have reached column j, the last list we picked from was D, and before that we picked columns from lists A, B, and C. How can we proceed? Well, we can pick the next column from D too, and the "used" set {A, B, C} doesn't change. Or we can pick some other list E, the "used" set changes to {A, B, C, D}, and E becomes the last list we picked from.
Now in all these cases, the details of how we reached state "used set {A, B, C} with last list D at column j" make no difference to the collection of possible completions. It doesn't matter how many columns we picked from each, or the order in which A, B, C were used: all that matters to future choices is that A, B, and C can't be used again, and D can be but - if so - must be used immediately.
Since all ways of reaching this state have the same possible completions, the cheapest full solution must have the cheapest way of reaching this state.
So we just go left to right, one column at a time, and remember for each state in the column the smallest sum reaching that state.
This isn't cheap, but it's finite ;-) Since states are subsets of row indices, combined with (the index of) the last list used, there are 2**n * n possible states to keep track of. In fact, there are only half that, since the way sketched above never includes the index of the last-used list in the used set, but catering to that would probably cost more than it saves.
As is, states here are not represented explicitly. Instead there's just a large list of sums-so-far, of length 2**n * n. The state is implied by the list index: index i represents the state where:
i >> n is the index of the last-used list.
The last n bits of i are a bitset, where bit 2**j is set if and only if list index j is in the set of used list indices.
You could, e.g., represent these by dicts mapping (frozenset, index) pairs to sums instead, but then memory use explodes, runtime zooms, and PyPy becomes much less effective at speeding it.
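As an illustration of the index layout just described, here is a small sketch of packing and unpacking a state. `encode` and `decode` are hypothetical helper names that do not appear in the answer's code, which works on the raw integers directly:

```python
def encode(last, used, n):
    """Pack (index of last-used list, set of used list indices) into one int."""
    bits = 0
    for j in used:
        bits |= 1 << j          # bit 2**j marks list j as used
    return (last << n) | bits   # the last-used index sits above the n-bit set


def decode(key, n):
    """Inverse of encode: recover (last, used) from a state index."""
    last = key >> n
    used = {j for j in range(n) if key & (1 << j)}
    return last, used


n = 5
key = encode(3, {0, 1, 2}, n)
last, used = decode(key, n)
print(key, last, sorted(used))  # 103 3 [0, 1, 2]
```

The largest possible key is ((n - 1) << n) | (2**n - 1) = 2**n * n - 1, which is why a flat list of length 2**n * n is enough to hold every state.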
Sad but true: like most DP algorithms, this finds "the best" answer but retains scant memory of how it was reached. Adding code to allow for that is harder than what's here, and typically explodes memory requirements. Probably easiest here: write new to disk at the end of each outer-loop iteration, one file per column. Then memory use isn't affected. When it's done, those files can be read back in again, in reverse order, and mildly tedious code can reconstruct the path it must have taken to reach the winning state, working backwards one column at a time from the end.
def dumbdp(lists):
    import datetime
    _min = min
    now = datetime.datetime.now
    start = now()

    n = len(lists)
    p = len(lists[0])
    assert all(len(L) == p for L in lists)
    rangen = range(n)
    USEDMASK = (1 << n) - 1
    HUGE = sum(sum(L) for L in lists) + 1

    new = [HUGE] * (2**n * n)
    for i in rangen:
        new[i << n] = lists[i][0]
    for j in range(1, p):
        print("working on", j, now() - start)
        old = new
        new = [HUGE] * (2**n * n)
        for key, g in enumerate(old):
            if g == HUGE:
                continue
            i = key >> n
            new[key] = _min(new[key], g + lists[i][j])
            newused = (key & USEDMASK) | (1 << i)
            for i in rangen:
                mask = 1 << i
                if newused & mask == 0:
                    newkey = newused | (i << n)
                    new[newkey] = _min(new[newkey],
                                       g + lists[i][j])
    result = min(new)
    print("DONE", result, now() - start)
    return result

Define a vector with random steps

I want to create an array whose values increase by random steps. I've used this simple code:
t_inici=(0:10*rand:100);
The problem is that the random number stays the same for every step. Is there a simple way to draw a new random number for each step?

If you have a set number of points, say nPts, then you could do the following
nPts = 10; % Could use 'randi' here for random number of points
lims = [0, 10]; % Start and end points
x = rand(1, nPts); % Create random numbers
% Sort and scale x to fit your limits and be ordered
x = diff(lims) * ( sort(x) - min(x) ) / ( max(x) - min(x) ) + lims(1);
This approach always includes your end point, which a 0:dx:10 approach would not necessarily.
If you had some maximum number of points, say nPtsMax, then you could do the following
nPtsMax = 1000; % Max number of points
lims = [0,10]; % Start and end points
% Could do 10* or any other multiplier as in your example in front of 'rand'
x = lims(1) + [0 cumsum(rand(1, nPtsMax))];
x(x > lims(2)) = []; % remove values above maximum limit
This approach may be slower, but is still fairly quick and better represents the behaviour in your question.
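For readers who want to try the cumulative-sum idea outside MATLAB, here is a rough Python sketch of the same approach. It is not a translation of any toolbox function; `random_steps`, `max_step`, `max_points` and `seed` are names assumed for this example only:

```python
import random
from itertools import accumulate


def random_steps(lo, hi, max_step, max_points=1000, seed=None):
    """Points starting at lo, separated by uniform random steps in [0, max_step)."""
    rng = random.Random(seed)
    steps = (rng.random() * max_step for _ in range(max_points))
    points = [lo] + [lo + s for s in accumulate(steps)]
    # The points never decrease, so filtering is the same as cutting off the tail.
    return [x for x in points if x <= hi]


pts = random_steps(0, 100, 10, seed=1)
print(len(pts), pts[:3])
```

As in the MATLAB version, the number of returned points is not known in advance, and the last point is generally below the upper limit rather than exactly on it.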

My first approach would be to randomly generate N-2 samples, where N is the desired number of samples, sort them, and add the extrema:
N=50;
endpoint=100;
initpoint=0;
randsamples=sort(rand(1, N-2)*(endpoint-initpoint)+initpoint);
t_inici=[initpoint randsamples endpoint];
However, I am not sure how "uniformly random" this is, as you are "faking" 2 of the data points in order to include the extrema. This will somewhat distort pure randomness (I think). If you do not necessarily need to include the extrema, then just remove the last line and generate N points. That will make sure that they are indeed random (or as random as MATLAB can make them).

Here is an alternative solution with "uniformly random"
[initpoint,endpoint,coef]=deal(0,100,10);
t_inici(1)=initpoint;
while(t_inici(end)<endpoint)
    t_inici(end+1)=t_inici(end)+rand()*coef;
end
t_inici(end)=[];
In my view, this matches your attempt well: the steps are random, the vector starts from 0, and it does not necessarily end at 100.

From your code it seems you want a uniformly random step that varies between each two entries. This implies that the number of entries that the vector will have is unknown in advance.
A way to do that is as follows. This is similar to Hunter Jiang's answer but adds entries in batches instead of one by one, in order to reduce the number of loop iterations.
1. Guess a number of required entries, n. Any value will do, but a large value will result in fewer iterations and will probably be more efficient.
2. Initialize result to the first value.
3. Generate n entries and concatenate them to the (temporary) result.
4. See if the current entries are already too many.
5. If they are, cut as needed and output the (final) result. Else go back to step 3.
Code:
lower_value = 0;
upper_value = 100;
step_scale = 10;
n = 5*(upper_value-lower_value)/step_scale*2; % STEP 1. The number 5 here is arbitrary.
% It's probably more efficient to err with too many than with too few
result = lower_value; % STEP 2
done = false;
while ~done
    result = [result result(end)+cumsum(step_scale*rand(1,n))]; % STEP 3. Include
    % n new entries
    ind_final = find(result>upper_value,1); % STEP 4. Index of first entry exceeding
    % upper_value, if any
    if ind_final % STEP 5. If non-empty, we're done
        result = result(1:ind_final-1); % keep everything before that entry
        done = true;
    end
end

Finding optimal path (if exists)

Given the parameter k and an array [a0, a1, a2, a3, ..., an], where ax defines the height of the terrain at position x, find the least amount of work you need to make the terrain passable. The terrain is passable if the difference between any two neighbouring places is smaller than or equal to k. The height of the terrain at ax can be changed, and the amount of work needed is equal to the change in height you make. The heights of a0 and an can't be changed, and therefore some terrains may be completely impassable.
I've been struggling with this for a while: This is what I figured out so far: Of course finding a solution (not taking the least amount of work needed into account) is easy - make 'steps' from a0 to an as in the diagram with k = 2 (red dots are the heights of the old terrain, grey for the new).
(The input for this particular terrain would be: k = 2, [0, 2, 3, 2, 5, 4, 5, 7])
As you can see, the new terrain doesn't take the old terrain into account and thus the work needed to transform the old one into this one can be huge.
Determining whether a path exists is trivial: k >= |a0-an|/n. But I have no idea how to go about finding the optimal solution.
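The existence condition can be written as a one-line check. This is only a sketch; `is_feasible` is a hypothetical name, and it multiplies instead of dividing so that integer inputs stay exact:

```python
def is_feasible(heights, k):
    """True if some passable terrain exists with both end heights kept fixed."""
    steps = len(heights) - 1  # n steps between a0 and an
    return abs(heights[0] - heights[-1]) <= k * steps


print(is_feasible([0, 2, 3, 2, 5, 4, 5, 7], 2))  # True
print(is_feasible([0, 10], 2))                   # False
```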

Use the simplex method to solve the following linear program.
minimize sum_i y_i
subject to
    # y_i is notionally |x_i - a_i|
    # equality holds in every optimal solution
    y_i >= x_i - a_i for all i
    y_i >= a_i - x_i for all i
    # transformed terrain is passable, i.e., |x_i - x_{i+1}| <= k
    x_i - x_{i+1} <= k for all i
    x_{i+1} - x_i <= k for all i
    # the end heights cannot be changed
    x_0 = a_0
    x_n = a_n
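For small integer instances, the result of the linear program can be cross-checked without an LP solver. The sketch below is a different technique (a dynamic program over candidate heights, not the simplex method), and it rests on two assumptions that are stated here rather than proven: the heights and k are integers, and an optimal solution exists whose new heights are integers between min(a) and max(a). `min_work` is a hypothetical name.

```python
def min_work(heights, k):
    """Least total |new - old| so that neighbours differ by at most k, or None.

    Assumes integer heights and integer k. The first and last heights are
    kept fixed, as the problem statement requires.
    """
    lo, hi = min(heights), max(heights)
    levels = range(lo, hi + 1)
    INF = float("inf")
    # cost[h - lo] = cheapest way to end the processed prefix at height h
    cost = [0 if h == heights[0] else INF for h in levels]
    for a in heights[1:]:
        new_cost = []
        for h in levels:
            # previous heights within k of h, clipped to the candidate range
            reachable = cost[max(h - k, lo) - lo:min(h + k, hi) - lo + 1]
            new_cost.append(min(reachable) + abs(h - a))
        cost = new_cost
    best = cost[heights[-1] - lo]  # the last height must stay unchanged
    return None if best == INF else best


print(min_work([0, 2, 3, 2, 5, 4, 5, 7], 2))  # 1
print(min_work([0, 10], 2))                   # None
```

In the question's example only the step from 2 to 5 is too steep, and raising the 2 to 3 (one unit of work) repairs it.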

One brute-force solution goes like this.
For each element, consider three next states: h + 1, h + 0, and h - 1, where h is the height of the element. Then keep traversing the array, applying the following rules:
1) If the state has already been reached earlier, discard it (so use a bitmask DP for that).
2) If you arrive at a state where the difference between the traversed adjacent elements exceeds k, discard it.
3) If the total work done so far is more than the minimum found so far, discard it.

Let ht = an - a0 be the total height difference and hAverageStep = ht/n the average step; k is the maximum step.
Ideally, at step 'i' the height difference from a0 should be i*hAverageStep. In this case the path is just a straight line. We want to get as close to this straight line as possible.
For step 'i' you can check whether some "work" is needed at this point to move it onto the straight line. If that work is greater than k, you must also move other points, for example some earlier points that you have not moved enough.
If 'an' is still not reached after moving every point as much as needed, there's no solution.

Algorithm to split an array into P subarrays of balanced sum

I have a big array of length N, let's say something like:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
I need to split this array into P subarrays (in this example, P=4 would be reasonable), such that the sum of the elements in each subarray is as close as possible to sigma, being:
sigma=(sum of all elements in original array)/P
In this example, sigma=15.
For the sake of clarity, one possible result would be:
2 4 6 | 7 6 3 3 | 3 4 3 4 | 4 4 3 3 1
(sums: 12,19,14,15)
I have written a very naive algorithm based on how I would do the divisions by hand, but I don't know how to impose the condition that a division whose sums are (14,14,14,14,19) is worse than one whose sums are (15,14,16,14,16).
Thank you in advance.

First, let’s formalize your optimization problem by specifying the input, output, and the measure for each possible solution (I hope this is in your interest):
Given an array A of positive integers and a positive integer P, separate the array A into P non-overlapping subarrays such that the difference between the sum of each subarray and the perfect sum of the subarrays (sum(A)/P) is minimal.
Input: Array A of positive integers; P is a positive integer.
Output: Array SA of P non-negative integers representing the length of each subarray of A where the sum of these subarray lengths is equal to the length of A.
Measure: abs(sum(sa)-sum(A)/P) is minimal for each sa ∈ {sa | sa = (Ai, …, Ai+SAj) for i = (Σ SAj), j from 0 to P-1}.
The input and output define the set of valid solutions. The measure defines a measure to compare multiple valid solutions. And since we’re looking for a solution with the least difference to the perfect solution (minimization problem), measure should also be minimal.
With this information, it is quite easy to implement the measure function (here in Python):
def measure(a, sa):
    sigma = sum(a)/len(sa)
    diff = 0
    i = 0
    for j in range(0, len(sa)):
        diff += abs(sum(a[i:i+sa[j]])-sigma)
        i += sa[j]
    return diff

print(measure([2,4,6,7,6,3,3,3,4,3,4,4,4,3,3,1], [3,4,4,5])) # prints 8.0 (8 under Python 2)
Now finding an optimal solution is a little harder.
We can use the Backtracking algorithm for finding valid solutions and use the measure function to rate them. We basically try all possible combinations of P non-negative integer numbers that sum up to length(A) to represent all possible valid solutions. Although this ensures not to miss a valid solution, it is basically a brute-force approach with the benefit that we can omit some branches that cannot be any better than our yet best solution. E.g. in the example above, we wouldn’t need to test solutions with [9,…] (measure > 38) if we already have a solution with measure ≤ 38.
Following the pseudocode pattern from Wikipedia, our bt function looks as follows:
def bt(c):
    global P, optimum, optimum_diff
    if reject(P,c):
        return
    if accept(P,c):
        print("%r with %d" % (c, measure(P,c)))
        if measure(P,c) < optimum_diff:
            optimum = c
            optimum_diff = measure(P,c)
        return
    s = first(P,c)
    while s is not None:
        bt(list(s))
        s = next(P,s)
The global variables P, optimum, and optimum_diff represent the problem instance holding the values for A, P, and sigma, as well as the optimal solution and its measure:
class MinimalSumOfSubArraySumsProblem:
    def __init__(self, a, p):
        self.a = a
        self.p = p
        self.sigma = sum(a)/p
Next we specify the reject and accept functions that are quite straight forward:
def reject(P,c):
    return optimum_diff < measure(P,c)

def accept(P,c):
    return None not in c
This simply rejects any candidate whose measure is already more than our yet optimal solution. And we’re accepting any valid solution.
The measure function is also slightly changed due to the fact that c can now contain None values:
def measure(P, c):
    diff = 0
    i = 0
    for j in range(0, P.p):
        if c[j] is None:
            break
        diff += abs(sum(P.a[i:i+c[j]])-P.sigma)
        i += c[j]
    return diff
The remaining two function first and next are a little more complicated:
def first(P,c):
    t = 0
    is_complete = True
    for i in range(0, len(c)):
        if c[i] is None:
            if i+1 < len(c):
                c[i] = 0
            else:
                c[i] = len(P.a) - t
            is_complete = False
            break
        else:
            t += c[i]
    if is_complete:
        return None
    return c

def next(P,s):
    t = 0
    for i in range(0, len(s)):
        t += s[i]
        if i+1 >= len(s) or s[i+1] is None:
            if t+1 > len(P.a):
                return None
            else:
                s[i] += 1
            return s
Basically, first either replaces the next None value in the list with either 0 if it’s not the last value in the list or with the remainder to represent a valid solution (little optimization here) if it’s the last value in the list, or it return None if there is no None value in the list. next simply increments the rightmost integer by one or returns None if an increment would breach the total limit.
Now all you need is to create a problem instance, initialize the global variables and call bt with the root:
P = MinimalSumOfSubArraySumsProblem([2,4,6,7,6,3,3,3,4,3,4,4,4,3,3,1], 4)
optimum = None
optimum_diff = float("inf")
bt([None]*P.p)

If I am not mistaken here, one more approach is dynamic programming.
You can define P[ pos, n ] as the smallest possible "penalty" accumulated up to position pos if n subarrays were created. Obviously there is some position pos' such that
P[pos', n-1] + penalty(pos', pos) = P[pos, n]
You can just minimize over pos' = 1..pos.
The naive implementation will run in O(N^2 * M), where N - size of the original array and M - number of divisions.
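A minimal sketch of this recurrence follows. The answer leaves the penalty open, so the code assumes penalty(pos', pos) = |sum of A[pos'..pos) - sigma|, the same absolute-deviation measure used in the backtracking answer above. It also assumes every part is non-empty and that the number of parts does not exceed the array length; `balanced_split` is a hypothetical name.

```python
def balanced_split(a, parts):
    """Return (min total |part sum - sigma|, list of part lengths)."""
    n = len(a)
    prefix = [0]
    for v in a:
        prefix.append(prefix[-1] + v)
    sigma = prefix[-1] / parts
    INF = float("inf")
    # best[m][pos]: minimal penalty for splitting a[:pos] into m non-empty parts
    best = [[INF] * (n + 1) for _ in range(parts + 1)]
    cut = [[0] * (n + 1) for _ in range(parts + 1)]
    best[0][0] = 0.0
    for m in range(1, parts + 1):
        for pos in range(m, n + 1):
            for prev in range(m - 1, pos):
                cand = best[m - 1][prev] + abs(prefix[pos] - prefix[prev] - sigma)
                if cand < best[m][pos]:
                    best[m][pos] = cand
                    cut[m][pos] = prev  # remember where the last part starts
    # Walk the stored cut positions backwards to recover the part lengths.
    lengths, pos = [], n
    for m in range(parts, 0, -1):
        lengths.append(pos - cut[m][pos])
        pos = cut[m][pos]
    lengths.reverse()
    return best[parts][n], lengths


a = [2, 4, 6, 7, 6, 3, 3, 3, 4, 3, 4, 4, 4, 3, 3, 1]
print(balanced_split(a, 4))  # (6.0, [3, 3, 5, 5])
```

For the example array this gives the parts 2 4 6 | 7 6 3 | 3 3 4 3 4 | 4 4 3 3 1 with sums 12, 16, 17, 15. The three nested loops match the O(N^2 * M) bound stated above.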

@Gumbo's answer is clear and actionable, but it takes a lot of time when length(A) is bigger than 400 and P is bigger than 8. This is because that algorithm is, as he said, a kind of brute force with benefits.
In fact, a very fast solution is to use dynamic programming.
Given an array A of positive integers and a positive integer P, separate the array A into P non-overlapping subarrays such that the difference between the sum of each subarray and the perfect sum of the subarrays (sum(A)/P) is minimal.
Measure: the sum over k = 1..P of (S_k - S_avg)^2, where S_k is the sum of the elements of subarray k and S_avg is the average of the P subarrays' sums.
This ensures that the sums are balanced, because it uses the definition of the standard deviation.
Presuming that array A has N elements: Q(i,j) is the minimum Measure value when splitting the last i elements of A into j subarrays, and D(i,j) is (sum(B)-sum(A)/P)^2, where array B consists of the i-th through j-th elements of A ( 0<=i<=j<N ).
The minimum measure asked for in the question is Q(N,P), and we find that:
Q(N,P)=MIN{Q(N-1,P-1)+D(0,0); Q(N-2,P-1)+D(0,1); ...; Q(P-1,P-1)+D(0,N-P)}
So it can be solved by dynamic programming.
Q(i,1) = D(N-i,N-1)
Q(i,j) = MIN{ Q(i-1,j-1)+D(N-i,N-i);
Q(i-2,j-1)+D(N-i,N-i+1);
...;
Q(j-1,j-1)+D(N-i,N-j)}
So the steps of the algorithm are:
1. Calculate j=1:
Q(1,1), Q(2,1), ..., Q(N,1)
2. Calculate j=2:
Q(2,2) = MIN{Q(1,1)+D(N-2,N-2)};
Q(3,2) = MIN{Q(2,1)+D(N-3,N-3); Q(1,1)+D(N-3,N-2)}
Q(4,2) = MIN{Q(3,1)+D(N-4,N-4); Q(2,1)+D(N-4,N-3); Q(1,1)+D(N-4,N-2)}
... Calculate j=...
P. Calculate j=P:
Q(P,P), Q(P+1,P)...Q(N,P)
The final minimum Measure value is stored as Q(N,P)!
To trace each subarray's length, you can store the MIN choice made when calculating Q(i,j)=MIN{Q+D...}.
This needs O(N^2) space for D(i,j) and O(N^2 * P) time to calculate Q(N,P), which is far less than the time the pure brute-force algorithm consumes.

Working code below (I used PHP). This code decides the number of parts itself:
$main = array(2,4,6,1,6,3,2,3,4,3,4,1,4,7,3,1,2,1,3,4,1,7,2,4,1,2,3,1,1,1,1,4,5,7,8,9,8,0);
$pa=0;
for($i=0;$i < count($main); $i++){
    $p[]= $main[$i];
    if(abs(15 - array_sum($p)) < abs(15 - (array_sum($p)+$main[$i+1])))
    {
        $pa=$pa+1;
        $pi[] = $i+1;
        $pc = count($pi);
        $ba = $pi[$pc-2] ;
        $part[$pa] = array_slice( $main, $ba, count($p));
        unset($p);
    }
}
print_r($part);
for($s=1;$s<count($part);$s++){
    echo '<br>';
    echo array_sum($part[$s]);
}
The code will output part sums like the ones below:
13
14
16
14
15
15
17

I'm wondering whether the following would work:
Go from the left, as soon as sum > sigma, branch into two, one including the value that pushes it over, and one that doesn't. Recursively process data to the right with rightSum = totalSum-leftSum and rightP = P-1.
So, at the start, sum = 60
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
Then for 2 4 6 7, sum = 19 > sigma, so split into:
2 4 6 | 7 6 3 3 3 4 3 4 4 4 3 3 1
2 4 6 7 | 6 3 3 3 4 3 4 4 4 3 3 1
Then we process 7 6 3 3 3 4 3 4 4 4 3 3 1 and 6 3 3 3 4 3 4 4 4 3 3 1 with P = 4-1 and sum = 60-12 and sum = 60-19 respectively.
This results in, I think, O(P*n).
It might be a problem when 1 or 2 values are by far the largest, but any value >= sigma can probably just be put in its own partition. Preprocessing the array to find these values (and reducing sum appropriately) might be the best idea.
If it works, it should hopefully minimise sum-of-squared-error (or close to that), which seems like the desired measure.
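A minimal Python sketch of this branching idea, to make it concrete. The function name `branch_split`, the clamping of the split point so that every later part keeps at least one element, and the choice to score branches by squared error against the global sigma are my assumptions, not part of the answer. Written this way the recursion explores up to 2^(P-1) branches.

```python
def branch_split(values, p):
    """Return p contiguous parts found by two-way branching (hypothetical helper)."""
    if not 1 <= p <= len(values):
        raise ValueError("need 1 <= p <= len(values)")
    sigma = sum(values) / p                  # global ideal part sum, used for scoring

    def cost(parts):
        return sum((sum(part) - sigma) ** 2 for part in parts)

    def solve(rest, parts_left):
        if parts_left == 1:
            return [rest]
        local_sigma = sum(rest) / parts_left  # rightSum / rightP
        # Smallest prefix length whose sum exceeds the local target.
        running, k = 0, 0
        while k < len(rest) and running <= local_sigma:
            running += rest[k]
            k += 1
        max_len = len(rest) - (parts_left - 1)  # leave one element per later part
        # Branch: exclude (k - 1) or include (k) the value that pushes the sum over.
        lengths = sorted({min(max(c, 1), max_len) for c in (k - 1, k)})
        best = None
        for length in lengths:
            candidate = [rest[:length]] + solve(rest[length:], parts_left - 1)
            if best is None or cost(candidate) < cost(best):
                best = candidate
        return best

    return solve(list(values), p)
```

With the 16-element array above and P = 4, the first branch point is exactly the one described: the first part is either 2 4 6 or 2 4 6 7.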

I propose an algorithm based on backtracking. The main function randomly selects an element from the original array and adds it to one of the partition arrays. After each addition it checks whether the result is better than the previous solution. This is done with a function that calculates the deviation, evaluated each time a new element is added to a partition. I also thought it would be good to add a limit on the number of loop iterations, so that the program is forced to end if it cannot reach the desired solution. By desired solution I mean that all elements have been added while respecting the condition in the if statement.
sum=CalculateSum(vector)
Read P
sigma=sum/P
initialize P vectors, with names vector_partition[i], i=1..P
initialize list_vector, a list that points to these P vectors
//list_vector can easily be visualized as a vector of vectors
initialize a difference_vector with dimension P
//construct a non-recursive backtracking algorithm
function Deviation(vector) //function for calculate deviation of elements from a vector
{
dev=0
for i=0 to Size(vector)-2 do //stop one short, because vector[i+1] is read below
dev+=|vector[i+1]-vector[i]|
return dev
}
iteration=0
//fix some maximum number of iteration for while loop
Read max_iteration
//the higher the number of iterations, the more accurate
//the solution will be
while(!IsEmpty(vector))
{
for i=1 to Size(list_vector) do
{
if(IsEmpty(vector)) break from while loop
initial_deviation=Deviation(list_vector[i])
el=SelectElement(vector) //you can implement that function using a randomized
//choice of element
difference_vector[i]=|sigma-CalculateSum(list_vector[i])|
PutOnBackVector(list_vector[i], el)
if(initial_deviation>Deviation(difference_vector))
ExtractFromBackVectorAndPutOnSecondVector(list_vector, vector)
}
iteration++
//prevent to enter in some infinite loop
if (iteration>max_iteration) break from while loop
}
You can change this by adding some code to the first if that increases the calculated deviation by a certain amount.
additional_amount=0
iteration=0
while
{
...
if(initial_deviation>Deviation(difference_vector)+additional_amount)
ExtractFromBackVectorAndPutOnSecondVector(list_vector, vector)
if(iteration>max_iteration)
{
iteration=0
additional_amount+=1/some_constant
}
iteration++
//delete second if from first version
}

Your problem is very similar to, or the same as, the minimum makespan scheduling problem, depending on how you define your objective. In the case that you want to minimize the maximum |sum_i - sigma|, it is exactly that problem.
As referenced in the Wikipedia article, this problem is NP-complete for p > 2. Graham's list scheduling algorithm is optimal among online algorithms for p <= 3, and provides an approximation ratio of 2 - 1/p. You can check out the Wikipedia article for other algorithms and their approximation ratios.
All the algorithms given on this page are either solving for a different objective, incorrect/suboptimal, or can be used to solve any problem in NP :)

This is very similar to the case of the one-dimensional bin packing problem, see http://www.cs.sunysb.edu/~algorith/files/bin-packing.shtml. In the associated book, The Algorithm Design Manual, Skiena suggests a first-fit decreasing approach, i.e. figure out your bin size (mean = sum / N, where N is the number of bins), and then allocate the largest remaining object into the first bin that has room for it. You either get to a point where you have to start over-filling a bin, or if you're lucky you get a perfect fit. As Skiena states, "First-fit decreasing has an intuitive appeal to it, for we pack the bulky objects first and hope that little objects can fill up the cracks."
As a previous poster said, the problem looks like it's NP-complete, so you're not going to solve it perfectly in reasonable time, and you need to look for heuristics.
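A rough Python sketch of first-fit decreasing with a fixed number of bins. The function name `first_fit_decreasing` is hypothetical, and the fallback of over-filling the least-loaded bin when no bin has room is my assumption about how to handle that case. Note that, like bin packing in general, this reorders the items, so the groups are not contiguous subarrays.

```python
def first_fit_decreasing(items, bins):
    """Distribute items over a fixed number of bins (hypothetical helper)."""
    capacity = sum(items) / bins             # bin size = mean sum per bin
    groups = [[] for _ in range(bins)]
    loads = [0] * bins
    for item in sorted(items, reverse=True):  # largest remaining object first
        # First bin that still has room for the item.
        idx = next((b for b in range(bins) if loads[b] + item <= capacity), None)
        if idx is None:
            # Assumption: when nothing fits, over-fill the least-loaded bin.
            idx = min(range(bins), key=loads.__getitem__)
        groups[idx].append(item)
        loads[idx] += item
    return groups
```

For example, `first_fit_decreasing([5, 4, 3, 3, 2, 1], 3)` gives [[5, 1], [4, 2], [3, 3]], a perfect fit with every bin summing to 6.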

I recently needed this and did as follows:
1. Create an initial array of sub-arrays whose length is the given sub-array count. Each sub-array should have a sum property too, i.e. [[sum:0],[sum:0]...[sum:0]].
2. Sort the main array in descending order.
3. Search for the sub-array with the smallest sum, insert one item from the main array into it, and increment that sub-array's sum property by the inserted item's value.
4. Repeat step 3 until the end of the main array is reached.
5. Return the initial array.
This is the code in JS.
function groupTasks(tasks, groupCount) {
  // One empty group per requested sub-array, each carrying its running sum.
  var initial = [...Array(groupCount)].map(() => { var sa = []; sa.sum = 0; return sa; });
  // Sort a copy in descending order so the caller's array is not mutated.
  return [...tasks].sort((a, b) => b - a)
    .reduce((groups, task) => {
      // Pick the group with the smallest sum so far and add the task to it.
      var group = groups.reduce((p, c) => p.sum < c.sum ? p : c);
      group.push(task);
      group.sum += task;
      return groups;
    }, initial);
}
var tasks = [...Array(50)].map(_ => ~~(Math.random() * 10) + 1), // create an array of 50 random integers from 1 to 10
    result = groupTasks(tasks, 7); // distribute them into 7 sub-arrays with closest sums
console.log("input array:", JSON.stringify(tasks));
console.log(result.map(r => [JSON.stringify(r), "sum: " + r.sum]));

You can use Max Flow algorithm.

Why is the average number of steps for finding an item in an array N/2?

Could somebody explain why the average number of steps for finding an item in an unsorted array data-structure is N/2?

This really depends on what you know about the numbers in the array. For example, if they're all drawn from a distribution where all the probability mass is on a single value, then in expectation it will take you exactly 1 step to find the value you're looking for, since every value is the same.
Let's now make a pretty strong assumption, that the array is filled with a random permutation of distinct values. You can think of this as picking some arbitrary sorted list of distinct elements and then randomly permuting it. In this case, suppose you're searching for some element in the array that actually exists (this proof breaks down if the element is not present). Then the number of steps you need to take is given by X, where X is the position of the element in the array. The average number of steps is then E[X], which is given by
E[X] = 1 Pr[X = 1] + 2 Pr[X = 2] + ... + n Pr[X = n]
Since we're assuming all the elements are drawn from a random permutation,
Pr[X = 1] = Pr[X = 2] = ... = Pr[X = n] = 1/n
So this expression is given by
E[X] = sum (i = 1 to n) i / n = (1 / n) sum (i = 1 to n) i = (1 / n) (n)(n + 1) / 2
= (n + 1) / 2
Which, I think, is the answer you're looking for.
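As a sanity check on the derivation, the expectation can be computed exactly for small n by enumerating every permutation. This is only an illustrative sketch: the function name `mean_steps_over_permutations` and the choice of searching for the value 0 are mine.

```python
from fractions import Fraction
from itertools import permutations

def mean_steps_over_permutations(n, target=0):
    """Exact mean number of linear-search steps over all n! orderings."""
    total = count = 0
    for perm in permutations(range(n)):
        total += perm.index(target) + 1      # steps = 1-based position of target
        count += 1
    return Fraction(total, count)

for n in range(1, 7):
    # Matches the closed form (n + 1) / 2 derived above.
    assert mean_steps_over_permutations(n) == Fraction(n + 1, 2)
```

The enumeration grows as n!, so it is only meant for very small arrays. It confirms the formula but does not replace the proof.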

The question as stated is just wrong. Linear search may perform better.

Perhaps a simpler example that shows why the average is N/2 is this:
Assume you have an unsorted array of 10 items: [5, 0, 9, 8, 1, 2, 7, 3, 4, 6]. This is all the digits [0..9].
Since the array is unsorted (i.e. you know nothing about the order of the items), the only way you can find a particular item in the array is by doing a linear search: start at the first item and go until you find what you're looking for, or you reach the end.
So let's count how many operations it takes to find each item. Finding the first item (5) takes only one operation. Finding the second item (0) takes two. Finding the last item (6) takes 10 operations. The total number of operations required to find all 10 items is 1+2+3+4+5+6+7+8+9+10, or 55. The average is 55/10, or 5.5.
The "linear search takes, on average, N/2 steps" conventional wisdom makes a number of assumptions. The two biggest are:
1. The item you're looking for is in the array. If an item isn't in the array, then it takes N steps to determine that. So if you're often looking for items that aren't there, then your average number of steps per search is going to be much higher than N/2.
2. On average, each item is searched for approximately as often as any other item. That is, you search for "6" as often as you search for "0", etc. If some items are looked up significantly more often than others, then the average number of steps per search is going to be skewed in favor of the items that are searched for more frequently. The number will be higher or lower than N/2, depending on the positions of the most frequently looked-up items.
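The counting argument and both assumptions can be checked with a few lines of Python. This is a small sketch: the helper names `linear_search_steps` and `average_steps`, and the skewed query list, are made up for illustration.

```python
def linear_search_steps(items, target):
    """Number of comparisons a linear search makes, whether or not it finds target."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return len(items)                        # a miss inspects every item

def average_steps(items, queries):
    return sum(linear_search_steps(items, q) for q in queries) / len(queries)

data = [5, 0, 9, 8, 1, 2, 7, 3, 4, 6]
print(average_steps(data, range(10)))        # 5.5: every item searched equally often
print(linear_search_steps(data, 42))         # 10: the item is not in the array
print(average_steps(data, [5] * 9 + [6]))    # 1.9: searches skewed towards the first item
```

The last line shows how far the average can move away from N/2 when one item, here the one stored first, accounts for most of the lookups.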

While I think templatetypedef has the most instructive answer, in this case there is a much simpler one.
Consider permutations of the set {x1, x2, ..., xn} where n = 2m. Now take some element xi you wish to locate. For each permutation where xi occurs at index m - k, there is a corresponding mirror image permutation where xi occurs at index m + k. The mean of these possible indices is just [(m - k) + (m + k)]/2 = m = n/2. Therefore the mean over all possible permutations of the set is n/2.

Consider a simple reformulation of the question:
What would be the limit of
lim (i->inf) of (sum(from 1 to i of random(n)) /i)
Or in C:
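double sum = 0; /* double, so the final division is not truncated */
int i;
for (i = 0; i < LARGE_NUM; i++) sum += random(n); /* random(n): assumed to return a uniform integer in 1..n */
sum /= LARGE_NUM;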
If we assume that our random function has a uniform distribution (each value from 1 to n is equally likely to be produced), then the expected result would be (1+n)/2.