Find all possible row-wise sums in a 2D array - arrays

Ideally I'm looking for a c# solution, but any help on the algorithm will do.
I have a 2-dimension array (x,y). The max columns (max x) varies between 2 and 10 but can be determined before the array is actually populated. Max rows (y) is fixed at 5, but each column can have a varying number of values, something like:
1 2 3 4 5 6 7...10
A 1 1 7 9 1 1
B 2 2 5 2 2
C 3 3
D 4
E 5
I need to come up with the total of all possible row-wise sums for the purpose of looking for a specific total. That is, a row-wise total could be the cells A1 + B2 + A3 + B5 + D6 + A7 (any combination of one value from each column).
This process will be repeated several hundred times with different cell values each time, so I'm looking for a somewhat elegant solution (better than what I've been able to come with). Thanks for your help.

The Problem Size
Let's first consider the worst case:
You have 10 columns and 5 (full) rows per column. It should be clear that you will be able to get (with the appropriate number population for each place) up to 5^10 ≈ 10^7 different results (solution space).
For example, the following matrix will give you the worst case for 3 columns:
| 1 10 100 |
| 2 20 200 |
| 3 30 300 |
| 4 40 400 |
| 5 50 500 |
resulting in 5^3=125 different results. Each result is in the form {a1 a2 a3} with ai ∈ {1,...,5}
It's quite easy to show that such a matrix will always exist for any number n of columns.
Now, to get each numerical result, you will need to do n-1 sums, adding up to a problem size of O(n 5^n). So, that's the worst case and I think nothing can be done about it, because to know the possible results you NEED to effectively perform the sums.
More benign incarnations:
The problem complexity may be cut off in two ways:
Less numbers (i.e. not all columns are full)
Repeated results (i.e. several partial sums give the same result, and you can join them in one thread). Much more on this later.
Let's see a simplified example of the latter with three rows:
| 7 6 100 |
| 3 4 200 |
| 1 2 200 |
At first sight you will need to do 2·3^3 sums. But that's not really the case. As you add up the first two columns you don't get the expected 9 different results, but only 6 ({13,11,9,7,5,3}).
So you don't have to carry your nine results up to the third column, but only 6.
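As a quick check of the small example above, here is a minimal Python sketch (Python is my choice here, not the language used in this answer) that counts the partial sums before and after dropping repeats. The variable names are illustrative only.

```python
from itertools import product

col1, col2, col3 = [7, 3, 1], [6, 4, 2], [100, 200, 200]

# All 9 pairwise sums of the first two columns, then only the distinct ones.
pair_sums = [a + b for a, b in product(col1, col2)]
distinct = sorted(set(pair_sums), reverse=True)
print(len(pair_sums), distinct)

# Only the distinct partial sums are carried into the third column.
final = sorted({s + c for s in distinct for c in col3})
print(len(final))
```

Because the third column also contains a repeated value, the set comprehension removes those repeats in the same step.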
Of course, that comes at the expense of deleting the repeated numbers from the list. The "Removal of Repeated Integer Elements" was posted before in SO and I'll not repeat the discussion here, but just cite that doing a mergesort O(m log m) in the list size (m) will remove the duplicates. If you want something easier, a double loop O(m^2) will do.
Anyway, I'll not try to calculate the size of the (mean) problem in this way for several reasons. One of them is that the "m" in the sort merge is not the size of the problem, but the size of the vector of results after adding up any two columns, and that operation is repeated (n-1) times ... and I really don't want to do the math :(.
The other reason is that as I implemented the algorithm, we will be able to use some experimental results and save us from my surely leaking theoretical considerations.
The Algorithm
With what we said before, it is clear that we should optimize for the benign cases, as the worst case is a lost one.
For doing so, we need to use lists (or variable dim vectors, or whatever can emulate those) for the columns and do a merge after every column add.
The merge may be replaced by several other algorithms (such as an insertion on a BTree) without modifying the results.
So the algorithm (procedural pseudocode) is something like:
Set result_vector to Column 1
For column i in (1 to n-1)
    Remove repeated integers in the result_vector
    Add every element of result_vector to every element of column i+1
        giving a new result vector
Next column
Remove repeated integers in the result_vector
Or as you asked for it, a recursive version may work as follows:
function genResVector(a:list, b:list): returns list
local c:list
{
    Set c = CartesianProduct (a x b)
    Set c = Sum up each element {a[i],b[j]} of c
    Drop repeated elements of c
    Return(c)
}
function RecursiveAdd(a:matrix, i integer): returns list
{
    Return(genResVector[Column i from a, RecursiveAdd[a, i-1]]);
}
function RecursiveAdd(a:matrix, i==0 integer): returns list={0}
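The pseudocode above can be sketched in Python as a loop over the columns, with a set doing the duplicate removal. This is only an illustration under assumptions: `all_column_sums` is a hypothetical name, and the way the ragged rows of the question's example are assigned to columns (in particular, that the sixth column holds a single value) is my guess.

```python
def all_column_sums(columns):
    """Return the sorted distinct sums formed by picking one value per column."""
    results = {0}  # the empty selection sums to 0
    for column in columns:
        # Fold in one column; the set comprehension drops repeated sums.
        results = {partial + value for partial in results for value in column}
    return sorted(results)


# Example data from the question (column assignment is an assumption).
columns = [[1, 2, 3, 4, 5], [1, 2, 3], [7, 5], [9, 2], [1, 2], [1]]
sums = all_column_sums(columns)
print(sums)

target = 20
print(target in sums)  # is the specific total reachable?
```

A set is used instead of a sort-and-merge step, so the cost of removing repeats is roughly proportional to the number of sums generated for each column. Note that an empty column would make the result empty, since no selection can include a value from it.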
Algorithm Implementation (Recursive)
I choose a functional language, I guess it's no big deal to translate to any procedural one.
Our program has two functions:
genResVector, which sums two lists giving all possible results with repeated elements removed, and
recursiveAdd, which recurses on the matrix columns adding up all of them.
The code is:
genResVector[x_List, y_List] :=   (* Header: a function that takes two lists as input *)
  Union[                          (* sort and remove duplicates from the resulting list *)
    Apply[Plus,                   (* Plus is the function to be applied to each pair *)
      Tuples[{x, y}],             (* generate all pairs, one element from each list *)
      {1}]];                      (* apply at level 1, i.e. to every pair *)
recursiveAdd[t_, 0] := {0};       (* base case: no columns give the single sum 0 *)
recursiveAdd[t_, i_] := genResVector[t[[i]], recursiveAdd[t, i - 1]];
                                  (* recursive case: fold column i into the sums of columns 1..i-1 *)
Test
If we take your example list
| 1 1 7 9 1 1 |
| 2 2 5 2 2 |
| 3 3 |
| 4 |
| 5 |
And run the program the result is:
{11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27}
The maximum and minimum are very easy to verify since they correspond to taking the Min or Max from each column.
Some interesting results
Let's consider what happens when the numbers in each position of the matrix are bounded. For that we will take a full (10 x 5) matrix and populate it with random integers.
In the extreme case where the integers are only zeros or ones, we may expect two things:
A very small result set
Fast execution, since there will be a lot of duplicate intermediate results
If we increase the Range of our Random Integers we may expect increasing result sets and execution times.
Experiment 1: 5x10 matrix populated with varying range random integers
It's clear enough that for a result set near the maximum result set size (5^10 ≈ 10^7) the calculation time and the "number of distinct results" have an asymptote. The fact that we see increasing functions just denotes that we are still far from that point.
Moral: The smaller your elements are, the better your chances of getting it fast. This is because you are likely to have a lot of repetitions!
Note that our MAX calculation time is near 20 secs for the worst case tested
Experiment 2: Optimizations that aren't
Having a lot of memory available, we can calculate by brute force, not removing the repeated results.
The result is interesting ... 10.6 secs! ... Wait! What happened? Our little "remove repeated integers" trick is eating up a lot of time, and when there are not a lot of results to remove there is no gain, only losses from trying to get rid of the repetitions.
But we may get a lot of benefit from the optimization when the max numbers in the matrix are well under 5×10^5. Remember that I'm doing these tests with the 5x10 matrix fully loaded.
The moral of this experiment is: The repeated integer removal algorithm is critical.
HTH!
PS: I have a few more experiments to post, if I get the time to edit them.

Related

Efficient removal of duplicates in array

How can duplicates be removed and recorded from an array with the following constraints:
The running time must be at most O(n log n)
The additional memory used must be at most O(n)
The result must fulfil the following:
Duplicates must be moved to the end of the original array
The order of the first occurrence of each unique element must be preserved
For example, from this input:
int A[] = {2,3,7,3,2,11,2,3,1,15};
The result should be similar to this (only the order of duplicates may differ):
2 3 7 11 1 15 3 3 2 2
As I understand it, the goal is to split an array into two parts: unique elements and duplicates in such a way that the order of the first occurrence of the unique elements is preserved.
Using the array of the OP as an example:
A={2,3,7,3,2,11,2,3,1,15}
A solution could do the following:
1. Initialize the helper array with indices 0, ..., n-1:
   B={0,1,2,3,4,5,6,7,8,9}
2. Sort the pairs (A[i],B[i]) using A[i] as key and with a stable sorting algorithm of complexity O(n log n):
   A={1,2,2,2,3,3,3,7,11,15}
   B={8,0,4,6,1,3,7,2, 5, 9}
3. With n being the size of the array, go through the pairs (A[i],B[i]) and for all duplicates (A[i]==A[i-1]), add n to B[i]:
   A={1,2, 2, 2,3, 3, 3,7,11,15}
   B={8,0,14,16,1,13,17,2, 5, 9}
4. Sort the pairs (A[i],B[i]) again, but now using B[i] as key:
   A={2,3,7,11,1,15, 3, 2, 2, 3}
   B={0,1,2, 5,8, 9,13,14,16,17}
A then contains the desired result.
Steps 1 and 3 are O(n) and steps 2 and 4 can be done in O(n log n), so overall complexity is O(n log n).
Note that this method also preserves the order of duplicates. If you want them sorted, you can assign indices n, n+1, ... in step 3 instead of adding n.
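The two-sort method described above can be sketched in Python as follows. This is an illustration only: `partition_duplicates` is a hypothetical name, and the sketch builds new lists instead of rearranging the input in place. It relies on Python's built-in sort being stable.

```python
def partition_duplicates(a):
    """Return (reordered list, number of unique values) using two stable sorts."""
    n = len(a)
    # Pair each value with its original index and sort by value (stable).
    pairs = sorted(((v, i) for i, v in enumerate(a)), key=lambda p: p[0])
    # Every repeat of a value gets n added to its index, pushing it past
    # all first occurrences when sorting by index.
    keyed = []
    for pos, (v, i) in enumerate(pairs):
        is_dup = pos > 0 and pairs[pos - 1][0] == v
        keyed.append((i + n if is_dup else i, v))
    # Sort by the adjusted index to restore first-occurrence order.
    keyed.sort(key=lambda p: p[0])
    unique_count = sum(1 for k, _ in keyed if k < n)
    return [v for _, v in keyed], unique_count


result, unique_count = partition_duplicates([2, 3, 7, 3, 2, 11, 2, 3, 1, 15])
print(result[:unique_count], result[unique_count:])
```

The extra memory is the list of (value, index) pairs, which is O(n).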
Here is a very important hint: when an algorithm is permitted O(n) extra space, that is not the same as saying it can only use the same amount of memory as the input array!
For example, given the input array int array[] = {2,3,7,3,2,11,2,3,1,15}; (10 elements). That is a total space of 10 * sizeof(int) bytes. On a typical 64-bit machine an int is 4 bytes long, making the array 40 bytes of data.
However, I can use more space for my extra array than just 40 bytes! In fact, I can make a histogram structure that looks like this:
struct histogram
{
bool is_used; // Is this element in use in the histogram?
int value; // The integer value represented by this element
size_t index; // The index in the output array of the FIRST instance of the value
size_t count; // The number of times the value appears in the source array
};
typedef struct histogram histogram;
And since that is a fixed, finite amount of space, I can feel totally free to allocate n of them!
histogram * new_histogram( size_t size )
{
return calloc( size, sizeof(struct histogram) );
}
On my machine that’s 240 bytes.
And yes, this absolutely, totally complies with the O(n) extra space requirement! (Because we are only using space for n extra items. Bigger items, yes, but only n of them.)
Goals
So, why make a histogram with all that extra stuff in it?
We are counting duplicates — suggesting that we should be looking at a Counting Sort, and hence, a histogram.
Accept integers in a range beyond [0,n).
The example array has 10 items, so our histogram should only have 10 slots. But there are integer values larger than 9.
Keep all the non-duplicate values in the same order as input
So we need to track the index of the first instance of each value in the input array.
We are obviously not sorting the data, but the basic idea behind a Counting Sort is to build a histogram and then use that histogram to overwrite the array with the ordered elements.
This is a powerful idea. We are going to tweak it.
The Algorithm
Remember that our input array is also our output array! So we will overwrite the array’s input values with our algorithm.
Let’s look at our example again:
value:  2    3    7    3    2    11   2    3    1    15
index:  0    1    2    3    4    •5   6    7    8    9
❶ Build the histogram:
        0    1    2    3    4    5    6    7    8    9    (index in histogram)
used?:  no   yes  yes  yes  yes  yes  no   yes  no   no
value:  0    11   2    3    1    15   0    7    0    0
index:  0    3    0    1    4    5    0    2    0    0
count:  0    1    3    3    1    1    0    1    0    0
I used a simple non-negative modulo function to get a hash index into the histogram: abs(value) % histogram_size, then found the first matching or unused entry, again modulo the histogram size. Our histogram has a single collision: 1 and 11 (mod 10) both hash to 1. Since we encountered 11 first it gets stored at index 1 of the histogram, and for 1 we had to seek to the first unused index: 4.
We can see that the duplicate values all have a count of 2 or more, and all non-duplicate values have a count of 1.
The magic here is the index value. Look at 11. Its index is 3, not 5. If we look at our desired output we can see why:
value:  2    3    7    11   1    15     2    2    3    3
index:  0    1    2    •3   4    5      6    7    8    9
The 11 is in index 3 of the output. This is a very simple counting trick when building the histogram. Keep a running index that we only increment when we first add a value to the histogram. This index is where the value should appear in the output!
❷ Use the histogram to put the non-duplicate values into the array.
Clearly, anything with a non-zero count appears at least once in the input, so it must also be output.
Here’s where our magic histogram index first helps us. We already know exactly where in the array to put the value!
value:  2    3    7    11   1    15
index:  0    1    2    3    4    5    ⟵ index into the array to put the value
You should take a moment to compare the array output index with the index values stored in the histogram above and convince yourself that it works.
❸ Use the histogram to put the duplicate values into the array.
So, at what index do we start putting duplicates into the array? Do we happen to have some magic index laying around somewhere that could help? From when we built the histogram?
Again stating the obvious, anything with a count greater than 1 is a value with duplicates. For each duplicate, put count-1 copies into the array.
We don’t care what order the duplicates appear, so we’ll just take them in the order they are stored in the histogram.
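The three passes can be sketched in Python like this. It is an illustration rather than the answer's own code: the histogram entries are plain lists instead of the C struct, and the details of the linear probing are my assumptions based on the description above.

```python
def partition(array):
    """Reorder array in place: first occurrences in input order, then duplicates.

    Returns the number of distinct values (the pivot).
    """
    n = len(array)
    # Each used slot is [value, output_index_of_first_occurrence, count].
    hist = [None] * n
    next_index = 0
    # Pass 1: build the histogram, probing linearly on collisions.
    for value in array:
        slot = abs(value) % n
        while hist[slot] is not None and hist[slot][0] != value:
            slot = (slot + 1) % n
        if hist[slot] is None:
            hist[slot] = [value, next_index, 1]
            next_index += 1
        else:
            hist[slot][2] += 1
    # Pass 2: write each distinct value at its recorded output index.
    for entry in hist:
        if entry is not None:
            array[entry[1]] = entry[0]
    # Pass 3: append count-1 copies of every duplicated value.
    out = next_index
    for entry in hist:
        if entry is not None:
            for _ in range(entry[2] - 1):
                array[out] = entry[0]
                out += 1
    return next_index


data = [2, 3, 7, 3, 2, 11, 2, 3, 1, 15]
pivot = partition(data)
print(data[:pivot], data[pivot:])
```

Overwriting the input in pass 2 is safe because the histogram already holds every value and count by then.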
Complexity
The complexity of a Counting Sort is O(n+k): one pass over the input array (to build the histogram) and one pass over the histogram data (to rebuild the array in sorted order).
Our modification is: one pass over the input array (to build the histogram), then one pass over the histogram to build the non-duplicate partition, then one more pass over the histogram to build the duplicates partition. That’s a complexity of O(n+2k).
In both cases it reduces to O(n), since the histogram has k = n slots. It is also an Ω(n) best case, making it Θ(n), as long as hash collisions stay rare. With the linear probing described above, an input whose values all collide can push building the histogram toward O(n^2).
Aaaaaahhhh! I gotta code that!!!?
Yep. It is only a tiny bit more complex than you are used to. Remember, you only need a few things:
An array of integer values (obtained from the user?)
A histogram array
A function to turn an integer value into an index into the histogram
A function that does the three things:
Build the histogram from the array
Use the histogram to write the non-duplicate values back into the array in the correct spots
Use the histogram to write the duplicate values to the end of the array
Ability to print an integer array
Your main() should look something like this:
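#include <stdio.h>
#include <stdlib.h>

// Assumed signatures for the two functions you still have to write
int partition( int * array, int size );
void print_array( const char * label, const int * array, int size );

int main(void)
{
  // Get number of integers to input
  int size = 0;
  scanf( "%d", &size );
  // Allocate and get the integers
  int * array = malloc( size * sizeof(int) );
  for (int n = 0; n < size; n++)
    scanf( "%d", &array[n] );
  // Partition the array between non-duplicate and duplicate values
  int pivot = partition( array, size );
  // Print the results
  print_array( "non-duplicates:", array, pivot );
  print_array( "duplicates:    ", array+pivot, size-pivot );
  free( array );
  return 0;
}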
Notice the complete lack of input error checking. You can assume that your professor will test your program without inputting hello or anything like that.
You can do this!

2D array minimum sum of Y elements and just two rows that we can chose to get minimum

Given a 2D array[X][Y], I have to find the smallest possible sum of Y elements, but:
the sum must be created by using just 2 rows,
each value must be from different index
Example:
for array
7 3 7 9
2 20 10 6
8 8 8 8
Result should be 18, as we get 3 + 7 from 1st row and 2 + 6 from 2nd.
I've been thinking about it for a few hours but I can't figure out how to deal with it.
Try this one here.
Method 1 (Naive Approach): Check every possible submatrix in given 2D
array. This solution requires 4 nested loops and time complexity of
this solution would be O(n^4).
Method 2 (Efficient Approach): Kadane’s algorithm for 1D array can be
used to reduce the time complexity to O(n^3).
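For the problem exactly as the question states it, a direct brute force over pairs of rows may be enough. The sketch below is an assumption about the intended reading: every column contributes exactly one value, and "just 2 rows" is taken to mean "at most two rows". `min_sum_two_rows` is a hypothetical name.

```python
from itertools import combinations


def min_sum_two_rows(matrix):
    """Smallest sum of one value per column, all drawn from one pair of rows."""
    best = None
    for row_a, row_b in combinations(matrix, 2):
        # For a fixed pair of rows, each column independently takes the smaller value.
        total = sum(min(a, b) for a, b in zip(row_a, row_b))
        if best is None or total < best:
            best = total
    return best


example = [
    [7, 3, 7, 9],
    [2, 20, 10, 6],
    [8, 8, 8, 8],
]
print(min_sum_two_rows(example))
```

With X rows and Y columns this examines X(X-1)/2 pairs at a cost of Y each. It returns None when the matrix has fewer than two rows.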

Efficient algorithm to print sum of elements at all possible subsequences of length 2 to n+1 [duplicate]

This question already has answers here:
Sum of products of elements of all subarrays of length k
(2 answers)
Permutation of array
(13 answers)
Closed 7 years ago.
I will start with an example. Suppose we have an array of size 3 with elements a, b and c like: (where a, b and c are some numerical values)
index: | 1 | 2 | 3 |
value: | a | b | c |
(Assume index starts from 1 as shown in the example above)
Now all possible increasing sub-sequence of length 2 are:
12 23 13
so the sum of product of elements at those indexes is required, that is, ab+bc+ac
For length 3 we have only one increasing sub-sequence, that is, 123 so abc should be printed.
For length 4 we have no sequence so 0 is printed and the program terminates.
So output for the given array will be:
ab+bc+ac,abc,0
So for example if the elements a, b and c are 1, 2 and 3 respectively then the output should be 11,6,0
Similarly, for an array of size 4 with elements a,b,c,d the output will be:
ab+ac+ad+bc+bd+cd,abc+abd+acd+bcd,abcd,0
and so on...
Now obviously brute force will be too inefficient for large value of array size. I was wondering if there is an efficient algorithm to compute the output for an array of given size?
Edit 1: I tried finding a pattern. For example for an array of size 4:
The first value we need is: (ab+ac+bc)+d(a+b+c) = ab+ac+ad+bc+bd+cd (Take A=ab+ac+bc)
then the second value we need is:(abc) +d(A) = abc+abd+acd+bcd(B=abc)
then the third value we need is : (0) +d(B) = abcd(Let's take 0 as C)
then the fourth value we need is: +d(C) = 0
But it still requires a lot of computation and I can't figure out an efficient way to implement this.
Edit 2: My question is different than this since:
I don't need all possible permutations. I need all possible increasing sub-sequences from length 2 to n+1.
I also don't need to print all possible such sequences, I just need the value thus obtained (as explained above) and hence I am looking for some maths concept or/and some dynamic programming approach to solve this problem efficiently.
Note I am finding the set of all possible such increasing sub-sequences based on the index value and then computing based on the values at those index position as explained above.
As a post that seems to have disappeared pointed out one way is to get a recurrence relation. Let S(n,k) be the sum over increasing subsequences (of 1..n) of length k of the product of the array elements indexed by the sequence. Such a subsequence either ends in n or not; in the first case it's the concatenation of a subsequence of length k-1 of 1..n-1 and {n}; in the second case it's a subsequence of 1..n-1 of length k. Thus:
S(n,k) = S(n-1,k) + A[n] * S(n-1,k-1)
For this always to make sense we need to add:
S(n,0) = 1
S(n,m) = 0 for m>n
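The recurrence above can be evaluated with a single one-dimensional table, as in this Python sketch. The function name is hypothetical; the table `s[k]` holds S(i,k) for the prefix processed so far and is updated from high k to low k so that `s[k-1]` still refers to the previous prefix.

```python
def subsequence_product_sums(values):
    """Return [S(n,2), S(n,3), ..., S(n,n+1)] for the given array."""
    n = len(values)
    # s[0] = S(i,0) = 1; every other entry starts at S(0,k) = 0.
    s = [1] + [0] * (n + 1)
    for v in values:
        # S(i,k) = S(i-1,k) + A[i] * S(i-1,k-1), updated in place.
        for k in range(n, 0, -1):
            s[k] += v * s[k - 1]
    return s[2:n + 2]


print(subsequence_product_sums([1, 2, 3]))
```

This takes O(n^2) multiplications in total, and the last entry is always 0 because no subsequence of length n+1 exists.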

Can we use binary search to find most frequently occuring integer in sorted array? [closed]

Problem:
Given a sorted array of integers find the most frequently occurring integer. If there are multiple integers that satisfy this condition, return any one of them.
My basic solution:
Scan through the array and keep track of how many times you've seen each integer. Since it's sorted, you know that once you see a different integer, you've gotten the frequency of the previous integer. Keep track of which integer had the highest frequency.
This is O(N) time, O(1) space solution.
I am wondering if there's a more efficient algorithm that uses some form of binary search. It will still be O(N) time, but it should be faster for the average case.
Asymptotically (big-oh wise), you cannot use binary search to improve the worst case, for the reasons the answers above mine have presented. However, here are some ideas that may or may not help you in practice.
For each integer, binary search for its last occurrence. Once you find it, you know how many times it appears in the array, and can update your counts accordingly. Then, continue your search from the position you found.
This is advantageous if you have only a few elements that repeat a lot of times, for example:
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
Because you will only do 3 binary searches. If, however, you have many distinct elements:
1 2 3 4 5 6
Then you will do O(n) binary searches, resulting in O(n log n) complexity, so worse.
This gives you a better best case and a worse worst case than your initial algorithm.
Can we do better? We could improve the worst case by finding the last occurrence of the number at position i like this: look at 2i, then at 4i etc. as long as the values at those positions are the same. If they are not, look at (i + 2i) / 2 etc.
For example, consider the array:
i
1 2 3 4 5 6 7 ...
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
We look at 2i = 2, it has the same value. We look at 4i = 4, same value. We look at 8i = 8, different value. We backtrack to (4 + 8) / 2 = 6. Different value. Backtrack to (4 + 6) / 2 = 5. Same value. Try (5 + 6) / 2 = 5, same value. We search no more, because our window has width 1, so we're done. Continue the search from position 6.
This should improve the best case, while keeping the worst case as fast as possible.
Asymptotically, nothing is improved. To see if it actually works better on average in practice, you'll have to test it.
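A minimal Python sketch of the doubling idea follows. It differs slightly from the description above: it doubles the offset from the start of the current run rather than the absolute position, and uses the standard `bisect_right` for the final narrowing. The function names are hypothetical.

```python
from bisect import bisect_right


def run_end(a, start):
    """Index one past the last occurrence of a[start] in the sorted list a."""
    value, n = a[start], len(a)
    step = 1
    # Gallop: double the offset while the probed element still equals value.
    while start + step < n and a[start + step] == value:
        step *= 2
    # The run ends between the last matching probe and the first failing one.
    lo = start + step // 2
    hi = min(start + step, n)
    return bisect_right(a, value, lo, hi)


def most_frequent(a):
    """Return (value, count) of the longest run in the sorted list a."""
    best_value, best_count, i = None, 0, 0
    while i < len(a):
        end = run_end(a, i)
        if end - i > best_count:
            best_value, best_count = a[i], end - i
        i = end
    return best_value, best_count


print(most_frequent([1] * 5 + [2] * 4 + [3] * 10))
```

A run of length r costs O(log r) probes here, so an array of all-distinct values costs a constant number of probes per element instead of a full binary search each.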
Binary search, which eliminates half of the remaining candidates, probably wouldn't work. There are some techniques you could use to avoid reading every element in the array. Unless your array is extremely long or you're solving a problem for curiosity, the naive (linear scan) solution is probably good enough.
Here's why I think binary search wouldn't work: start with an array: given the value of the middle item, you do not have enough information to eliminate the lower or upper half from the search.
However, we can scan the array in multiple passes, each time checking twice as many elements. When we find two elements that are the same, make one final pass. If no other elements were repeated, you've found the longest element run (without even knowing how many of that element is in the sorted list).
Otherwise, investigate the two (or more) longer sequences to determine which is longest.
Consider a sorted list.
Index  0 1 2 3 4 5 6 7 8 9 a b c d e f
List   1 2 3 3 3 3 3 3 3 4 5 5 6 6 6 7
Pass1  1 . . . . . . 3 . . . . . . . 7
Pass2  1 . . 3 . . . 3 . . . 5 . . . 7
Pass3  1 2 . 3 . x . 3 . 4 . 5 . 6 . 7
After pass 3, we know that the run of 3's must be at least 5, while the longest run of any other number is at most 3. Therefore, 3 is the most frequently occurring number in the list.
Using the right data structures and algorithms (use binary-tree-style indexing), you can avoid reading values more than once. You can also avoid reading the 3 (marked as an x in pass 3) since you already know its value.
This solution has running time O(n/k) which degrades to O(n) for k=1 for a list with n elements and a longest run of k elements. For small k, the naive solution will perform better due to simpler logic, data structures, and higher RAM cache hits.
If you need to determine the frequency of the most common number, it would take O((n/k) log k) as indicated by David to find the first and last position of the longest run of numbers using binary search on up to n/k groups of size k.
The worst case cannot be better than O(n) time. Consider the case where each element exists once, except for one element which exists twice. In order to find that element, you'd need to look at every element in the array until you find it. This is because knowing the value of any array element does not give you any information regarding the location of the duplicate element, until it's actually found. This is in contrast to binary search, where the value of an array element allows you to rule out many other elements.
No, in the worst case we have to scan at least n - 2 elements, but see
below for an algorithm that exploits inputs with many duplicates.
Consider an adversary that, for the first n - 3 distinct probes into the
n-element array, returns m for the value at index m. Now the algorithm
knows that the array looks like
1 2 3 ... i-1 ??? i+1 ... j-1 ??? j+1 ... k-1 ??? k+1 ... n-2 n-1 n.
Depending on what the ???s are, the sole correct answer could be j-1
or j+1, so the algorithm isn’t done yet.
This example involved an array where there were very few duplicates. In
fact, we can design an algorithm that, if the most frequent element
occurs k times out of n, uses O((n/k) log k) probes into the array. For
j from ceil(log2(n)) - 1 down to 0, examine the subarray consisting of
every (2**j)th element. Stop if we find a duplicate. The cost so far
is O(n/k). Now, for each element in the subarray, use binary search to
find its extent (O(n/k) searches in subarrays of size O(k), for a total
of O((n/k) log k)).
It can be shown that all algorithms have a worst case of Omega((n/k) log
k), making this one optimal in the worst case up to constant factors.
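The sampling algorithm described above might look like this in Python. It is a sketch under assumptions: the function name is hypothetical, samples are re-read on every pass for simplicity, and the search window of 4 × stride comes from my own reasoning that a pass without duplicates at stride 2s bounds every run below 4s elements.

```python
from bisect import bisect_left, bisect_right
from math import ceil, log2


def most_frequent_sampled(a):
    """Return (value, count) of the most frequent value in the sorted list a."""
    n = len(a)
    if n == 0:
        return None, 0
    # Halve the stride until two neighbouring samples agree (or stride is 1).
    step = 1
    for j in range(max(ceil(log2(n)) - 1, 0), -1, -1):
        step = 2 ** j
        idx = range(0, n, step)
        if any(a[p] == a[q] for p, q in zip(idx, idx[1:])):
            break
    # No run can span more than this many positions around a sample.
    window = 4 * step
    best_value, best_count = None, 0
    for p in range(0, n, step):
        v = a[p]
        if v == best_value:
            continue
        # Binary search for the extent of v inside the bounded window.
        lo = bisect_left(a, v, max(0, p - window), p + 1)
        hi = bisect_right(a, v, p, min(n, p + window + 1))
        if hi - lo > best_count:
            best_value, best_count = v, hi - lo
    return best_value, best_count


print(most_frequent_sampled([1, 2, 3, 3, 3, 3, 3, 3, 3, 4, 5, 5, 6, 6, 6, 7]))
```

The longest run always contains a sample at the final stride, because a pass that found a duplicate implies some run is longer than the stride.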

Making Minimal Changes to Change Range of the Array

Consider having an array filled with elements a0,a1,a2,....,a(n-1).
Consider that this array is sorted already; it will be easier to describe the problem.
Now the range of the array is defined as the biggest element - smallest element.
Say this range is some value x.
Now the problem I have is that, I want to change the elements in such a way that the range becomes less than/equal to some target value y.
I also have the additional constraint that I want to change minimal amount for each element. Consider an element a(i) that has value z. If I change it by r amount, this costs r^2.
Thus, what is an efficient algorithm to update this array to make the range less than or equal to target range y that minimizes the cost.
An example:
Array = [ 0, 3, 19, 20, 23 ] Target range is 17.
I would make the new array [ 3, 3, 19, 20, 20 ] . The cost is (3)^2 + (3)^2 = 18.
This is the minimal cost.
If you are adding/removing to some certain element a(i), you must add/remove that quantity q all at once. You can not remove 3 times 1 unit from a certain element, but must remove a quantity of 3 units once.
I think you can build two heaps from the array - one min-heap, one max-heap. Now you will take the top elements of both heaps and peek at the ones right under them and compare the differences. The one that has the bigger difference you will take and if that difference is bigger than you need, you will just take the required size and add the cost.
Now, if you had to take the whole difference and didn't achieve your goal, you will need to repeat this step. However, if you once again choose from the same heap, you have to remember to add the cost for the element you are taking out of the heap in that steps AND also for those that have been taken out of the processed heap before.
This yields an O(N*logN) algorithm, I'm not sure if it can be done faster.
Example:
Array [2,5,10,12] , I want difference 4.
First heap has 2 on top, second one 12. the 2 is 3 far from 5 and 12 is 2 far from 10 so I take the min-heap and the two will have to be changed by 3. So now we have a new situation:
[5, 10, 12]
The 12 is 2 far from 10 and we take it, subtract 2 and get new situation:
[5,10]
Now we can choose any heap, both differences are the same (the same numbers :-) ). We just need to change by 1, so we subtract 1 from 10 and get the right result. Now, because we changed 5 to 6 we would also have to change the number that was originally 12 once more to 9 so the resulting cost:
[2 - changed to 5, 5 - unchanged, 10 - changed to 9, 12 - changed to 9].
Here is a linear-time algorithm that minimizes the piecewise quadratic objective function. Probably it can be simplified.
Let the range be [x, x + y], where x is a variable. For different choices of x, there are at most 2n + 1 possibilities for which points lie in the range, arising from 2n critical values a0 - y, a1 - y, ..., a(n-1) - y, a0, a1, ..., a(n-1). One linear-time merge yields the critical values in sorted order. For each of the 2n - 1 intervals [w, z] between critical values where the range contains at least one point, we can construct and minimize a quadratic function consisting of a sum where every point aj less than w yields a term (x - aj)^2 and every point aj greater than z + y yields a term (x + y - aj)^2. The global minimum lies at the mean of aj (for terms of the first type) or aj - y (for terms of the second type); the endpoints of the interval must be checked as well. Naively, this gives a quadratic-time algorithm.
To get down to linear time, it suffices to update the sum preceding the mean computation incrementally. Each of the critical values has an associated event indicating whether the point responsible for it is entering or leaving the interval, meaning that that point's term should enter or leave the sum.
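Here is a Python sketch of the naive (quadratic-time) version of this idea, without the incremental update of the sums. It assumes a non-empty array and allows the left end x of the range to be any real number; if only integer changes are allowed, the candidates would need rounding. The function names are hypothetical.

```python
def cost(a, y, x):
    """Total squared change needed to clamp every element into [x, x + y]."""
    return sum(max(0, x - v) ** 2 + max(0, v - (x + y)) ** 2 for v in a)


def min_range_cost(a, y):
    """Return (x, cost) for the cheapest range [x, x + y]."""
    # Critical values: where a point starts or stops being clamped.
    crit = sorted(set(a) | {v - y for v in a})
    best_x, best = crit[0], cost(a, y, crit[0])
    for w, z in zip(crit, crit[1:]):
        mid = (w + z) / 2
        # Inside (w, z) the sets of clamped points do not change.
        targets = [v for v in a if v < mid] + [v - y for v in a if v > mid + y]
        candidates = [w, z]
        if targets:
            mean = sum(targets) / len(targets)
            candidates.append(min(max(mean, w), z))  # clamp the vertex to [w, z]
        for x in candidates:
            c = cost(a, y, x)
            if c < best:
                best_x, best = x, c
    return best_x, best


print(min_range_cost([0, 3, 19, 20, 23], 17))
```

Each interval recomputes its sums from scratch, which is what makes this version quadratic; the linear-time variant maintains them as points enter or leave.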
