Correctness of algorithm to find maximum in array - arrays

I want to show correctness of "Algorithm to find maximum element in array" using induction and contradiction.
ans = -infinity
for (i = 0; i < n; i++)
    ans = max(ans, A[i])
where A[0:n-1] is array and max is the function to return maximum of its two arguments.
What I am doing:
Base case: i=0, ans = max(-infinity, A[0]) = A[0]; as only one element has been processed, it is the maximum.
Induction Hypothesis: i=k<n-1, assume the algorithm correctly finds the maximum up to k iterations.
Inductive Step: i=k+1, let ans_{i} denote the maximum element obtained by the algorithm up to i steps and let ans'_{i} denote another maximum element from array A[0:i-1].
Then from the induction hypothesis, ans_{k} = ans'_{k}
Now, for the sake of contradiction, assume ans_{k+1} < ans'_{k+1}
Now, how should I proceed to show this contradiction ?
Any suggestion? Should I change this approach ?

When n is zero, we obtain -infinity. When n is one or higher, we obtain max2(ans_(n-1), A[n-1]). So induction works, unless max2(-infinity, x) returns -infinity, which it might if A[n-1] = NaN. The contradiction step actually shows that the function maybe isn't as rigorous as it should be.
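To make the loop invariant behind the induction concrete, here is a minimal Python sketch. The name `find_max` and the `check_invariant` flag are made up for illustration; the invariant asserted is "after processing index i, ans equals the maximum of A[0..i]", which is exactly the statement the inductive step needs.

```python
import math


def find_max(a, check_invariant=False):
    """Return the maximum of a, or -infinity for an empty list."""
    ans = -math.inf
    for i, x in enumerate(a):
        ans = max(ans, x)
        if check_invariant:
            # Invariant: ans is an upper bound for a[0..i] and occurs in a[0..i].
            prefix = a[: i + 1]
            assert all(ans >= y for y in prefix) and ans in prefix
    return ans


if __name__ == "__main__":
    print(find_max([3, 1, 4, 1, 5], check_invariant=True))
    # With a NaN input, Python's max(-inf, nan) keeps -inf, so the
    # invariant above would not hold; this is the caveat from the answer.
    print(find_max([float("nan")]))
```

The NaN case is left out of the invariant check on purpose: comparisons with NaN are always false, so the proof only goes through for totally ordered inputs.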

Related

Finding the Average case complexity of an Algorithm

I have an algorithm for Sequential search of an unsorted array:
SequentialSearch(A[0..n-1], K)
    i = 0
    while i < n and A[i] != K do
        i = i + 1
    if i < n then return i
    else return -1
Where we have an input array A[0...n-1] and a search key K
I know that the worst case is n comparisons, because we would have to search the entire array, hence O(n)
I know that the best case is 1, since that would mean the first item we search is the one we want, or the array has all the same items, either way it's O(1)
But I have no idea on how to calculate the average case. The answer my textbook gives is:
= (p/n)[1+2+...+i+...+n] + n(1-p)
Is there a general formula I can follow to calculate it when I see an algorithm like this one?
[Image: textbook example]
= (p/n)[1+2+...+i+...+n] + n(1-p)
Here p is the probability that the search key is found in the array. Since there are n elements, p/n is the probability of finding the key at any particular index. We are essentially computing a weighted average: a key at the first position costs 1 comparison, at the second position 2 comparisons, and so on up to n comparisons. Because we have to take all inputs into account, the second term n(1-p) covers the inputs where the key is not in the array: this happens with probability 1-p and costs n comparisons, since we search through the entire array.
You'd need to consider the input cases, something like equivalence classes of input, which depends on the context of the algorithm. If none of those things are known, then assuming that the input is an array of random integers, the average case would probably be O(n). This is because, roughly, you have no way of proving to a useful extent how often your query will be found in an array of N integer values in the range of ~-32k to ~32k.
More formally, let X be a discrete random variable denoting the number of elements of the array A that are needed to be scanned. There are n elements and since all positions are equally likely for inputs generated randomly, X ~ Uniform(1,n) where X = 1,..,n, given that search key is found in the array (with probability p), otherwise all the elements need to be scanned, with X=n (with probability 1-p).
Hence, P(X=x)=(1/n).p.I{x<n}+((1/n).p+(1-p)).I{x=n} for x = 1,..,n, where I{x=n} is the indicator function and will have value 1 iff x=n otherwise 0.
Average time complexity of the algorithm is the expected time taken to execute the algorithm when the input is an arbitrary sequence. By definition, E[X] = sum over x = 1,..,n of x.P(X=x) = (p/n)[1+2+...+n] + n(1-p) = p(n+1)/2 + n(1-p).
[Figure: time taken for searching the array as a function of n and p.]
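As a sanity check on the distribution above, the following Python sketch computes the expectation directly from the probability mass function and compares it with the textbook expression (p/n)[1+2+...+n] + n(1-p). The function names are illustrative, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def expected_comparisons(n, p):
    """E[X] computed term by term from P(X = x)."""
    p = Fraction(p)
    # P(X = x) = p/n for every position x, plus the extra mass (1 - p)
    # at x = n for unsuccessful searches.
    pmf = {x: p / n for x in range(1, n + 1)}
    pmf[n] += 1 - p
    assert sum(pmf.values()) == 1  # it is a valid distribution
    return sum(x * prob for x, prob in pmf.items())


def textbook_formula(n, p):
    """(p/n)(1 + 2 + ... + n) + n(1 - p), simplified to p(n+1)/2 + n(1-p)."""
    p = Fraction(p)
    return p * (n + 1) / 2 + n * (1 - p)


if __name__ == "__main__":
    for p in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(p, expected_comparisons(10, p), textbook_formula(10, p))
```

For p = 1 (key always present) this gives (n+1)/2, and for p = 0 it gives n, matching the intuition that an unsuccessful search scans everything.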

Adapting quickselect for smallest k elements in an array

I know that I can get the Kth order statistic (i.e. the kth smallest number in an array) by using quickselect in almost linear time, but what if I needed the k smallest elements of an array?
The wikipedia link has a pseudocode for the single-element lookup, but not for the k smallest elements lookup.
How should quickselect be modified to attain it in linear time (if possible) ?
I believe that after you use quickselect to find the k-th statistic, you will automatically find that the first k elements of the resulting array are the k smallest elements, only probably not sorted.
Moreover, quickselect actually partitions the array with respect to the k-th statistic: all the elements before the k-th statistic are smaller than (or equal to) it, and all the elements after it are bigger or equal. This is easy to prove.
Note, for example, what the documentation says for C++ nth_element:
The other elements are left without any specific order, except that
none of the elements preceding nth are greater than it, and none of
the elements following it are less.
If you need not just k smallest elements, but sorted k smallest elements, you can of course sort them after quickselect.
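A minimal Python sketch of this idea follows, assuming a random pivot and a Hoare-style partition (neither is prescribed by the answer, and `k_smallest` is a made-up name). It works on a copy, runs quickselect for position k-1, and then simply returns the first k slots.

```python
import random


def k_smallest(items, k):
    """Return the k smallest items in arbitrary order (expected linear time)."""
    a = list(items)
    if k <= 0:
        return []
    if k >= len(a):
        return a
    lo, hi = 0, len(a) - 1
    target = k - 1  # position of the k-th order statistic
    while lo < hi:
        pivot = a[random.randint(lo, hi)]
        i, j = lo, hi
        # Hoare-style partition of a[lo..hi] around the pivot value.
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Now a[lo..j] <= pivot <= a[i..hi]; anything strictly between equals pivot.
        if target <= j:
            hi = j
        elif target >= i:
            lo = i
        else:
            break  # a[target] already sits in its final position
    return a[:k]


if __name__ == "__main__":
    print(sorted(k_smallest([5, 2, 9, 1, 7, 3], 3)))
```

Sorting the returned slice afterwards costs an extra O(k log k) if the elements are needed in order.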
Actually modifying quickselect is not needed. If I had an array (called arrayToSearch in this example) and I wanted the k smallest items I'd do this:
int i;
int k = 10; // if you wanted the 10 smallest elements
int[] smallestItems = new int[k];
for (i = 0; i < k; i++)
{
    // quickselect(i, arr) is assumed to return the i-th smallest element of arr
    smallestItems[i] = quickselect(i, arrayToSearch);
}
Edit: I was under the assumption that k would be a relatively small number which would make the effective Big-O O(n). If not assuming k is small this would have a speed of O(k*n), not linear time. My answer is easier to comprehend, and applicable for most practical purposes. recursion.ninja's answer may be more technically correct, and therefore better for academic purposes.

Does "Find all triplets whose sum is less than some number" have any solution better than O(n^3) runtime? [duplicate]

I got asked this on an interview.
Given an array of ints, find all triplets whose sum is less than some number
After some scrambling I told the interviewer that the best solution would still lead to worst-case runtime O(n^3) and possibly would need O(n^3).
The interviewer blatantly disagreed with me and told me "you need to go back to your algorithms...".
Am I missing something?
A possible optimization will be:
Remove all elements in the array that are bigger than sum;
Sort the array;
Run O(N^2) to pick up a[i] + a[j], then binary search for sum - a[i] - a[j] in the range of [j + 1, N], the index is the number of possible candidates, but you should subtract j since they have been covered.
The complexity will be O(N^2 log N), slightly better.
You can solve this in O(n^2) time:
First, sort the array.
Then, loop over the array with the first pointer i.
Now, use a second pointer j to loop up from there and a third pointer k to simultaneously loop down from the end.
Whenever you're in a situation where A[i]+A[j]+A[k] < X, you know that the same holds for all k' with j < k' <= k, so increment your count by k-j and increment j. I keep the hidden invariant that A[i]+A[j]+A[k+1] >= X, so incrementing j only makes that statement stronger.
Otherwise, decrement k. When j and k meet, increment i.
You will only increment j and decrement k, so they need O(n) amortized time to meet.
In pseudocode:
count = 0
for i = 0; i < N; i++
    j = i+1
    k = N-1
    while j < k
        if A[i] + A[j] + A[k] < X
            count += k-j
            j++
        else
            k--
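The pseudocode translates almost line for line into Python. The sketch below uses a made-up function name and sorts a copy of the input; it counts triplets of distinct positions whose sum is strictly less than the limit.

```python
def count_triplets_below(items, limit):
    """Count index triples i < j < k (in sorted order) with sum < limit."""
    a = sorted(items)
    n = len(a)
    count = 0
    for i in range(n - 2):
        j, k = i + 1, n - 1
        while j < k:
            if a[i] + a[j] + a[k] < limit:
                # a[j] pairs with every element in positions j+1..k.
                count += k - j
                j += 1
            else:
                k -= 1
    return count


if __name__ == "__main__":
    print(count_triplets_below([1, 2, 3, 4, 5], 9))
```

Each value of i costs O(n) because j only moves up and k only moves down, so the total is O(n^2) after the O(n log n) sort.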
I see that you ask for all triplets. It is quite obvious that there can be O(n^3) triplets, so if you want them all you will need as much time, worst case.
This is an example of a problem where the output size matters. For example, if the array contains just 1, 2, 3, 4, 5, ..., n and the maximum value is set at 3n then every single triplet will be an answer, and you have to do Ω(n^3) work just to list them all. On the other hand, if the maximum value had been 0, it would be nice to finish in O(n) time after confirming all the items are too large.
Basically, we want an output-sensitive algorithm with a running time that's something like O(f(n) + t) where t is the output size and n is the input size.
An O(n^2 + t) algorithm would work by essentially tracking the transition points where triplets transitioned from being over the limit to under the limit. Then it would yield everything under that surface. The space is three-dimensional so the surface is two-dimensional, and you can track along it from point to point in aggregate constant time.
Here's some python code (untested!):
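def findTripletsBelow(items, limit):
    surfaceCoords = []
    s = sorted(items)
    for i in range(len(s)):
        k = len(s) - 1
        for j in range(i + 1, len(s)):
            # k only moves down while j moves up, so this is O(n) per i.
            # Positions are kept distinct (i < j < k) and the sum must be
            # strictly below the limit.
            while k > j and s[i] + s[j] + s[k] >= limit:
                k -= 1
            if k <= j: break
            surfaceCoords.append((i, j, k))
    results = []
    for (i, j, k) in surfaceCoords:
        for k2 in range(j + 1, k + 1):
            results.append((s[i], s[j], s[k2]))
    return results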
O(n^2) algorithm.
Sort the list.
For every element ai, this is how you calculate the number of combinations:
Binary search and find maximum aj such that j < i and ai+aj <= total.
Binary search and find maximum ak such that k < j and ai+aj+ak <= total
For this particular combination of (ai, aj), k is the number of sums that is less than or equal to total.
Now decrement j and increment k as much as possible (but ai+aj+ak <= total )
The total number of increments and decrements is less than i. So for a particular i the complexity is O(i). Therefore overall complexity is O(n^2).
I am leaving out many corner conditions, but this should give you an idea.
Edit:
In the worst case there are O(n^3) solutions. So outputting them explicitly would certainly require O(n^3) time. There is no way around it.
But if you want to return an implicit list (i.e. a compressed list of combinations) this would still work. An example of compressed output would be (ai, aj, ak) for k in 1:p.

Algorithm Olympiad : conditional minimum in array

I have an array A = [a1, a2, a3, a4, a5...] and I want to find two elements of the array, say A[i] and A[j] such that i is less than j and A[j]-A[i] is minimal and positive.
The runtime has to be O(nlog(n)).
Would this code do the job:
First sort the array and keep track of the original index of each element (i.e. the index of the element in the ORIGINAL, unsorted array).
Go through the sorted array and calculate the differences between any two successive elements that satisfy the initial condition that the original index of the bigger element is bigger than the original index of the smaller element.
The answer would be the minimum value of all these differences.
Here is how this would work on an example:
A = [0, -5, 10, 1]
In this case the result should be 1 coming from the difference between A[3] and A[0].
sort A : newA=[-5,0,1,10]
since OriginalIndex(-5)>OriginalIndex(0), do not compute the difference
since OriginalIndex(1)>OriginalIndex(0), we compute the difference = 1
since OriginalIndex(10)>OriginalIndex(1), we compute the difference = 9
The result is the minimal difference, which is 1.
Contrary to the claim made in the other post, there wouldn't be any problem regarding the runtime of your algorithm. Using heapsort, for example, the array could be sorted in O(n log n), which is the upper bound given in your question. An additional O(n) pass along the sorted array doesn't harm this, so you would still stay within O(n log n).
Unfortunately your answer still doesn't seem to be correct as it doesn't give the correct result.
Taking a closer look at the example given you should be able to verify that yourself. The array given in your example was: A=[0,-5,10,1]
Counting from 0 choosing indices i=2 and j=3 meets the given requirement i < j as 2 < 3. Calculating the difference A[j] - A[i] which with the chosen values comes down to A[3] - A[2] calculates to 1 - 10 = -9 which is surely less than the minimal value of 1 calculated in the example application of your algorithm.
Since you're minimising the distance between elements, they must be next to each other in the sorted list (if they weren't then the element in between would be a shorter distance to one of them -> contradiction). Your algorithm runs in O(nlogn) as specified so it looks fine to me.
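For experimenting with the approach from the question (sort while remembering original indices, then compare successive elements), here is a small Python sketch. The function name is made up, and the tie-break is an added assumption, not something from the thread: equal values are ordered by descending original index, so that a run of duplicates cannot hide a valid pair between neighbouring values.

```python
def min_positive_gap(a):
    """Smallest positive a[j] - a[i] with i < j, or None if no such pair exists."""
    # Original indices, ordered by value; ties broken by descending index.
    order = sorted(range(len(a)), key=lambda i: (a[i], -i))
    best = None
    for p, q in zip(order, order[1:]):
        diff = a[q] - a[p]
        # Keep only successive pairs where the larger value comes later
        # in the original array.
        if diff > 0 and p < q and (best is None or diff < best):
            best = diff
    return best


if __name__ == "__main__":
    print(min_positive_gap([0, -5, 10, 1]))
```

On the example from the question, A = [0, -5, 10, 1], this reports 1, coming from A[3] - A[0]. The sort dominates the cost, so the whole thing runs in O(n log n).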

Need idea for solving this algorithm puzzle

I've come across some similar problems to this one in the past, and I still haven't got a good idea how to solve this problem. The problem goes like this:
You are given a positive integer array with size n <= 1000 and k <= n which is the number of contiguous subarrays that you will have to split your array into. You have to output minimum m, where m = max{s[1],..., s[k]}, and s[i] is the sum of the i-th subarray. All integers in the array are between 1 and 100. Example :
Input:                      Output:
5 3   >> n = 5, k = 3       3
2 1 1 2 3
Splitting array into 2+1 | 1+2 | 3 will minimize the m.
My brute force idea was to make first subarray end at position i (for all possible i) and then try to split the rest of the array in k-1 subarrays in the best way possible. However, this is exponential solution and will never work.
So I'm looking for good ideas to solve it. If you have one please tell me.
Thanks for your help.
You can use dynamic programming to solve this problem, but you can actually solve with greedy and binary search on the answer. This algorithm's complexity is O(n log d), where d is the output answer. (An upper bound would be the sum of all the elements in the array.) (or O( n d ) in the size of the output bits)
The idea is to binary search on what your m would be - and then greedily move forward on the array, adding the current element to the partition unless adding the current element pushes it over the current m -- in that case you start a new partition. The current m is a success (and thus adjust your upper bound) if the numbers of partition used is less than or equal to your given input k. Otherwise, you used too many partitions, and raise your lower bound on m.
Some pseudocode:
// binary search on the answer m
binary_search ( array, N, k ) {
    lower = max( array ), upper = sum( array )
    while lower < upper {
        mid = ( lower + upper ) / 2
        // if the greedy is good
        if partitions( array, mid ) <= k
            upper = mid
        else
            lower = mid + 1   // mid itself failed; this also guarantees termination
    }
    return lower
}
// number of partitions the greedy needs when no partition sum may exceed m
partitions( array, m ) {
    count = 0
    running_sum = 0
    for x in array {
        if running_sum + x > m {
            running_sum = 0
            count++
        }
        running_sum += x
    }
    if running_sum > 0
        count++
    return count
}
This should be easier to come up with conceptually. Also note that because of the monotonic nature of the partitions function, you can actually skip the binary search and do a linear search, if you are sure that the output d is not too big:
// start at max( array ): the greedy count is only meaningful when every element fits
for i = max( array ) to infinity
    if partitions( array, i ) <= k
        return i
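A runnable Python sketch of the binary search plus greedy check described in this answer is given below. The function names are made up, the array is assumed to be non-empty with positive integers and 1 <= k <= n, and the lower bound moves to mid + 1 on failure so that the loop is guaranteed to terminate.

```python
def partitions_needed(array, m):
    """Greedy count of contiguous parts when no part may sum to more than m.

    Assumes m >= max(array), so every single element fits in a part.
    """
    count, running_sum = 1, 0
    for x in array:
        if running_sum + x > m:
            count += 1
            running_sum = 0
        running_sum += x
    return count


def min_largest_sum(array, k):
    """Smallest m such that the array splits into at most k parts of sum <= m."""
    lower, upper = max(array), sum(array)
    while lower < upper:
        mid = (lower + upper) // 2
        if partitions_needed(array, mid) <= k:
            upper = mid       # mid is feasible, try something smaller
        else:
            lower = mid + 1   # mid is infeasible
    return lower


if __name__ == "__main__":
    print(min_largest_sum([2, 1, 1, 2, 3], 3))
```

Using fewer than k parts is fine here: since k <= n, any part with more than one element can be split further without increasing the maximum.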
Dynamic programming. Make an array
int best[k+1][n+1];
where best[i][j] is the best you can achieve splitting the first j elements of the array into i subarrays. best[1][j] is simply the sum of the first j array elements. Having row i, you calculate row i+1 as follows:
for(j = i+1; j <= n; ++j){
    // the value of a split is the larger of "best for the first h elements"
    // and "sum of the last subarray"; we minimise that over h
    temp = max(best[i][i], arraysum[i+1 .. j]);
    for(h = i+1; h < j; ++h){
        if (max(best[i][h], arraysum[h+1 .. j]) < temp){
            temp = max(best[i][h], arraysum[h+1 .. j]);
        }
    }
    best[i+1][j] = temp;
}
best[k][n] will contain the solution. The algorithm is O(n^2*k), probably something better is possible.
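For reference, here is a compact Python sketch of this dynamic program. The function name and the use of prefix sums for the subarray sums are illustrative choices; the recurrence takes, for each split point h, the larger of best[i][h] and the sum of the last subarray, and minimises that over h. It assumes 1 <= k <= n.

```python
from itertools import accumulate


def min_largest_sum_dp(array, k):
    """best[i][j]: minimal achievable maximum when splitting the first j
    elements into exactly i non-empty contiguous subarrays."""
    n = len(array)
    prefix = [0] + list(accumulate(array))  # prefix[j] = sum of first j elements
    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    for j in range(1, n + 1):
        best[1][j] = prefix[j]
    for i in range(1, k):
        for j in range(i + 1, n + 1):
            # The last subarray covers elements h+1..j; the first h elements
            # are split into i subarrays, which needs h >= i.
            best[i + 1][j] = min(
                max(best[i][h], prefix[j] - prefix[h]) for h in range(i, j)
            )
    return best[k][n]


if __name__ == "__main__":
    print(min_largest_sum_dp([2, 1, 1, 2, 3], 3))
```

With n <= 1000 the O(n^2 * k) cost can be heavy in pure Python for large k, so this is mainly useful for checking faster approaches on small inputs.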
Edit: a combination of the ideas of ChingPing, toto2, Coffee on Mars and rds (in the order they appear as I currently see this page).
Set A = ceiling(sum/k). This is a lower bound for the minimum. To find a good upper bound for the minimum, create a good partition by any of the mentioned methods, moving borders until you don't find any simple move that still decreases the maximum subsum. That gives you an upper bound B, not much larger than the lower bound (if it were much larger, you'd find an easy improvement by moving a border, I think).
Now proceed with ChingPing's algorithm, with the known upper bound reducing the number of possible branches. This last phase is O((B-A)*n), finding B unknown, but I guess better than O(n^2).
I have a sucky branch and bound algorithm (please don't downvote me).
First take the sum of the array and divide by k, which gives you the best-case bound for your answer, i.e. the average A. Also we will keep the best solution seen so far for any branch, GO (global optimal). Let's consider that we put a (logical) divider as a partition unit after some array element, and we have to put k-1 dividers. Now we will put the dividers greedily this way:
Traverse the array elements, summing them up until you see that at the next position we will exceed A. Now make two branches, one where you put the divider at this position and another where you put it at the next position. Do this recursively and set GO = min(GO, answer for a branch).
If at any point in any branch we have a partition greater than GO, or the number of positions is less than the number of dividers left to be put, we bound. In the end you should have GO as your answer.
EDIT:
As suggested by Daniel, we could modify the divider placing strategy a little to place it until you reach a sum of elements of A or the remaining positions left are fewer than the dividers.
This is just a sketch of an idea... I'm not sure that it works, but it's very easy (and probably fast too).
You start say by putting the separations evenly distributed (it does not actually matter how you start).
Make the sum of each subarray.
Find the subarray with the largest sum.
Look at the right and left neighbor subarrays and move the separation on the left by one if the subarray on the left has a lower sum than the one on the right (and vice-versa).
Redo for the subarray with the current largest sum.
You'll reach some situation where you'll keep bouncing the separation between the same two positions which will probably mean that you have the solution.
EDIT: see the comment by @rds. You'll have to think harder about bouncing solutions and the end condition.
My idea, which unfortunately does not work:
Split the array in N subarrays
Locate the two contiguous subarrays whose sum is the least
Merge the subarrays found in step 2 to form a new contiguous subarray
If the total number of subarrays is greater than k, iterate from step 2, else finish.
If your array has random numbers, you can hope that a partition where each subarray has n/k is a good starting point.
From there
Evaluate this candidate solution, by computing the sums
Store this candidate solution. For instance with:
an array of the indexes of every sub-arrays
the corresponding maximum of sum over sub-arrays
Reduce the size of the max sub-array: create two new candidates: one with the sub-array starting at index+1 ; one with sub-array ending at index-1
Evaluate the new candidates.
If their maximum is higher, discard
If their maximum is lower, iterate on 2, except if this candidate was already evaluated, in which case it is the solution.

Resources