Understanding loop invariants: finding and proving them for an algorithm's loops

I need a little help understanding loop invariants. I just need someone to explain how to go about finding a loop invariant and proving it for the following algorithm:
public int sumSquares(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++)
    {
        sum += i * i;
    }
    return sum;
}

The loop invariant is a statement that is provably true immediately before and after each iteration of a loop. Turns out there are lots of these, but the trick is proving one that actually helps.
For example, it is true that
the value of i is always equal to i before and after each iteration.
Not terribly helpful. Instead, think about what you want to prove
After the iteration in which i equals n, sum is equal to the sum of the first n squares
In this case, if you want to prove your algorithm produces the sum of the first n squares, you'll want to state the following invariant: at the end of the loop body for a given i,
The value of sum is equal to the sum of the first i squares

You can also note the similarity to mathematical induction: when you prove that a property holds, you prove a base case and an inductive step.
Showing that the invariant holds before the first iteration is the base case. Showing that the invariant is preserved from one iteration to the next is the inductive step.

As said by others, the invariant is true before and after each iteration. In general it becomes invalid somewhere inside the loop and is later restored.
To establish a proof, you track conditions that hold between the individual statements and relate them to the invariant.
For the given example, follow the annotations; for clarity, the for statement is rewritten as a while.
int sum = 0;
int i = 1;
// This trivially establishes the invariant: sum == 0 == Σ k² for 0 <= k < i == 1
while (i <= n)
{
    // The invariant is still true here (nothing changed)
    sum += i * i;
    // We now have sum == Σ k² for 0 <= k <= i (note the <=; this differs from the invariant)
    i++;
    // As i has been increased, sum == Σ k² for 0 <= k < i is restored
}
// Here the invariant holds, as well as i == n+1
// Hence sum == Σ k² for 0 <= k < n+1, i.e. the sum of the first n squares
return sum;
In general the invariant expresses that we have solved a part of the problem. In the given example, it says that a part of the sum has been computed. Initially, the partial solution is trivial (and useless). As the iterations progress, the partial solution gets closer and closer to the complete one.
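To watch the invariant being maintained at run time, here is a minimal Python sketch of the same loop (the name sum_squares and the assertions are my own) that checks the invariant before the loop, after every iteration, and at exit:
def sum_squares(n):
    total = 0
    i = 1
    # Invariant: total == sum of k*k for 1 <= k < i
    assert total == sum(k * k for k in range(1, i))
    while i <= n:
        total += i * i
        i += 1
        # Restored after the full loop body, exactly as annotated above
        assert total == sum(k * k for k in range(1, i))
    # At exit the invariant holds together with i == n + 1
    assert i == n + 1
    return total

print(sum_squares(4))  # 30 == 1 + 4 + 9 + 16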

Related

Big O notation for these functions

I want to solve this question, but I am not sure whether I am right. I found O(n^2-n) = O(n^2):
double fact(long i)
{
    if (i == 1 || i == 0) return i;
    else return i * fact(i - 1);
}

funcQ2()
{
    for (i = 1; i <= n; i++)
        sum = sum + log(fact(i));
}
Your fact function is recursive, so you should start by writing the corresponding recurrence relation for the time complexity T(i):
T(0) = 1 // base case i==0
T(1) = 1 // base case i==1
T(i) = T(i-1) + 1 // because fact calls itself on i-1 and does one multiplication afterwards
It's easy to see that the solution to this recurrence relation is T(i) = i for all i > 0, so T(i) ∈ O(i).
Your second function funcQ2 has no inputs, and assuming that n is a constant, its complexity is trivially O(1). If, on the other hand, you assume n to be an input parameter and want to measure time complexity with respect to n, it would be O(n^2) since you are calling fact(i) within the loop (standard arithmetic series).
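To make the arithmetic series visible, here is a small Python sketch (the names fact_ops and func_q2_ops are mine, and n is taken as an explicit parameter) that counts the multiplications performed:
import math

def fact_ops(i):
    # Mirrors the recursive fact(), additionally counting multiplications.
    if i == 1 or i == 0:
        return i, 0
    f, ops = fact_ops(i - 1)
    return i * f, ops + 1

def func_q2_ops(n):
    # Mirrors funcQ2(), returning the total multiplication count.
    total_ops, s = 0, 0.0
    for i in range(1, n + 1):
        f, ops = fact_ops(i)
        s += math.log(f)
        total_ops += ops
    return total_ops

# The counts match the arithmetic series 0 + 1 + ... + (n-1) = n(n-1)/2:
for n in (10, 50, 200):
    print(n, func_q2_ops(n), n * (n - 1) // 2)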

c loop function computing time complexity

I am learning to compute the time complexity of algorithms.
I can handle simple and nested loops, but how do I compute the complexity when the loop variable is reassigned inside the loop body?
For example :
void f(int n){
    int count = 0;
    for (int i = 2; i <= n; i++){
        if (i % 2 == 0){
            count++;
        }
        else {
            i = (i - 1) * i;
        }
    }
}
i = (i-1)*i affects how many times the loop will run. How can I compute the time complexity of this function?
Since i * (i - 1) is always even ((i * (i - 1)) % 2 == 0), once the else branch runs, the following i++ makes i odd. As a result, after the first odd i in the loop, every subsequent iteration takes the else branch.
Therefore, after the first iteration i equals 3, which is odd and goes inside the else branch, and from then on each iteration updates i to i * (i - 1) + 1 (the assignment followed by the loop's i++). Hence, if we denote the number of iterations for bound n by T(n), we can write asymptotically: T(n) = T(√n) + 1. So, if n = 2^(2^k), then T(n) = k = log(log(n)).
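You can check this empirically; the following small Python simulation (a sketch of my own) mirrors the loop and counts its iterations:
import math

def iterations(n):
    i, count = 2, 0
    while i <= n:
        count += 1
        if i % 2 == 0:
            pass                # the count++ branch: constant work
        else:
            i = (i - 1) * i     # always even: a product of consecutive integers
        i += 1                  # the loop's i++; after the else branch i is odd again
    return count

for k in range(1, 6):
    n = 2 ** (2 ** k)           # n = 2^(2^k), so log2(log2(n)) == k
    print(n, iterations(n), math.log2(math.log2(n)))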
There is no general rule to calculate the time complexity for such algorithms. You have to use your knowledge of mathematics to get the complexity.
For this particular algorithm, I would approach it like this.
Since initially i=2 and it is even, let's ignore that first iteration.
So I am only considering from i=3. From there on, i will always be odd.
Your expression i = (i-1)*i along with the i++ in the for loop finally evaluates to i = (i-1)*i+1
If you consider i=3 as 1st iteration and i(j) is the value of i in the jth iteration, then i(1)=3.
Also
i(j) = [i(j-1)]^2 - i(j-1) + 1
The above equation is called a recurrence relation, and there are standard mathematical ways to solve it and get the value of i as a function of j. Sometimes a closed form can be obtained, and sometimes it is very difficult or impossible to find. Frankly, I don't know how to solve this one.
But generally, we don't get situations where you need to go that far. In practical situations, I would observe that i roughly squares on every iteration, so the number of iterations is doubly logarithmic, O(log(log(n))), consistent with the analysis above.

What is the time complexity of the following dependent loops?

I have a question that needs an answer before an exam I have this week.
i = 1;
while (i <= n)
{
    for (j = 1; j < i; j++)
        printf("*");
    j *= 2;
    i *= 3;
}
These are dependent loops. I calculated the outer loop's complexity to be O(log n).
The inner loop goes from 1 to i - 1 for every iteration of the outer loop.
The problem I'm having is that I don't know how to calculate the inner loop's time complexity, and then the overall complexity (I'm used to just multiplying both complexities, but I'm not sure that works here).
Thanks a lot!
P.S: I know that the j *= 2 doesn't affect the for loop.
As you recognized, computing the complexity of a loop nest where the bounds of an inner loop vary across iterations of the outer loop is not as easy as a simple multiplication of two iteration counts. You need to look more deeply to get the tightest possible bound.
The question can be taken to be asking how many times the body of the inner loop is executed, as a function of n. On the first outer-loop iteration, i is 1, so j is never less than i, so there are no inner-loop iterations. Next, i is 3, so there are two inner-loop iterations, then eight the next time, then 26 ... in short, on the t-th outer-loop iteration there are 3^(t-1) - 1 inner-loop iterations. You need to add those all up to compute the overall complexity.
Well, that sum is Σ_{t=1}^{⌊log₃ n⌋+1} (3^(t-1) - 1), so you could say that the complexity of the loop nest is
O(Σ_{t=1}^{⌊log₃ n⌋+1} (3^(t-1) - 1))
, but such an answer is unlikely to get you full marks.
We can simplify that by observing that our sum is bounded by a related one:
= O(Σ_{t=1}^{⌊log₃ n⌋+1} 3^(t-1))
. At this point (if not sooner) it would be useful to recognize the sum-of-powers pattern therein. It is often useful to know that 2^0 + 2^1 + ... + 2^(k-1) = 2^k - 1. This is closely related to base-2 numeric representation, and a similar formula can be written for any other natural-number base. For example, for base 3, it is 2·3^0 + 2·3^1 + ... + 2·3^(k-1) = 3^k - 1. This might be enough for you to intuit the answer: the total number of inner-loop iterations is bounded by a constant multiple of the number of inner-loop iterations on the last iteration of the outer loop, which in turn is bounded by n.
But if you want to prove it, then you can observe that the sum in the previous bound expression is itself bounded by a related definite integral:
= O(∫_0^{log₃ n + 1} 3^t dt)
... and that has a closed-form solution:
= O((3^(log₃ n + 1) - 1) / ln 3)
, which clearly has a simpler bound itself:
= O(3^(log₃ n))
. Exponentials of logarithms reduce to linear functions of the logarithm's argument: 3^(log₃ n) = n. Since we need only an asymptotic bound, we don't care about the constant factors, and thus we can go straight to
= O(n)
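For an empirical sanity check, here is a short Python sketch (names are mine) that counts the inner-loop executions and compares them to n:
def stars(n):
    # Count how many times the inner-loop body (the printf) would run.
    count = 0
    i = 1
    while i <= n:
        for j in range(1, i):   # for (j = 1; j < i; j++)
            count += 1
        i *= 3
    return count

for n in (10, 1000, 1000000):
    print(n, stars(n), stars(n) / n)   # the ratio stays bounded, hence O(n)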

Correctness of algorithm to find maximum in array

I want to show correctness of "Algorithm to find maximum element in array" using induction and contradiction.
ans=-infinity
for (i=0; i<n; i++)
ans= max(ans, A[i])
where A[0:n-1] is the array and max is the function that returns the maximum of its two arguments.
What I am doing:
Base case: i=0, ans = max(-infinity, A[0]) = A[0]; as only one element has been processed, it is the maximum.
Induction Hypothesis: for i=k<n-1, assume the algorithm correctly finds the maximum up to k iterations.
Inductive Step: for i=k+1, let ans_{i} denote the maximum element obtained by the algorithm up to i steps, and let ans'_{i} denote the true maximum of the subarray A[0:i-1].
Then from the induction hypothesis, ans_{k} = ans'_{k}
Now, for the sake of contradiction, assume ans_{k+1} < ans'_{k+1}
Now, how should I proceed to show this contradiction ?
Any suggestion? Should I change this approach ?
Where n is zero, we obtain -infinity. Where n is one or higher, we obtain max2(ans_(n-1), A[n-1]). So the induction works, unless max2(-infinity, x) returns -infinity, which it might if A[n-1] is NaN. The contradiction step actually shows that the function maybe isn't as rigorous as it should be.
To finish the contradiction directly: ans_{k+1} = max2(ans_{k}, A[k+1]) and ans'_{k+1} = max2(ans'_{k}, A[k+1]); since ans_{k} = ans'_{k} by the hypothesis, the two values are equal, which contradicts the assumed inequality.
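To make the induction hypothesis concrete, here is a Python sketch of the algorithm (the name find_max is mine) that asserts it after every iteration:
def find_max(A):
    ans = float("-inf")
    for i in range(len(A)):
        ans = max(ans, A[i])
        # Induction hypothesis after iteration i: ans is the maximum of A[0..i]
        assert ans == max(A[: i + 1])
    return ans

print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
print(find_max([]))  # -inf: the n == 0 case mentioned above
# Note: a NaN in A would break both max() and the assertion,
# matching the caveat about max2(-infinity, NaN) above.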

Does "Find all triplets whose sum is less than some number" have any solution better than O(n3) runtime? [duplicate]

This question already has answers here: Find all triplets in array with sum less than or equal to given sum
I got asked this on an interview.
Given an array of ints, find all triplets whose sum is less than some number
After some scrambling I told the interviewer that the best solution would still have worst-case runtime O(n^3), and possibly would inherently need O(n^3) time.
The interviewer blatantly disagreed with me and told me "you need to go back to your algorithms...".
Am I missing something?
A possible optimization would be:
Remove all elements in the array that are bigger than sum;
Sort the array;
Run an O(N^2) double loop to fix a[i] + a[j], then binary search for sum - a[i] - a[j] in the range [j + 1, N); the resulting index gives the number of possible candidates for the third element, but you should subtract j since those positions have already been covered.
The complexity will be O(N^2 log N), slightly better.
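As a concrete illustration, here is a Python sketch of that counting idea using bisect for the binary search (the function name and example values are mine; the pruning step is omitted for brevity):
from bisect import bisect_left

def count_triplets_below(a, total):
    # Count triplets a[i] + a[j] + a[k] < total with i < j < k.
    s = sorted(a)
    n = len(s)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            # First position in (j, n) whose value is >= total - s[i] - s[j];
            # everything after j and strictly before it is a valid third element.
            pos = bisect_left(s, total - s[i] - s[j], j + 1, n)
            count += pos - (j + 1)
    return count

print(count_triplets_below([5, 1, 3, 4, 7], 12))  # 4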
You can solve this in O(n^2) time:
First, sort the array.
Then, loop over the array with the first pointer i.
Now, use a second pointer j to loop up from there and a third pointer k to simultaneously loop down from the end.
Whenever you're in a situation where A[i]+A[j]+A[k] < X, you know that the same holds for every k' with j < k' <= k, so increment your count by k-j and increment j. I keep the hidden invariant that A[i]+A[j]+A[k+1] >= X, so incrementing j only makes that statement stronger.
Otherwise, decrement k. When j and k meet, increment i.
You will only ever increment j and decrement k, so they take O(n) amortized time to meet.
In pseudocode:
count = 0
for i = 0; i < N; i++
    j = i+1
    k = N-1
    while j < k
        if A[i] + A[j] + A[k] < X
            count += k-j
            j++
        else
            k--
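The same pseudocode as runnable Python (the function name is mine):
def count_triplets(A, X):
    A = sorted(A)  # the two-pointer argument requires a sorted array
    count = 0
    for i in range(len(A)):
        j, k = i + 1, len(A) - 1
        while j < k:
            if A[i] + A[j] + A[k] < X:
                count += k - j  # covers every k' with j < k' <= k
                j += 1
            else:
                k -= 1
    return count

print(count_triplets([5, 1, 3, 4, 7], 12))  # 4, agreeing with the sketch above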
I see that you ask for all triplets. It is quite obvious that there can be Θ(n^3) triplets, so if you want them all listed explicitly, you will need that much time in the worst case.
This is an example of a problem where the output size matters. For example, if the array contains just 1, 2, 3, 4, 5, ..., n and the maximum value is set at 3n, then every single triplet is an answer, and you have to do Ω(n^3) work just to list them all. On the other hand, if the maximum value were 0, it would be nice to finish in O(n) time after confirming all the items are too large.
Basically, we want an output-sensitive algorithm with a running time that's something like O(f(n) + t), where t is the output size and n is the input size.
An O(n^2 + t) algorithm would work by essentially tracking the transition points where triplets go from being over the limit to under the limit, then yielding everything under that surface. The space of index triples is three-dimensional, so the surface is two-dimensional, and you can track along it from point to point in aggregate constant time.
Here's some Python code (untested!):
def findTripletsBelow(items, limit):
    # Walk the "surface" of transition points, then emit everything below it.
    surfaceCoords = []
    s = sorted(items)
    for i in range(len(s)):
        k = len(s) - 1
        for j in range(i, len(s)):
            while k >= 0 and s[i] + s[j] + s[k] > limit:
                k -= 1
            if k < 0:
                break
            surfaceCoords.append((i, j, k))
    results = []
    for (i, j, k) in surfaceCoords:
        for k2 in range(k + 1):
            results.append((s[i], s[j], s[k2]))
    return results
O(n^2) algorithm.
Sort the list.
For every element a_i, this is how you calculate the number of combinations:
Binary search and find the maximum a_j such that j < i and a_i + a_j <= total.
Binary search and find the maximum a_k such that k < j and a_i + a_j + a_k <= total.
For this particular combination (a_i, a_j), k is the number of third elements that keep the sum less than or equal to total.
Now decrement j and increment k as much as possible (while keeping a_i + a_j + a_k <= total).
The total number of increments and decrements is less than i, so for a particular i the complexity is O(i). Therefore the overall complexity is O(n^2).
I am leaving out many corner conditions, but this should give you an idea.
Edit:
In the worst case there are Θ(n^3) solutions, so outputting them all explicitly necessarily takes O(n^3) time; there is no way around that.
But if you want to return an implicit list (i.e. a compressed list of combinations), this approach still works. An example of compressed output would be (a_i, a_j, a_k) for k in 1..p.
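A sketch of such a compressed output in Python, built on the two-pointer pass shown earlier (the function name is mine): instead of counting, record (i, j, k) to mean "A[i] and A[j] form a valid triplet with every element at indices j+1 through k":
def triplets_compressed(A, X):
    A = sorted(A)
    out = []
    for i in range(len(A)):
        j, k = i + 1, len(A) - 1
        while j < k:
            if A[i] + A[j] + A[k] < X:
                out.append((i, j, k))  # all k' with j < k' <= k work
                j += 1
            else:
                k -= 1
    return out

print(triplets_compressed([5, 1, 3, 4, 7], 12))  # [(0, 1, 4), (0, 2, 3)]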
