Find time complexity of a code that runs three loops - c

I have this piece of code and I would like to find its time complexity. I am preparing for interviews and I think this one is a bit tough.
int foo(int n)
{
    int sum = 0;
    int k, i, j;
    int t = 2;

    for (i = n/2; i > 0; i /= 2)
    {
        for (j = 0; j < i; j++)
        {
            for (k = 0; k < log2(t - 1); k++)
            {
                sum += bar(sum);
                // bar's time complexity is O(1) for all inputs
            }
        }
        t = pow(2, i);
    }
    return sum;   // foo is declared int, so return the accumulated value
}
I don't know why, but I am unable to bound this expression and find its complexity.
Any help on how to resolve this?

Since you've shown no progress, I'll give you top-level hints:
let log_t = log2(t). What is log_t as a function of i?
Note that the outer loop is executed log2(n) times.
How many total times is the j loop executed?
Pick a sample value of n, such as 32. For each value of i, how many times is the sum += statement executed? Can you generalize an equation for this based on n?

Let's write it down as:
(n/2)·log(1) + (n/4)·log(3) + ... + 1·log(n-1)
which is bounded above by
n · [log(2^i)/2^i for i in range 1...log2(n)]
= n · [log(2)/2 + log(4)/4 + log(8)/8 + ... + log(n)/n]
Taking base-2 logs, the bracketed series is sum(i=1..log2(n)) i/2^i, which is less than sum(i>=1) i/2^i = 2, a constant. This yields O(n).

Related

Find a tight upper bound on complexity of the below program: [duplicate]

This question already has answers here:
Why is the runtime of this code O(n^5)?
(3 answers)
Closed 3 years ago.
Find a tight upper bound on the complexity of this program.
I tried; I think the time complexity of this code is O(n^2).
void function(int n)
{
int count = 0;
for (int i=0; i<n; i++)
for (int j=i; j< i*i; j++)
if (j%i == 0)
{
for (int k=0; k<j; k++)
printf("*");
}
}
But the answer given is O(n^5). How?
EDIT: Yikes, after spending the time to answer this question, I discovered that this is a duplicate of a previous question that I had also answered three years ago. Oops!
The tightest bound you can get on this function's runtime is Θ(n^4). Here's the derivation.
This is a great place to illustrate a great general strategy for determining the big-O of a piece of code:
"When in doubt, work inside out!"
Let's take your code:
for (int i=0; i<n; i++)
{
for (int j=i; j< i*i; j++)
{
if (j%i == 0)
{
for (int k=0; k<j; k++)
{
printf("*");
}
}
}
}
Our approach for analyzing the runtime complexity will be to repeatedly take the innermost loop and replace it with the amount of work that it does. When we're done, we'll have our final time complexity.
Let's begin with this innermost loop:
for (int k=0; k<j; k++)
{
printf("*");
}
The amount of work done here is Θ(j), since the number of loop iterations is directly proportional to j and we do a constant amount of work per loop iteration. So let's replace this loop with the simpler "do Θ(j) work," giving us this simplified loop nest:
for (int i=0; i<n; i++)
{
for (int j=i; j< i*i; j++)
{
if (j%i == 0)
{
do Θ(j) work
}
}
}
Now, let's take aim at what's now the innermost loop:
for (int j=i; j < i*i; j++)
{
if (j%i == 0)
{
do Θ(j) work
}
}
This loop is unusual in that the amount of work that it does varies pretty dramatically from one iteration to the next. Specifically:
most iterations will do only O(1) work, but
one out of every i iterations will do Θ(j) work.
To analyze this loop, we'll therefore split the work apart into these two constituent pieces and see how much each contributes to the total.
First, let's look at the "easy" iterations of the loop, which do only O(1) work. There are a total of Θ(i^2) iterations of the loop (the loop starts counting at j = i and stops when j = i^2, and i^2 - i = Θ(i^2)). We can therefore bound the contribution of these "easy" loop iterations at O(i^2) work.
Now, what about the "hard" loop iterations? These iterations occur when j = i, when j = 2i, when j = 3i, when j = 4i, etc. Moreover, each of these "hard" iterations does work directly proportional to j during the iteration. This means that, if we add up the work across all of these iterations, the total work done is given by
i + 2i + 3i + 4i + ... + (i - 1)i.
We can simplify this as follows:
i + 2i + 3i + 4i + ... + (i - 1)i
= i(1 + 2 + 3 + ... + i-1)
= i · Θ(i^2)
= Θ(i^3).
This uses the fact that 1 + 2 + 3 + ... + k = k(k + 1) / 2 = Θ(k^2), which is Gauss's famous sum.
So now we see that the work done by the inner loop here is given by
O(i^2) work for the easy iterations, and
Θ(i^3) work for the hard iterations.
Summing this up, we see that the total work done by this inner loop is Θ(i^3). Continuing our process of working inside out, we can replace this inner loop with "do Θ(i^3) work" to get the following:
for (int i=0; i<n; i++)
{
do Θ(i^3) work
}
From here, we see that the work done is
1^3 + 2^3 + 3^3 + ... + (n - 1)^3,
and that sum is Θ(n^4). (Specifically, it's n^2(n - 1)^2 / 4.)
So overall, the theory predicts that the runtime should be Θ(n^4), which is a factor of n lower than the O(n^5) bound you mentioned above. How does the theory match the practice?
I ran this code on a variety of values of n and counted how many times that a star was printed. Here's the values I got back:
n = 500: 7760510375
n = 1000: 124583708250
n = 1500: 631407093625
n = 2000: 1996668166500
n = 2500: 4876304426875
n = 3000: 10113753374750
n = 3500: 18739952510125
n = 4000: 31973339333000
n = 4500: 51219851343375
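For reference, here is a minimal counting harness that reproduces these totals (a sketch rather than the exact program used: it steps j only through the multiples of i, since those are the only iterations that print anything, and it uses a 64-bit counter because the totals overflow 32 bits):

#include <stdio.h>

int main(void) {
    for (int n = 500; n <= 4500; n += 500) {
        unsigned long long stars = 0;
        for (long long i = 2; i < n; i++)             /* i = 0 and i = 1 contribute nothing */
            for (long long j = i; j < i * i; j += i)  /* only multiples of i pass the if */
                stars += j;                           /* the k loop prints j stars */
        printf("n = %d: %llu\n", n, stars);
    }
    return 0;
}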
If the runtime is Θ(n^4), then if we double the size of the input, we should scale the output by a factor of 16. If the runtime is Θ(n^5), then doubling the input size should scale the output by a factor of 32. Here's what I found:
Ratio of n = 1000 to n = 500: 16.0535
Ratio of n = 2000 to n = 1000: 16.0267
Ratio of n = 3000 to n = 1500: 16.0178
Ratio of n = 4000 to n = 2000: 16.0133
This strongly suggests that the runtime of this function is indeed Θ(n^4) rather than Θ(n^5).
Hope this helps!
I agree with the other poster, except the innermost loop is O(n^2), as k spans from 0 to j, which itself spans up to n^2. This gives us the desired answer of O(n^5).
The first loop
for (int i=0; i<n; i++)
Gives your first O(n) multiplier.
Next, the second loop
for (int j=i; j< i*i; j++)
Itself multiplies by O(n^2) complexity because i is "like" n here.
The third loop
for (int k=0; k<j; k++)
Multiplies by another O(n^2) because j is "like" n^2 here.
So you get complexity O(n^5).

complexity of functions in C

int f1(int N) {
int Sum, i, j, k;
Sum = 0;
for (i = 0; i < N; i++)
for (j = 0; j < i * i; j++)
for (k = 0; k < j; k++)
Sum++;
return Sum;
}
int f2(int N) {
int Sum, i, j;
Sum = 0;
for (i = 0; i < 10; i++)
for (j = 0; j < i; j++)
Sum += j * N;
return Sum;
}
What are the complexities of f1 and f2?
I have no idea about the complexity of f1, and I think the complexity of f2 should be O(1) since the number of iterations is constant. Is that correct?
Your first function has the complexity O(N^(1+2+2)) = O(N^5).
In the first loop i goes from 0 to N, in the second one j loops over a limit that depends on N^2, and in the third one k loops over an interval whose size depends on N^2 as well.
The function f2 is constant time, so O(1), because the loops do not have any degrees of freedom.
This kind of thing is studied in algorithms courses under the topic of "complexity".
There is also another way of measuring the complexity of algorithms, based on Ω (omega) notation.
The complexity of f1 is in O(n^5) since
for(i=0; i<N; i++) //i has an upper bound of n
for(j=0; j<i*i; j++) //j has an upper bound of i^2; since i is bounded by n, this is n^2
for(k=0; k<j; k++) //k has an upper bound of j, which is n^2
Sum++; //constant
So the complete upper bound is n * n^2 * n^2 which is n^5 so f1 is in O(n^5).
for (i = 0; i < 10; i++) //upper bound of 10 so in O(10) which is in O(1)
for (j = 0; j < i; j++) //upper bound of i so also O(1)
Sum += j * N; //N is just an integer here, and multiplication is a constant operation independent of the size of N, so also O(1)
So f2 is in O(1*1*1) which is simply O(1).
Note that all assignments and declarations are also constant time.
BTW, since Sum++ has no side effects and, with the corresponding loops, develops a series we know a closed form for (math, yay), a programmer or an optimising compiler could reduce f1 to a constant-time program using the Gaussian sum formula (n*n + n)/2, so Sum could be calculated by something like ((N*N + N)/2) * ((N*N*N*N + N*N)/2) * 2; however, my formula does not account for the loops starting at 0.
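As a quick empirical check (a sketch of my own, not part of the answer above): since Sum in f1 is exactly the number of Sum++ executions, you can tabulate it for doubling N and look for a factor of 2^5 = 32. The innermost k loop contributes exactly j increments, so the sketch collapses it to keep the check fast, and it uses a 64-bit accumulator because the count overflows int for larger N:

#include <stdio.h>

/* Counts the Sum++ executions of f1. The innermost k loop adds exactly j,
   so it is collapsed to "Sum += j"; the accumulator is 64-bit. */
long long f1_count(int N) {
    long long Sum = 0;
    for (long long i = 0; i < N; i++)
        for (long long j = 0; j < i * i; j++)
            Sum += j;
    return Sum;
}

int main(void) {
    /* If f1 is Theta(N^5), doubling N should multiply the count by about 32. */
    long long prev = f1_count(100);
    for (int N = 200; N <= 800; N *= 2) {
        long long cur = f1_count(N);
        printf("N = %3d  count = %18lld  ratio = %.1f\n", N, cur, (double)cur / prev);
        prev = cur;
    }
    return 0;
}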
Using sigma notation:
f1:
The outer loop runs from 0 to N, the one inside it runs from 0 to i^2, and the last one runs from 0 to j; inside we only have one operation, so we are summing 1. Thus we get:
sum(i=0..N-1) sum(j=0..i^2-1) sum(k=0..j-1) 1
1+1+1... j times gives 1*j = j, thus we get:
sum(i=0..N-1) sum(j=0..i^2-1) j
Using the rule for the summation of natural numbers, but replacing n (in the Wikipedia article) with i^2, we get:
sum(i=0..N-1) (i^2 - 1)*i^2/2 ≈ sum(i=0..N-1) i^4/2
The reason for the approximation is that, when finding the time complexity of a function and we have a sum of multiple powers, we keep only the highest one. This just makes the math simpler. For example f(n) = n^3 + n^2 + n = O(n^3) (supposing that f(n) represents the maximal running time required by the given algorithm for input size n).
And using the formula for the summation of the first N numbers to the 4th power, we get (see the note at the end):
sum(i=0..N-1) i^4/2 ≈ N^5/10
Thus the time complexity for f1 is O(n^5).
f2:
Using the same method we get:
sum(i=0..9) sum(j=0..i-1) 1 = sum(i=0..9) i = 45, a constant.
But this just gives a constant which doesn't depend on n thus the time complexity for f2 is O(1).
note:
When we have a summation of the first N numbers raised to the K-th power, it is of order N^(K+1), so you don't need to remember the exact formula. For example:
sum(i=1..N) i^4 = Θ(N^5)

What is the complexity of this sum algorithm?

#include <stdio.h>
int main() {
int N = 8; /* for example */
int sum = 0;
for (int i = 1; i <= N; i++)
for (int j = 1; j <= i*i; j++)
sum++;
printf("Sum = %d\n", sum);
return 0;
}
For each value of i, j goes up to i*i, which is at most n^2. So the complexity will be n · n^2 = n^3. Is that correct?
If problem becomes:
#include <stdio.h>
int main() {
int N = 8; /* for example */
int sum = 0;
for (int i = 1; i <= N; i++)
for (int j = 1; j <= i*i; j++)
for (int k = 1; k <= j*j; k++)
sum++;
printf("Sum = %d\n", sum);
return 0;
}
Then do you take the existing n^3 times n^2 = n^5? Is that correct?
We have i and j < i*i and k < j*j which is x^1 * x^2 * (x^2)^2 = x^3 * x^4 = x^7 by my count.
In particular, since 1 <= i <= N we have O(N) for the i loop. Since 1 <= j <= i^2 <= N^2 we have O(N^2) for the second loop. Extending the logic, we have 1 <= k <= j^2 <= (i^2)^2 <= N^4 for the third loop.
Going from the inner loops to the outer ones: the k loop executes up to N^4 times per iteration of the j loop, the j loop executes up to N^2 times per iteration of the i loop, and the i loop executes up to N times, making the total of order N^4 * N^2 * N = N^7 = O(N^7).
I think the complexity is actually O(n^7).
The first loop executes N steps.
The second loop executes N^2 steps.
In the third loop, j*j can reach N^4, so it has O(N^4) complexity.
Overall, N * N^2 * N^4 = O(N^7)
For i = 1 the inner loop runs 1*1 times, for i = 2 the inner loop runs 2*2 times, ..., and for i = N the inner loop runs N*N times. Its complexity is (1*1 + 2*2 + 3*3 + ... + N*N), which is of order O(N^3).
In the second case, for i = N the first inner loop iterates N*N times, and hence the second (innermost) inner loop will iterate up to (N*N) * (N*N) times. Hence the complexity is of order N * N^2 * N^4, i.e., O(N^7).
Yes. In the first example, the i loop runs N times, and the inner j loop runs i*i times, which is O(N^2). So the whole thing is O(N^3).
In the second example there is an additional O(N^4) loop (the loop up to j*j), so it is O(N^7) overall.
For a more formal proof, work out how many times sum++ is executed in terms of N, and look at the highest polynomial order of N. In the first example it will be a(N^3)+b(N^2)+c(N)+d (for some values of a, b, c and d), so the answer is 3.
NB: Edited re example 2 to say it's O(N^4): misread i*i for j*j.
Consider the number of times all loops will be called.
int main() {
int N = 8; /* for example */
int sum = 0;
for (int i = 1; i <= N; i++) /* Called N times */
for (int j = 1; j <= i*i; j++) /* Called up to N*N times for each of the N values of i */
for (int k = 1; k <= j*j; k++) /* Called up to N^2*N^2 times for each j, with j up to N^2 and i up to N */
sum++;
printf("Sum = %d\n", sum);
return 0;
}
Thus the sum++ statement is called O(N^4)*O(N^2)*O(N) = O(N^7) times, and this is the overall complexity of the program.
The incorrect way to solve this (although common, and often gives the correct answer) is to approximate the average number of iterations of an inner loop with its worst-case. Here, the inner loop loops at worst O(N^4), the middle loop loops at worst O(N^2) times and the outer loop loops O(N) times, giving the (by chance correct) solution of O(N^7) by multiplying these together.
The right way is to work from the inside out, being careful to be explicit about what's being approximated.
The total number of executions, T, of the increment instruction can be read off directly from your code. Just writing it out:
T = sum(i=1..N)sum(j=1..i^2)sum(k=1..j^2)1.
The innermost sum is just j^2, giving:
T = sum(i=1..N)sum(j=1..i^2)j^2
The sum indexed by j is a sum of squares of consecutive integers. We can calculate that exactly: sum(j=1..n)j^2 is n*(n+1)*(2n+1)/6. Setting n=i^2, we get
T = sum(i=1..N)i^2*(i^2+1)*(2i^2+1)/6
We could continue to compute the exact answer, by using the formula for sums of 6th, 4th and 2nd powers of consecutive integers, but it's a pain, and for complexity we only care about the highest power of i. So we can approximate.
T = sum(i=1..N)(i^6/3 + o(i^5))
We can now use that sum(i=1..N)i^p = Theta(N^{p+1}) to get the final result:
T = Theta(N^7)
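If you want to check this numerically (again a sketch of my own, not from the answer): the innermost loop contributes exactly j^2 increments, so it can be collapsed and T(N) tabulated quickly; for Θ(N^7), doubling N should multiply T by roughly 2^7 = 128:

#include <stdio.h>

/* T(N) = sum over i, j of j^2: the innermost k loop adds exactly j*j to sum. */
long long T(int N) {
    long long total = 0;
    for (long long i = 1; i <= N; i++)
        for (long long j = 1; j <= i * i; j++)
            total += j * j;
    return total;
}

int main(void) {
    long long prev = T(50);
    for (int N = 100; N <= 400; N *= 2) {
        long long cur = T(N);
        printf("N = %3d  T = %20lld  ratio = %.1f\n", N, cur, (double)cur / prev);
        prev = cur;
    }
    return 0;
}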

complexity for nested loops

I am trying to figure out the complexity of a for loop using Big O notation. I have done this before in my other classes, but this one is more rigorous than the others because it is on the actual algorithm. The code is as follows:
for(i=n ; i>1 ; i/=2) //for any size n
{
for(j = 1; j < i; j++)
{
x += a;
}
}
and
for(i=1 ; i<=n;i++,x=1) //for any size n
{
for(j = 1; j <= i; j++)
{
for(k = 1; k <= j; x+=a,k*=a)
{
}
}
}
I have concluded that the first loop is of O(n) complexity because it goes through the list n times. As for the second loop, I am a little lost!
Thanks for the help with the analysis. Each fragment stands on its own; they are not combined into one program.
Consider the first code fragment,
for(i=n ; i>1 ; i/=2) //for any size n
{
for(j = 1; j < i; j++)
{
x += a;
}
}
The instruction x += a is executed a total of n + n/2 + n/4 + ... + 1 times.
This is the sum of the first log2(n) terms of a G.P. with first term n and common ratio 1/2, which is n·(1 - (1/2)^(log2 n)) / (1 - 1/2) = 2n - 2. Thus the complexity of the first code fragment is O(n).
Now consider the second code fragment,
for(i=1 ; i<=n; i++,x=1)
{
for(j = 1; j <= i; j++)
{
for(k = 1; k <= j; x+=a,k*=a)
{
}
}
}
The two outer loops together call the innermost loop a total of n(n+1)/2 times. The innermost loop is executed at most log_a(n) times. Thus the total time complexity of the second code fragment is O(n^2 · log_a(n)).
You may formally proceed like the following (the derivations here were posted as rendered equations, which are not reproduced):
Fragment 1: a closed-form expression for the iteration count.
Fragment 2: a derivation in terms of the Pochhammer symbol, the Barnes G-function, and Stirling's approximation, using log(G(n)).
[UPDATE of Fragment 2]: a refinement based on the publication "Discrete Loops and Worst Case Performance" by Dr. Johann Blieberger (all cases verified for a = 2).
EDIT: I agree the first code block is O(n).
You decrement the outer loop counter i by dividing by 2, and the inner loop runs i times, so the number of iterations will be a sum over all the powers of two less than or equal to n and greater than 0, which is 2^(log2(n)+1) - 1 = 2n - 1, so O(n).
The second code block is O(log_a(n) · n^2), assuming a is a constant.
The two outermost loops equate to a sum of all the numbers less than or equal to n, which is n(n+1)/2, so O(n^2). Finally, the inner loop runs over the powers of a up to an upper bound of n, which is O(log_a(n)).
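If it helps to see the two bounds numerically, here is a small counting sketch (my own illustration, assuming a is a constant such as 2, since the fragments leave a unspecified); it counts how many times the innermost statement of each fragment executes:

#include <stdio.h>

int main(void) {
    int a = 2;   /* assumption: a is a constant greater than 1 */
    for (int n = 1000; n <= 4000; n *= 2) {
        long long c1 = 0, c2 = 0;

        /* Fragment 1: count executions of x += a; expect roughly 2n. */
        for (int i = n; i > 1; i /= 2)
            for (int j = 1; j < i; j++)
                c1++;

        /* Fragment 2: count innermost iterations; expect roughly (n^2/2) * log_a(n). */
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= i; j++)
                for (long long k = 1; k <= j; k *= a)
                    c2++;

        printf("n = %4d  fragment 1: %8lld  fragment 2: %12lld\n", n, c1, c2);
    }
    return 0;
}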

Determining the complexities given codes

Given a snippet of code, how do you determine its complexity in general? I find myself getting very confused with Big O questions. For example, a very simple question:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
System.out.println("*");
}
}
The TA explained this with something like combinations. Like this is n choose 2 = (n(n-1))/2 = 0.5n^2 - 0.5n, then remove the constants so it becomes n^2. I can put in test values and try, but how does this combination thing come in?
What if there's an if statement? How is the complexity determined?
for (int i = 0; i < n; i++) {
if (i % 2 ==0) {
for (int j = i; j < n; j++) { ... }
} else {
for (int j = 0; j < i; j++) { ... }
}
}
Then what about recursion ...
int fib(int a, int b, int n) {
if (n == 3) {
return a + b;
} else {
return fib(b, a+b, n-1);
}
}
In general, there is no way to determine the complexity of a given function
Warning! Wall of text incoming!
1. There are very simple algorithms that no one knows whether they even halt or not.
There is no algorithm that can decide whether a given program halts or not, if given a certain input. Calculating the computational complexity is an even harder problem since not only do we need to prove that the algorithm halts but we need to prove how fast it does so.
//The Collatz conjecture states that the sequence generated by the following
// algorithm always reaches 1, for any initial positive integer. It has been
// an open problem for 70+ years now.
function col(n){
if (n == 1){
return 0;
}else if (n % 2 == 0){ //even
return 1 + col(n/2);
}else{ //odd
return 1 + col(3*n + 1);
}
}
2. Some algorithms have weird and off-beat complexities
A general "complexity determining scheme" would easily get too complicated because of these guys
//The Ackermann function. One of the first examples of a non-primitive-recursive algorithm.
function ack(m, n){
if(m == 0){
return n + 1;
}else if( n == 0 ){
return ack(m-1, 1);
}else{
return ack(m-1, ack(m, n-1));
}
}
function f(n){ return ack(n, n); }
//f(1) = 3
//f(2) = 7
//f(3) = 61
//f(4) takes longer than your wildest dreams to terminate.
3. Some functions are very simple but will confuse lots of kinds of static analysis attempts
//McCarthy's 91 function. Try guessing what it does without
// running it or reading the Wikipedia page ;)
function f91(n){
if(n > 100){
return n - 10;
}else{
return f91(f91(n + 11));
}
}
That said, we still need a way to find the complexity of stuff, right? For loops are a simple and common pattern. Take your initial example:
for(i=0; i<N; i++){
for(j=0; j<i; j++){
print something
}
}
Since each print something is O(1), the time complexity of the algorithm will be determined by how many times we run that line. Well, as your TA mentioned, we do this by looking at the combinations in this case. The inner loop will run (0 + 1 + ... + (N-1)) times, for a total of N(N-1)/2.
Since we disregard constants we get O(N^2).
Now for the more tricky cases we can get more mathematical. Try to create a function whose value represents how long the algorithm takes to run, given the size N of the input. Often we can construct a recursive version of this function directly from the algorithm itself, and so calculating the complexity becomes the problem of putting bounds on that function. We call this function a recurrence.
For example:
function fib_like(n){
if(n <= 1){
return 17;
}else{
return 42 + fib_like(n-1) + fib_like(n-2);
}
}
it is easy to see that the running time, in terms of N, will be given by
T(N) = 1 if (N <= 1)
T(N) = T(N-1) + T(N-2) otherwise
Well, T(N) is just the good-old Fibonacci function. We can use induction to put some bounds on that.
For example, let's prove, by induction, that T(N) <= 2^N for all N (i.e., T(N) is O(2^N)).
base case: N = 0 or N = 1
T(0) = 1 <= 1 = 2^0
T(1) = 1 <= 2 = 2^1
inductive case (N > 1):
T(N) = T(N-1) + T(N-2)
applying the inductive hypothesis to T(N-1) and T(N-2)...
T(N) <= 2^(N-1) + 2^(N-2)
so..
T(N) <= 2^(N-1) + 2^(N-1)
<= 2^N
(we can try doing something similar to prove the lower bound too)
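As a quick numeric sanity check (an illustration, not part of the proof), you can tabulate T(N) from the recurrence and compare it with the 2^N bound:

#include <stdio.h>

int main(void) {
    /* T(N) = T(N-1) + T(N-2), T(0) = T(1) = 1, compared against the 2^N bound. */
    long long T[41];
    T[0] = 1; T[1] = 1;
    for (int n = 2; n <= 40; n++)
        T[n] = T[n - 1] + T[n - 2];
    for (int n = 0; n <= 40; n += 10)
        printf("N = %2d  T(N) = %12lld  2^N = %15lld\n", n, T[n], 1LL << n);
    return 0;
}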
In most cases, having a good guess on the final runtime of the function will allow you to easily solve recurrence problems with an induction proof. Of course, this requires you to be able to guess first - only lots of practice can help you here.
And as a final note, I would like to point out the Master theorem, the only rule for more difficult recurrence problems I can think of that is commonly used. Use it when you have to deal with a tricky divide and conquer algorithm.
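For instance, for a divide-and-conquer recurrence of the form T(n) = 2T(n/2) + Θ(n): here a = 2, b = 2, and f(n) = Θ(n) = Θ(n^(log_2 2)), so the theorem's balanced case gives T(n) = Θ(n log n).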
Also, in your "if case" example, I would solve that by cheating and splitting it into two separate loops that don't have an if inside.
for (int i = 0; i < n; i++) {
if (i % 2 ==0) {
for (int j = i; j < n; j++) { ... }
} else {
for (int j = 0; j < i; j++) { ... }
}
}
Has the same runtime as
for (int i = 0; i < n; i += 2) {
for (int j = i; j < n; j++) { ... }
}
for (int i = 1; i < n; i+=2) {
for (int j = 0; j < i; j++) { ... }
}
And each of the two parts can be easily seen to be O(N^2) for a total that is also O(N^2).
Note that I used a trick to get rid of the "if" here. There is no general rule for doing so, as shown by the Collatz algorithm example.
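To convince yourself that the two versions really do the same amount of work, here is a small counting check (a sketch; the counters stand in for the loop bodies):

#include <stdio.h>

int main(void) {
    int n = 1000;

    /* Original form: an if inside the loop. */
    long long with_if = 0;
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0)
            for (int j = i; j < n; j++) with_if++;
        else
            for (int j = 0; j < i; j++) with_if++;
    }

    /* Split form: two loops without the if. */
    long long split = 0;
    for (int i = 0; i < n; i += 2)
        for (int j = i; j < n; j++) split++;
    for (int i = 1; i < n; i += 2)
        for (int j = 0; j < i; j++) split++;

    printf("with_if = %lld, split = %lld\n", with_if, split);  /* identical counts */
    return 0;
}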
In general, deciding algorithm complexity is theoretically impossible.
However, one cool and code-centric method for doing it is to actually just think in terms of programs directly. Take your example:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
System.out.println("*");
}
}
Now we want to analyze its complexity, so let's add a simple counter that counts the number of executions of the inner line:
int counter = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
System.out.println("*");
counter++;
}
}
Because the System.out.println line doesn't really matter, let's remove it:
int counter = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
counter++;
}
}
Now that we have only the counter left, we can obviously simplify the inner loop out:
int counter = 0;
for (int i = 0; i < n; i++) {
counter += n;
}
... because we know that the increment is run exactly n times. And now we see that counter is incremented by n exactly n times, so we simplify this to:
int counter = 0;
counter += n * n;
And we emerged with the (correct) O(n^2) complexity :) It's there in the code :)
Let's look how this works for a recursive Fibonacci calculator:
int fib(int n) {
if (n < 2) return 1;
return fib(n - 1) + fib(n - 2);
}
Change the routine so that it returns the number of iterations spent inside it instead of the actual Fibonacci numbers:
int fib_count(int n) {
if (n < 2) return 1;
return fib_count(n - 1) + fib_count(n - 2);
}
It's still Fibonacci! :) So we know now that the recursive Fibonacci calculator is of complexity O(F(n)) where F is the Fibonacci number itself.
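As a rough check (a sketch, not part of the original answer): because fib_count obeys the same recurrence as fib, the work grows by a factor approaching the golden ratio (about 1.618) each time n increases by 1, i.e. exponentially:

#include <stdio.h>

int fib_count(int n) {
    if (n < 2) return 1;
    return fib_count(n - 1) + fib_count(n - 2);
}

int main(void) {
    /* Consecutive ratios approach phi ~= 1.618, so the work is exponential in n. */
    for (int n = 10; n <= 30; n += 5)
        printf("n = %2d  fib_count = %8d  ratio = %.4f\n",
               n, fib_count(n), (double)fib_count(n) / fib_count(n - 1));
    return 0;
}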
Ok, let's look at something more interesting, say simple (and inefficient) mergesort:
void mergesort(Array a, int from, int to) {
if (from >= to - 1) return;
int m = (from + to) / 2;
/* Recursively sort halves */
mergesort(a, from, m);
mergesort(a, m, to);
/* Then merge */
Array b = new Array(to - from);
int i = from;
int j = m;
int ptr = 0;
while (i < m || j < to) {
if (i == m || (j < to && a[j] < a[i])) {
b[ptr] = a[j++];
} else {
b[ptr] = a[i++];
}
ptr++;
}
for (i = from; i < to; i++)
a[i] = b[i - from];
}
Because we are not interested in the actual result but the complexity, we change the routine so that it actually returns the number of units of work carried out:
int mergesort(Array a, int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
/* Recursively sort halves */
int count = 0;
count += mergesort(a, from, m);
count += mergesort(a, m, to);
/* Then merge */
Array b = new Array(to - from);
int i = from;
int j = m;
int ptr = 0;
while (i < m || j < to) {
if (i == m || (j < to && a[j] < a[i])) {
b[ptr] = a[j++];
} else {
b[ptr] = a[i++];
}
ptr++;
count++;
}
for (i = from; i < to; i++) {
count++;
a[i] = b[i - from];
}
return count;
}
Then we remove those lines that do not actually impact the counts and simplify:
int mergesort(Array a, int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
/* Recursively sort halves */
int count = 0;
count += mergesort(a, from, m);
count += mergesort(a, m, to);
/* Then merge */
count += to - from;
/* Copy the array */
count += to - from;
return count;
}
Still simplifying a bit:
int mergesort(Array a, int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
int count = 0;
count += mergesort(a, from, m);
count += mergesort(a, m, to);
count += (to - from) * 2;
return count;
}
We can now actually dispense with the array:
int mergesort(int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
int count = 0;
count += mergesort(from, m);
count += mergesort(m, to);
count += (to - from) * 2;
return count;
}
We can now see that actually the absolute values of from and to do not matter any more, but only their distance, so we modify this to:
int mergesort(int d) {
if (d <= 1) return 1;
int count = 0;
count += mergesort(d / 2);
count += mergesort(d / 2);
count += d * 2;
return count;
}
And then we get to:
int mergesort(int d) {
if (d <= 1) return 1;
return 2 * mergesort(d / 2) + d * 2;
}
Here obviously d on the first call is the size of the array to be sorted, so you have the recurrence for the complexity M(x) (this is in plain sight on the second line :)
M(x) = 2(M(x/2) + x)
and this you need to solve in order to get a closed-form solution. The easiest way is to guess the solution M(x) = x log x and check it against the right side:
2 (x/2 log(x/2) + x)
= x log(x/2) + 2x
= x (log x - log 2 + 2)
= x (log x + C), where C = 2 - log 2 is a constant
and verify that this is asymptotically equivalent to the left side:
x log x + Cx
------------ = 1 + [Cx / (x log x)] = 1 + [C / log x] --> 1 + 0 = 1.
  x log x
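As a quick check (again a sketch of my own), you can evaluate the simplified count function at powers of two and watch count / (d · log2 d) settle toward a constant, which is what M(x) = Θ(x log x) predicts:

#include <stdio.h>
#include <math.h>

/* The simplified count function from above (renamed), assuming d is a power of two. */
long long mergesort_count(long long d) {
    if (d <= 1) return 1;
    return 2 * mergesort_count(d / 2) + d * 2;
}

int main(void) {
    for (long long d = 2; d <= (1LL << 20); d *= 2) {
        long long c = mergesort_count(d);
        printf("d = %8lld  count = %10lld  count/(d*log2(d)) = %.3f\n",
               d, c, (double)c / (d * log2((double)d)));
    }
    return 0;
}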
Even though this is an overgeneralization, I like to think of Big-O in terms of lists, where the length of the list is N items.
Thus, if you have a for-loop that iterates over everything in the list, it is O(N). In your code, you have one line that (in isolation all by itself) is O(N).
for (int i = 0; i < n; i++) {
If you have a for loop nested inside another for loop, and you perform an operation on each item in the list that requires you to look at every item in the list, then you are doing an operation N times for each of N items, thus O(N^2). In your example above you do, in fact, have another for loop nested inside your for loop. So you can think about it as if each for loop is O(N), and then because they are nested, multiply them together for a total value of O(N^2).
Conversely, if you are just doing a quick operation on a single item then that would be O(1). There is no 'list of length n' to go over, just a single one-time operation. To put this in context, in your example above, the operation:
if (i % 2 ==0)
is O(1). What is important isn't the 'if', but the fact that checking to see if a single item is equal to another item is a quick operation on a single item. Like before, the if statement is nested inside your external for loop. However, because it is O(1), you are multiplying everything by '1', and so there is no 'noticeable' effect in your final calculation for the run time of the entire function.
For logs, and dealing with more complex situations (like this business of counting up to j or i, and not just n again), I would point you towards a more elegant explanation here.
I like to use two things for Big-O notation: standard Big-O, which is worst case scenario, and average Big-O, which is what normally ends up happening. It also helps me to remember that Big-O notation is trying to approximate run-time as a function of N, the number of inputs.
The TA explained this with something like combinations. Like this is n choose 2 = (n(n-1))/2 = 0.5n^2 - 0.5n, then remove the constants so it becomes n^2. I can put in test values and try, but how does this combination thing come in?
As I said, normal big-O is worst case scenario. You can try to count the number of times that each line gets executed, but it is simpler to just look at the first example and say that there are two loops over the length of n, one embedded in the other, so it is n * n. If they were one after another, it'd be n + n, equaling 2n. Since it's an approximation, you just say n or linear.
What if theres an if statement? How is the complexity determined?
This is where, for me, having average case and best case helps a lot for organizing my thoughts. In the worst case, you ignore the if and say n^2. In the average case, for your example, you have a loop over n, with another loop over part of n that happens half of the time. This gives you n * (n/x) / 2 (where x is whatever fraction of n gets looped over in your embedded loops). This gives you n^2/(2x), so you'd get n^2 just the same. This is because it's an approximation.
I know this isn't a complete answer to your question, but hopefully it sheds some kind of light on approximating complexities in code.
As has been said in the answers above mine, it is clearly not possible to determine this for all snippets of code; I just wanted to add the idea of using average case Big-O to the discussion.
For the first snippet, it's just n^2 because you perform n operations n times. If j was initialized to i, or went up to i, the explanation you posted would be more appropriate but as it stands it is not.
For the second snippet, you can easily see that half of the time the first one will be executed, and the second will be executed the other half of the time. Depending on what's in there (hopefully it's dependent on n), you can rewrite the equation as a recursive one.
The recursive equations (including the third snippet) can be written as such: the third one would appear as
T(n) = T(n-1) + 1
Which we can easily see is O(n).
Big-O is just an approximation, it doesn't say how long an algorithm takes to execute, it just says something about how much longer it takes when the size of its input grows.
So if the input is size N and the algorithm evaluates an expression of constant complexity: O(1) N times, the complexity of the algorithm is linear: O(N). If the expression has linear complexity, the algorithm has quadratic complexity: O(N*N).
Some expressions have exponential complexity: O(N^N), or logarithmic complexity: O(log N). For an algorithm with loops and recursion, multiply the complexities of each level of loop and/or recursion. In terms of complexity, looping and recursion are equivalent. For an algorithm that has different complexities at different stages, choose the highest complexity and ignore the rest. And finally, all constant complexities are considered equivalent: O(5) is the same as O(1), and O(5*N) is the same as O(N).
