I have the sorting code below, which is bubble sort, but I think it is not exactly O(N^2). What is its time complexity in Big O terms? My guess is that it is O(N log N).
The code is given only as an example; I am not claiming it compiles as is.
for(i = 0; i < n-1; i++)
{
    for(j = 0; j < n-i-1; j++)
    {
        if (a[j+1] < a[j])
        {
            temp = a[j];
            a[j] = a[j+1];
            a[j+1] = temp;
        }
    }
}
My guess is that it is O(N log N).
Why guess? Look at what's actually happening...
The first time through the outer loop, i == 0. That means that j will range from 0 to n-2, which is n-1 iterations.
The second time through, i == 1, so j will range from 0 to n-3.
The third time through, i == 2, so j ranges from 0 to n-4.
...
The last time through, i == n-2, so j ranges from 0 to 0.
So, the total number of comparisons is (n-1) + (n-2) + (n-3) + ... + 1.
What's the sum ∑i, i=0..n-1? Now convert that to a big-O bound.
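To check whatever closed form you derive, here is a minimal sketch that runs the loops from the question and counts how often the comparison executes. The function name `bubble_sort_comparisons` is made up for this illustration, and the loop bounds are rewritten as `i + 1 < n` and `j + i + 1 < n` only to avoid unsigned underflow; they are equivalent to the original bounds.

```c
#include <stddef.h>

/* Sorts a[0..n-1] with the bubble sort from the question and returns how
   many times the comparison a[j+1] < a[j] was evaluated. */
unsigned long bubble_sort_comparisons(int *a, size_t n)
{
    unsigned long comparisons = 0;
    for (size_t i = 0; i + 1 < n; i++) {            /* same as i < n-1 */
        for (size_t j = 0; j + i + 1 < n; j++) {    /* same as j < n-i-1 */
            comparisons++;
            if (a[j + 1] < a[j]) {
                int temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
    return comparisons;
}
```

For an array of 5 elements this returns 4 + 3 + 2 + 1 = 10, regardless of the initial order, because the loops never exit early.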
Find a tight upper bound on the complexity of this program.
I tried. I think the time complexity of this code is O(n^2).
void function(int n)
{
    int count = 0;
    for (int i=0; i<n; i++)
        for (int j=i; j< i*i; j++)
            if (j%i == 0)
            {
                for (int k=0; k<j; k++)
                    printf("*");
            }
}
But the answer given is O(n^5). How?
EDIT: Yikes, after spending the time to answer this question, I discovered that this is a duplicate of a previous question that I had also answered three years ago. Oops!
The tightest bound you can get on this function's runtime is Θ(n^4). Here's the derivation.
This is a good place to illustrate a useful general strategy for determining the big-O of a piece of code:
"When in doubt, work inside out!"
Let's take your code:
for (int i=0; i<n; i++)
{
    for (int j=i; j< i*i; j++)
    {
        if (j%i == 0)
        {
            for (int k=0; k<j; k++)
            {
                printf("*");
            }
        }
    }
}
Our approach for analyzing the runtime complexity will be to repeatedly take the innermost loop and replace it with the amount of work that it does. When we're done, we'll have our final time complexity.
Let's begin with this innermost loop:
for (int k=0; k<j; k++)
{
    printf("*");
}
The amount of work done here is Θ(j), since the number of loop iterations is directly proportional to j and we do a constant amount of work per loop iteration. So let's replace this loop with the simpler "do Θ(j) work," giving us this simplified loop nest:
for (int i=0; i<n; i++)
{
    for (int j=i; j< i*i; j++)
    {
        if (j%i == 0)
        {
            do Θ(j) work
        }
    }
}
Now, let's take aim at what's now the innermost loop:
for (int j=i; j < i*i; j++)
{
    if (j%i == 0)
    {
        do Θ(j) work
    }
}
This loop is unusual in that the amount of work that it does varies pretty dramatically from one iteration to the next. Specifically:
most iterations will do only O(1) work, but
one out of every i iterations will do Θ(j) work.
To analyze this loop, we'll therefore split the work apart into these two constituent pieces and see how much each contributes to the total.
First, let's look at the "easy" iterations of the loop, which do only O(1) work. There are a total of Θ(i^2) iterations of the loop (the loop starts counting at j = i and stops when j = i^2, and i^2 - i = Θ(i^2)). We can therefore bound the contribution of these "easy" loop iterations at O(i^2) work.
Now, what about the "hard" loop iterations? These iterations occur when j = i, when j = 2i, when j = 3i, j = 4i, etc. Moreover, each of these "hard" iterations does work directly proportional to j. This means that, if we add up the work across all of these iterations, the total work done is given by
i + 2i + 3i + 4i + ... + (i - 1)i.
We can simplify this as follows:
i + 2i + 3i + 4i + ... + (i - 1)i
= i(1 + 2 + 3 + ... + (i - 1))
= i · Θ(i^2)
= Θ(i^3).
This uses the fact that 1 + 2 + 3 + ... + k = k(k + 1) / 2 = Θ(k^2), which is Gauss's famous sum.
So now we see that the work done by the inner loop here is given by
O(i^2) work for the easy iterations, and
Θ(i^3) work for the hard iterations.
Summing this up, we see that the total work done by this inner loop is Θ(i^3). Continuing our process of working inside out, we can replace this inner loop with "do Θ(i^3) work" to get the following:
for (int i=0; i<n; i++)
{
    do Θ(i^3) work
}
From here, we see that the work done is
1^3 + 2^3 + 3^3 + ... + (n - 1)^3,
and that sum is Θ(n^4). (Specifically, it's n^2(n - 1)^2 / 4.)
So overall, the theory predicts that the runtime should be Θ(n^4), which is a factor of n lower than the O(n^5) bound you mentioned above. How does the theory match the practice?
I ran this code on a variety of values of n and counted how many times a star was printed. Here are the values I got back:
n = 500: 7760510375
n = 1000: 124583708250
n = 1500: 631407093625
n = 2000: 1996668166500
n = 2500: 4876304426875
n = 3000: 10113753374750
n = 3500: 18739952510125
n = 4000: 31973339333000
n = 4500: 51219851343375
If the runtime is Θ(n^4), then if we double the size of the input, we should scale the output by a factor of 16. If the runtime is Θ(n^5), then doubling the input size should scale the output by a factor of 32. Here's what I found:
Ratio of n = 1000 to n = 500: 16.0535
Ratio of n = 2000 to n = 1000: 16.0267
Ratio of n = 3000 to n = 1500: 16.0178
Ratio of n = 4000 to n = 2000: 16.0133
This strongly suggests that the runtime of this function is indeed Θ(n^4) rather than Θ(n^5).
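The counting program itself is not shown above, so the following is only a sketch of one way to reproduce the numbers. The name `count_stars` is hypothetical, and instead of actually running the k loop it adds j to the counter, which is exactly the number of stars that loop would print.

```c
/* Returns how many stars function(n) from the question would print.
   The k loop prints exactly j stars, so its effect is added in one step
   rather than being executed star by star. */
unsigned long long count_stars(int n)
{
    unsigned long long stars = 0;
    for (long long i = 0; i < n; i++) {
        /* For i == 0 and i == 1 this loop body never runs, so j % i
           is never evaluated with i == 0. */
        for (long long j = i; j < i * i; j++) {
            if (j % i == 0) {
                stars += (unsigned long long)j;
            }
        }
    }
    return stars;
}
```

Calling `count_stars(500)` returns 7760510375, matching the first row of the table. The j loop alone still takes Θ(n^3) steps, so large n remains slow even with the shortcut.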
Hope this helps!
I agree with the other poster, except the innermost loop is O(n^2), as k spans from 0 to j, which itself spans up to n^2. This gives us the desired answer of O(n^5).
The first loop
for (int i=0; i<n; i++)
Gives your first O(n) multiplier.
Next, the second loop
for (int j=i; j< i*i; j++)
Itself multiplies by O(n^2) complexity because i is "like" n here.
The third loop
for (int k=0; k<j; k++)
Multiplies by another O(n^2) because j is "like" n^2 here.
So you get complexity O(n^5).
I'm having a hard time trying to figure out how to calculate the time complexity of some code. I know the basics of Big O, although I can't fully understand how to calculate it in general.
Here is an example of something I couldn't solve. Hopefully you can:
void f(int n) {
    int j, s;
    for (j = 0, s = 1; s < n; j++, s*=2)
        printf("!");
    double values[j];
    for (int k = 0; k < j; k++)
        values[k] = 0;
    while (j--)
        for (int k = 1; k < j; k++)
            values[k] += 1.0 / k;
}
What's the run time? I would love an explanation :)
The first loop iterates about log2(n) times (exactly ceil(log2(n)) for n >= 1) and leaves that count in j. Complexity O(log(n)).
The second loop initializes an array of size j: time and space complexity O(log(n)).
The third loop is a nested loop: the outer while runs j times, and because j is decremented before the inner loop starts, the inner loop runs j-2, j-3, ..., 1, 0 times, for a total of (j - 1)(j - 2) / 2 iterations when j >= 2. The time complexity of this is O(log(n)^2), and it dominates the previous phases.
The overall time complexity of this function is O(log(n)^2), while the space complexity is O(log(n)).
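As a rough check of the count for the third phase, the sketch below keeps the loop structure of f but only counts how often the innermost statement would run. The name `inner_updates` is invented for this example, and it assumes n is at most 2^30 so that `s *= 2` does not overflow an int.

```c
/* Returns how many times "values[k] += 1.0 / k" would execute in f(n).
   Assumes n <= 2^30 so that s *= 2 cannot overflow. */
unsigned long inner_updates(int n)
{
    int j, s;
    unsigned long updates = 0;
    for (j = 0, s = 1; s < n; j++, s *= 2)
        ;   /* j ends up as the number of doublings needed to reach n */
    while (j--)
        for (int k = 1; k < j; k++)
            updates++;
    return updates;
}
```

For n = 1024 the first loop leaves j = 10, and the function returns 8 + 7 + ... + 1 + 0 = 36, which grows quadratically in j = log2(n).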
I have a very very huge array of positive integer numbers which represent how far a certain train station is away from the center, for example:
S = {10, 200, 1000, 1500, 2019, 2200}
Train station S[0] is 10 miles away from the center. S is always sorted in ascending order, at any point in time, also before the algorithm starts. Just simply always.
I want to find a function which checks if there exist two train stations with a distance of exactly N miles.
For example:
N = 1300 would give me true because 1500 - 200 = 1300.
First approach
Iterate over S and check for each element if the distance to another element is N. This gives me two loops and I guess O(n^2). I don't want O(n^2) because the array can be so huge it needs better performance.
Other approaches
I did a lot of research but all I found was that O(n) is possible. I want to have this time complexity. My solution looks like this, but unfortunately it does not work out at all.
int a[] = {10, 200, 1000, 1500, 2019, 2200};
int size = 6;
int left = 0;
int right = size - 1;
int x, y, distance, tempdis;
int N = 1300;
while(left < right)
{
    x = a[left];
    y = a[right];
    tempdis = x - y;
    distance = tempdis < 0 ? tempdis*(-1) : tempdis;
    if(distance == N)
    {
        printf("found pair: %d %d\n", left, right);
        break;
    }
    if(distance > N)
        left++;
    if(distance < N)
        right--;
}
You can achieve linear time (O(n)) by only incrementing two pointers, i and j. You want to find i and j such that a[j] - a[i] == N. The logic is simple:
if a[j] - a[i] < N: increment j (distance gets larger)
if a[j] - a[i] > N: increment i (distance gets smaller)
That's all! In code:
int i = 0;
int j = 0;
// size is the length of a; test j < size first so a[j] is never read out of bounds
while (j < size && a[j] - a[i] != N) {
    if (a[j] - a[i] < N) {
        j++;
    } else {
        i++;
    }
}
if (j < size) {
    printf("found pair: %d %d\n", i, j);
}
Handwaving proof of correctness: in principle, we should check each a[j] against all a[i] that could potentially give a solution. That is, for each j, we check a range p_j <= i <= q_j, such that a[j] - a[p_j] > N and a[j] - a[q_j] < N. If there is a solution involving j, it must be found in that range of i values.
Now, this algorithm almost does that, with one exception: sometimes we increment j multiple times in a row, so we clearly did not check it against a whole range. We increment j again, because a[j] - a[i] < N. However, if that happens, we also know that a[j] - a[i-1] > N. I leave it up to you to verify this.
This means that we check j against the range of all i values that can potentially give a solution. And thus the result is correct.
We have two pointers. In each step, one pointer is incremented. The larger of the two (j) is bounded by n and i never passes j, so there are at most 2n steps and this runs in O(n) time.
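Packaged as a function, the two-pointer idea might look like the sketch below. The name `has_pair_with_distance` and its signature are assumptions made for this example; it also assumes N > 0 and that the array is sorted in ascending order, as stated in the question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the ascending array a[0..size-1] contains two entries
   whose difference is exactly N. Assumes N > 0. */
bool has_pair_with_distance(const int *a, size_t size, int N)
{
    size_t i = 0;   /* index of the smaller value */
    size_t j = 0;   /* index of the larger value */
    while (j < size) {
        int diff = a[j] - a[i];
        if (diff == N) {
            return true;
        } else if (diff < N) {
            j++;    /* gap too small: move the upper pointer */
        } else {
            i++;    /* gap too large: move the lower pointer */
        }
    }
    return false;
}
```

With S = {10, 200, 1000, 1500, 2019, 2200} and N = 1300 this returns true (1500 - 200). When i catches up with j the difference is 0, which is less than N, so j advances and i never overtakes it.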
sum should be changed to distance
And it should be:
if(distance < N) {
    right--;
    left--;
}
Not just
if(distance < N)
    right--;
There is an error in this block of code:
if (distance < N)
    right--;
You also want to decrement the value of left here:
if (distance < N)
{
    right--;
    left--;
}
Also, the variable sum on line 12 should be distance.
I am trying to figure out the complexity of a for loop using Big O notation. I have done this before in my other classes, but this one is more rigorous than the others because it is on the actual algorithm. The code is as follows:
for(i=n ; i>1 ; i/=2) //for any size n
{
    for(j = 1; j < i; j++)
    {
        x+=a;
    }
}
and
for(i=1 ; i<=n;i++,x=1) //for any size n
{
    for(j = 1; j <= i; j++)
    {
        for(k = 1; k <= j; x+=a,k*=a)
        {
        }
    }
}
I have concluded that the first loop is of O(n) complexity because it is going through the list n times. As for the second loop I am a little lost!
Thank you for the help in the analysis. Each loop is in its own space, they are not together.
Consider the first code fragment,
for(i=n ; i>1 ; i/=2) //for any size n
{
    for(j = 1; j < i; j++)
    {
        x+=a;
    }
}
The instruction x+=a is executed roughly n + n/2 + n/4 + ... + 1 times (the inner loop runs i - 1 times for each value of i).
The sum of the first log2(n) terms of a geometric progression with starting term n and common ratio 1/2 is n(1 - (1/2)^(log2 n)) / (1/2) = 2(n - 1). Thus the complexity of the first code fragment is O(n).
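To see the linear bound numerically, here is a small sketch that counts how many times x+=a runs in the first fragment. The name `fragment1_steps` is made up for this illustration.

```c
/* Returns how many times "x += a" executes in the first fragment. */
unsigned long fragment1_steps(unsigned long n)
{
    unsigned long steps = 0;
    for (unsigned long i = n; i > 1; i /= 2)
        for (unsigned long j = 1; j < i; j++)
            steps++;
    return steps;
}
```

For n = 16 it returns 15 + 7 + 3 + 1 = 26, and for n = 1000 it returns 1984. Both stay below 2n, in line with the geometric series.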
Now consider the second code fragment,
for(i=1 ; i<=n; i++,x=1)
{
    for(j = 1; j <= i; j++)
    {
        for(k = 1; k <= j; x+=a,k*=a)
        {
        }
    }
}
The two outer loops together call the innermost loop a total of n(n+1)/2 times. The innermost loop is executed at most log_a(n) + 1 times per call (assuming a >= 2). Thus the total time complexity of the second code fragment is O(n^2 log_a n).
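The same kind of counting sketch works for the second fragment. The name `fragment2_steps` is hypothetical, and the guard for a < 2 is an addition of this example: with a = 0 or a = 1 the original innermost loop would never terminate.

```c
/* Returns how many times the innermost loop body of the second fragment
   runs for a given n and multiplier a. Returns 0 when a < 2, since the
   original loop would not terminate in that case. */
unsigned long fragment2_steps(unsigned long n, unsigned long a)
{
    unsigned long steps = 0;
    if (a < 2)
        return 0;
    for (unsigned long i = 1; i <= n; i++)
        for (unsigned long j = 1; j <= i; j++)
            for (unsigned long k = 1; k <= j; k *= a)
                steps++;
    return steps;
}
```

For n = 4 and a = 2 the innermost loop runs 1, 2, 2 and 3 times for j = 1, 2, 3 and 4, which adds up to 1 + 3 + 5 + 8 = 17 over the four values of i.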
You may formally proceed like the following:
Fragment 1: [formula not preserved]
Fragment 2 (Pochhammer, G-Function, and Stirling's Approximation): [formula not preserved]
With log(G(n)).
[UPDATE of Fragment 2]:
With some enhancements from the publication "Discrete Loops and Worst Case Performance" by Dr. Johann Blieberger (all cases verified for a = 2): [formula not preserved]
Where: [formula not preserved]
Therefore, [formula not preserved]
EDIT: I agree the first code block is O( n )
You decrement the outer loop variable i by dividing by 2, and in the inner loop you run i times, so the number of iterations will be a sum over all the powers of two less than or equal to n but greater than 0, which is 2^(log2(n)+1) - 1 = 2n - 1 when n is a power of two, so O(n).
The second code block is O(n^2 log_a(n)), assuming a is a constant.
The two outermost loops equate to a sum of all the numbers less than or equal to n, which is n(n+1)/2, so O(n^2). Finally the inner loop runs over the powers of a up to an upper bound of n, which is O(log_a(n)).
I'm preparing for an exam and these are some of problems from last year's tests. The task is to calculate both exact and asymptotic complexity. How would you solve it? Universally, if possible.
for ( i = j = 0; i < n; j ++ ) {
    doSomething ();
    i += j / n;
    j %= n;
}
for ( i = 0; i < 2 * n; i += 2 )
    for ( j = 1; j <= n; j <<= 1 )
        if ( j & i )
            doSomething ();
for (i = 0; i < 2*n; i++) {
    if ( i > n )
        for (j = i; j < 2 * i; j ++ ) doSomething();
    else
        for (j = n; j < 2 * n; j ++ ) doSomething();
}
Thanks in advance
My solution for the third loop is
t(n) = [ (n-1)*n + ((n-1)*n)/2 ] *D + [ n^2 +n ] *D + [ 2n ]*I
so it is in O(n^2) given that doSomething() has a constant time
and that i and j are integers.
The second term ( [ n^2 +n ] *D ) is fairly easy.
The loop
for (j = n; j < 2 * n; j ++ ) doSomething();
gets called while i <= n so it will be called n+1 times, since it starts from 0.
The loop for (j = n; j < 2 * n; j ++ ) calls doSomething() n times, so we have (n+1)*n*D = [n^2+n] *D. I assume that doSomething() has a constant time which is equal to D
The first term is a little bit more complex.
for (j = i; j < 2 * i; j ++ ) doSomething();
gets called when i>n so it will be called n-1 times.
The loop calls doSomething() i times.
The first time it runs n+1 iterations, the second time n+2, and so on until 2n-1, which is equal to n + (n-1).
So we get a sequence like this: {n+1, n+2, n+3, ... , n+(n-1)}.
If we sum up the sequence we get n-1 times n plus the sum 1+2+3+...+ (n-1).
The last sum can be solved with the Gauss summation formula (in German, the "Gaußsche Summenformel"), so it is equal to ((n-1)*n)/2.
So the first term is [ (n-1)*n + ((n-1)*n)/2 ] *D
And the last term is therefore the if statement, which is evaluated 2*n times and contributes 2*n*I, where I is the time to execute the if statement.
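The D part of this formula can be checked by counting calls directly. The sketch below is only an illustration; the name `third_loop_calls` is made up, and it counts calls to doSomething() rather than measuring time.

```c
/* Returns how many times doSomething() would be called by the third loop. */
unsigned long long third_loop_calls(unsigned long long n)
{
    unsigned long long calls = 0;
    for (unsigned long long i = 0; i < 2 * n; i++) {
        if (i > n) {
            for (unsigned long long j = i; j < 2 * i; j++)
                calls++;    /* i calls on this branch */
        } else {
            for (unsigned long long j = n; j < 2 * n; j++)
                calls++;    /* n calls on this branch */
        }
    }
    return calls;
}
```

For n = 10 the function returns 245, which equals (n-1)*n + ((n-1)*n)/2 + n^2 + n = 90 + 45 + 110.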
Well, the question here is, for all three loop structures, how the number of iterations changes in proportion to n, right? So let's look at the loops. I'll omit the first one, since you solved it already.
for ( i = 0; i < 2 * n; i += 2 )
    for ( j = 1; j <= n; j <<= 1 )
        if ( j & i )
            doSomething ();
The outer for loop obviously runs exactly n times. The inner loop runs floor(log_2(n)) + 1 times, because j doubles on every bitwise shift. The if clause runs in constant time, so the entire loop is in O(n * log_2(n)), assuming that doSomething() is in constant time as well.
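Since the task also asks for exact counts, a small counting sketch can help separate the two quantities involved: how often the if condition is tested, and how often doSomething() actually runs. The name `second_loop_counts` and its output parameters are assumptions of this example, and n is assumed small enough that `j <<= 1` does not overflow.

```c
/* For the second loop, stores in *tests how often "j & i" is evaluated
   and in *calls how often doSomething() would be called. */
void second_loop_counts(unsigned long n, unsigned long *tests,
                        unsigned long *calls)
{
    *tests = 0;
    *calls = 0;
    for (unsigned long i = 0; i < 2 * n; i += 2) {
        for (unsigned long j = 1; j <= n; j <<= 1) {
            (*tests)++;
            if (j & i)
                (*calls)++;
        }
    }
}
```

For n = 4 the condition is tested 4 * 3 = 12 times, while doSomething() runs only 4 times (once for i = 2, once for i = 4, twice for i = 6). The O(n * log_2(n)) bound describes the number of tests.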
Does that make it clearer? :)
As per request, I will explain how I came to the result that the first loop is equal to a construction like this:
int i, j;
for (i=0; i < n; i++) {
    for (j=0; j <= n; j++) {
        doSomething();
    }
}
First of all, I must admit that before I really thought about it, I just wrote a little sample program including the first of the three for-loops that prints out i and j during the iteration. After I saw the results, I thought about why they look the way they do.
In the comment, I forgot to add that I defined n=200.
Explanation
We can say, that although j is incremented regularly every step in the iteration, it will never exceed a value of n. Why? After n iterations, j==n. It will be set to 0 in the statement j %= n after i has been incremented. In the statement i += j / n, i will be incremented by 0 n-1 times, and at the nth time, it will be incremented by 1. This starts all over again until i >= n.
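A counting version of the first loop makes this behaviour easy to check. This is a sketch with a made-up name, `first_loop_calls`, that replaces doSomething() with a counter and leaves the loop otherwise unchanged.

```c
/* Returns how many times doSomething() would be called by the first loop.
   For n == 0 the loop body never runs, so j / n is never evaluated. */
unsigned long first_loop_calls(unsigned long n)
{
    unsigned long calls = 0;
    unsigned long i, j;
    for (i = j = 0; i < n; j++) {
        calls++;        /* stands in for doSomething() */
        i += j / n;     /* adds 1 only when j has reached n */
        j %= n;         /* wraps j back to 0 at that point */
    }
    return calls;
}
```

The first pass needs n + 1 body executions (j runs from 0 to n), and each later pass needs n (j restarts at 1 after the wrap and the increment), so the count comes out as n^2 + 1. For n = 200 that is 40001 calls, slightly fewer than the n(n + 1) of the equivalent nested construction but the same O(n^2).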