This is a question from one of the old exams from algorithms and data structure that I recently came upon. I'm having a hard time understanding the solution.
I need to find big-O, big-ϴ and big-Ω bounds of a function:
void recursion(int n) {
    int i;
    if (n == 0) {
        return;
    }
    for (i = 0; i < n; i++) {
        recursion(i);
    }
}
The solution is 2^n for all three and I can't understand why. I've tried writing things down and I can't even get close to the solution. I would appreciate if anyone would explain where the 2^n comes from here.
Let's look at a simpler recursion which is known to be O(2^n)
int fib(int n) {
    if (n < 3) {
        return 1;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}
Here you can see that, for the non-trivial case of n > 2, this results in on the order of 2^(n-2) calls to itself. For example, if n = 5:
n = 5
  n = 4
    n = 3
      n = 2
      n = 1
    n = 2
  n = 3
    n = 2
    n = 1
There are 8 (2^3) recursive calls, because each call with n > 2 spawns two more recursive calls, so fib(n+1) has at most twice as many recursive calls as fib(n), which is where the O(2^n) bound comes from.
So for your example:
n = 3
  n = 2
    n = 1
      n = 0
    n = 0
  n = 1
    n = 0
  n = 0
So we get 7 recursive calls when n = 3. For n = 4:
n = 4
  n = 3
    n = 2
      n = 1
        n = 0
      n = 0
    n = 1
      n = 0
    n = 0
  n = 2
    n = 1
      n = 0
    n = 0
  n = 1
    n = 0
  n = 0
Here, we have 15 calls. Looking at the execution tree, you can see that recursion(4) is basically recursion(3) + recursion(3) + 1:
n = 4
  n = 3            // + 1
    n = 2          //
      n = 1        //
        n = 0      //
      n = 0        //  recursion(3)
    n = 1          //
      n = 0        //
    n = 0          //
  n = 2            //
    n = 1          //
      n = 0        //  recursion(3)
    n = 0          //
  n = 1            //
    n = 0          //
  n = 0            //
So in general, recursion(n + 1) makes one more recursive call than 2 * recursion(n), which is basically a doubling for every +1 added to n, which is O(2^n).
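If you want to verify this empirically, here is a small instrumented sketch of my own (not part of the exam) that counts every call, including the outermost one, so recursion(3) reports 8 and recursion(4) reports 16:

#include <stdio.h>

static long calls;                 /* total calls to recursion, including the top-level one */

static void recursion(int n) {
    calls++;
    if (n == 0) {
        return;
    }
    for (int i = 0; i < n; i++) {
        recursion(i);
    }
}

int main(void) {
    for (int n = 0; n <= 20; n++) {
        calls = 0;
        recursion(n);
        printf("n = %2d  calls = %ld\n", n, calls);   /* prints exactly 2^n */
    }
    return 0;
}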
Let's denote the total runtime as f(n). Due to the loop in the function, f(n) is a sum of f(i) for i between 0 and n-1, i.e. a sum of n terms. Let's try to simplify that expression. A standard trick in such situations is to find a complementary equation. Let's see what the value of f(n-1) is. Similarly to the previous case, it's a sum of f(i) for i between 0 and n-2. So now we have two equations:
f(n)   = f(0) + f(1) + ... + f(n-1)
f(n-1) = f(0) + f(1) + ... + f(n-2)
Let's subtract the second from the first:
f(n)-f(n-1)=f(n-1)
--> f(n)=2f(n-1)
Now this is a homogeneous linear recurrence relation with constant coefficients.
The solution is immediate (see the link for more details):
f(n) = f(1) * 2^(n-1), which is Θ(2^n)
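If you prefer exact counts, the same subtraction trick works with the +1 for the call itself kept in (my own bookkeeping; T(n) is the total number of calls, including the top-level one):

\[
T(0) = 1, \qquad T(n) = 1 + \sum_{i=0}^{n-1} T(i)
\;\Rightarrow\; T(n) - T(n-1) = T(n-1)
\;\Rightarrow\; T(n) = 2\,T(n-1) = 2^{n},
\]

which matches the hand counts in the first answer once the top-level call is included: T(3) = 7 + 1 = 8 and T(4) = 15 + 1 = 16.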
Since this smells like a homework question, this answer is incomplete by design.
The usual trick behind these kinds of problems is to create a recurrence equation. That is, the time complexity of recursion(k+1) is somehow related to the complexity of recursion(k). Just writing down the recurrence itself is not sufficient to prove the complexity; you have to demonstrate why the recurrence is true. But, for 2^n, this suggests that recursion(k+1) takes twice as long as recursion(k).
Let T(k) denote the time complexity of recursion(k). Since recursion(0) returns immediately, let T(0) = 1. For k > 0, look at what the loop inside recursion does and express T(k) in terms of the smaller values. You can then inductively prove that T(k) = 2^k.
r(n) = r(n-1)+r(n-2)+...+r(0) // n calls.
r(n-1) = r(n-2)+r(n-3)+...+r(0) // n-1 calls.
r(n-2) = r(n-3)+r(n-4)+...+r(0) // n-2 calls.
.
.
.
r(1) = r(0) // 1 call.
r(0) = return; // 0 calls.
So,
r(n) = r(n-1)+r(n-2)+...+r(0) // n calls.
= 2 * (r(n-2)+...+r(0)) // 2 * (n - 1) calls.
= 2 * ( 2 * (r(n-3)+...+r(0)) ) // 2 * 2 * (n - 2) calls.
.
.
.
Continuing the same substitution until only r(0) is left, it follows that we end up with
2^(n-1) * (n - (n-1)) = 2^(n-1)
calls, and that is on the order of
2^n calls...
Related
I have a test in computer science about complexity, and I have this question:
int counter = 0;
for (int i = 2; i < n; ++i) {
    for (int j = 1; j < n; j = j * i) {
        counter++;
    }
}
My solution is O(n log n), because the outer loop runs n-2 times and the inner loop does roughly log base i of n iterations, so that gives (n-2) * log n, which is O(n log n).
But my teacher told us it's O(n), and when I ran it in CLion the counter came out to about 2*n, which is O(n). Can someone explain why it is O(n)?
Empirically, you can see that this is correct: about 2n is around the right value for the sum of the series, both for n = 100 and for n = 1,000.
If you want more intuition, you can think about the fact that for nearly all of the series, i > sqrt(n).
For example, if n = 100 then 90% of the values have i > 10, and for n = 1,000, 97% have i > 32.
From that point onwards, all iterations of the outer loop will have at most 2 iterations in the inner loop (since log(n) with base sqrt(n) is 2, by definition).
If n grows really large, you can also apply the same logic to show that from the cube root to the square root, log is between 2 and 3, etc...
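A quick way to check those percentages yourself (a rough sketch of my own, not from the original answer): count how many values of i in [2, n) already satisfy i*i >= n, i.e. i >= sqrt(n), which are exactly the i for which the inner loop runs only twice (j = 1 and j = i).

#include <stdio.h>

int main(void) {
    int sizes[] = { 100, 1000, 10000, 100000 };
    for (int k = 0; k < 4; k++) {
        int n = sizes[k];
        int big = 0;                        /* i with i*i >= n: inner loop runs exactly twice */
        for (int i = 2; i < n; i++) {
            if ((long long)i * i >= n) {
                big++;
            }
        }
        printf("n = %6d: %.1f%% of the i values are >= sqrt(n)\n",
               n, 100.0 * big / (n - 2));
    }
    return 0;
}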
This would be O(n log n) if j were incremented by i each iteration, not multiplied by it. As it is now, the number of iterations of the j loop grows much more slowly than n, which is why your teacher and CLion report the time complexity as O(n).
Note that it's j=j*i, not j=j*2. That means that for most values of i, the inner loop runs only twice: once with j = 1 and once with j = i. For example, with n of 33, the inner loop runs exactly twice for every i in [6,33).
n = 33

j = 32
j = 16
j =  8 27
j =  4  9 16 25
j =  2  3  4  5  6  7  8  9 10 11 ... 31 32
j =  1  1  1  1  1  1  1  1  1  1 ...  1  1
--------------------------------------------
i =  2  3  4  5  6  7  8  9 10 11 ... 31 32
If you think of the above as a graph, the column for each i has roughly 1 + log(n)/log(i) entries, so the total looks like n plus log(n) times the area under 1/log(x). I have no idea how to prove that rigorously, and calculating that integral involves the unfamiliar-to-me logarithmic integral function. But the Wikipedia page does say that function is O( n / log n ), so multiplying back by the log(n) contributes about n, and the grand total is roughly 2n, i.e. O( n ).
Let's do it experimentally.
#include <stdio.h>
int main( void ) {
    for ( int n = 20; n <= 20000; ++n ) {
        int counter = 0;
        for ( int i = 2; i < n; ++i ) {
            for ( int j = 1; j < n ; j *= i ) {
                ++counter;
            }
        }
        if ( n % 1000 == 0 )
            printf( "%d: %.3f\n", n, (double)counter / (n-1) );
    }
}
1000: 2.047
2000: 2.033
3000: 2.027
4000: 2.023
5000: 2.021
6000: 2.019
7000: 2.017
8000: 2.016
9000: 2.015
10000: 2.014
11000: 2.013
12000: 2.013
13000: 2.012
14000: 2.012
15000: 2.011
16000: 2.011
17000: 2.011
18000: 2.010
19000: 2.010
20000: 2.010
So the count is about 2n plus a little extra, and the extra shrinks relative to n as n grows. So it's definitely not O( n log n ). The extra part is something of the form n / f(n), where f() produces some number >= 1 that grows with n. It looks like it could be n / log n, but that's pure speculation.
Whatever f(n) is, 2n + n / f(n) is still O( n ). So we can call this O( n ).
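To see how small that extra part is, here is a variant of the experiment (my own sketch) that separates the two guaranteed passes per i from everything beyond them:

#include <stdio.h>

int main( void ) {
    for ( int n = 1000; n <= 20000; n += 1000 ) {
        long counter = 0;
        for ( int i = 2; i < n; ++i ) {
            for ( long j = 1; j < n; j *= i ) {
                ++counter;
            }
        }
        long base  = 2L * (n - 2);      /* two passes (j = 1 and j = i) for every i */
        long extra = counter - base;    /* third and later passes, only possible when i*i < n */
        printf( "%5d: counter = %6ld  2*(n-2) = %6ld  extra = %4ld (%.4f of n)\n",
                n, counter, base, extra, (double)extra / n );
    }
}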
For some value of i, j will go like
1 i^1 i^2 i^3 ....
So the number of times the inner loop executes for that i is roughly
log_i(n)
which would lead to the following:
log_2(n) + log_3(n) + log_4(n) + ....
But... there is the stop condition j < n which needs to be considered.
Now consider n as a number that can be written as m^2. As soon as i reaches the value m, all remaining iterations of the outer loop will execute the inner loop only for j equal to 1 and j equal to i (because i^2 will be at least n). In other words, there will only be 2 executions of the inner loop.
So the total number of iterations will be:
2 * (m^2 - m) + number_of_iteration(i=2:m)
Now divide that by n which is m^2:
(2 * (m^2 - m) + number_of_iteration(i=2:m)) / m^2
gives
2 * (1 -1/m) + number_of_iteration(i=2:m) / m^2
The first part, 2 * (1 - 1/m), clearly goes towards 2 as m goes to infinity.
The second part is (at worst):
(log_2(n) + log_3(n) + log_4(n) + ... + log_m(n)) / m^2
or
(log_2(n) + log_3(n) + log_4(n) + ... + log_m(n)) / n
Each of the (fewer than m) terms in the numerator is at most log_2(n), so the whole expression is at most m * log_2(n) / n = log_2(n) / m = 2 * log_2(m) / m. As log(x)/x goes towards zero as x goes towards infinity, the above expression will also go towards zero.
So the full expression:
(2 * (m^2 - m) + number_of_iteration(i=2:m)) / m^2
will go towards 2 as m goes towards infinity.
In other words: The total number of iterations divided by n will go towards 2. Consequently we have O(n).
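A compact way to write the same count (my own notation, with n = m^2 and the j = 1 pass counted explicitly):

\[
\mathrm{total}(n)
= \underbrace{2\,(m^2 - m)}_{m \le i < n}
+ \underbrace{\sum_{i=2}^{m-1} \#\{\,k \ge 0 : i^k < n\,\}}_{2 \le i < m}
\;\le\; 2n + \sqrt{n}\,(1 + \log_2 n),
\]

since each term of the sum is at most 1 + log_2(n) and there are fewer than sqrt(n) of them. Dividing by n, the ratio goes to 2, which is the Θ(n) result again.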
I'm trying to figure out why the time complexity of this code is n^(2/3). The space complexity is log n, but I don't know how to continue the time complexity calculation (or whether what I have so far is right).
int g2 (int n, int m)
{
    if (m >= n)
    {
        for (int i = 0; i < n; ++i)
            printf("#");
        return 1;
    }
    return 1 + g2 (n / 2, 4 * m);
}

int main (int n)
{
    return g2 (n, 1);
}
As long as m < n, you perform an O(1) operation: making a recursive call. You halve n and quadruple m, so after k steps, you get
n(k) = n(0) * 0.5^k
m(k) = m(0) * 4^k
You can set them equal to each other to find that
n(0) / m(0) = 8^k
Taking the log
log(n(0)) - log(m(0)) = k log(8)
or
k = log_8(n(0)) - log_8(m(0))
On the kth recursion you perform n(k) loop iterations.
You can plug k back into n(k) = n(0) * 0.5^k to estimate the number of iterations. Let's ignore m(0) for now:
n(k) = n(0) * 0.5^log_8(n(0))
Taking again the log of both sides,
log_8(n(k)) = log_8(n(0)) + log_8(0.5) * log_8(n(0))
Since log_8(0.5) = -1/3, you get
log_8(n(k)) = log_8(n(0)) * (2/3)
Taking the exponent again:
n(k) = n(0)^(2/3)
Since any positive exponent will overwhelm the O(log(n)) recursion, your final complexity is indeed O(n^(2/3)).
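If you want to sanity-check that exponent, here is a counting version of g2 (my own instrumentation, not from the exercise): it returns the number of loop iterations in the base case instead of printing '#', and compares that against n^(2/3). The two agree up to a constant factor, because k = log_8(n) gets rounded up to an integer.

#include <stdio.h>
#include <math.h>

/* Same recursion as g2, but counts the iterations of the printf loop. */
static long g2_count(long n, long m) {
    if (m >= n) {
        return n > 0 ? n : 0;   /* the loop runs n times (0 times if n <= 0) */
    }
    return g2_count(n / 2, 4 * m);
}

int main(void) {
    for (long n = 1000; n <= 100000000; n *= 10) {
        printf("n = %9ld  loop iterations = %8ld  n^(2/3) = %9.0f\n",
               n, g2_count(n, 1), pow((double)n, 2.0 / 3.0));
    }
    return 0;
}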
Let's look for a moment at what happens if m(0) > 1.
n(k) = n(0) * 0.5^(log_8(n(0)) - log_8(m(0)))
Again taking the log:
log_8(n(k)) = log_8(n(0)) - 1/3 * (log_8(n(0)) - log_8(m(0)))
log_8(n(k)) = log_8(n(0)^(2/3)) + log_8(m(0)^(1/3))
So you get
n(k) = n(0)^(2/3) * m(0)^(1/3)
Or
n(k) = (m n^2)^(1/3)
Quick note on corner cases in the starting conditions:
For m > 0:
If n <= 0, then n <= m is immediately true, the recursion terminates, and the loop body never runs.
For m < 0:
If n <= m, the recursion terminates immediately and there is no loop. If n > m, n will converge to zero while m diverges, and the algorithm will run forever.
The only interesting case is m == 0. If n is positive, it will keep halving and eventually reach zero because of integer truncation (for n <= 0 the recursion terminates immediately), so the complexity depends on how long it takes n to reach 1:
n(0) * 0.5^k = 1
log_2(n(0)) - k = 0
So in this case, the runtime of the recursion is still O(log(n)). The loop does not run.
m starts at 1, and at each step n -> n/2 and m -> m*4 until m >= n. After k steps, n_final = n/2^k and m_final = 4^k. So the final value of k is where n/2^k = 4^k, or k = log8(n).
When this is reached, the inner loop performs n_final (approximately equal to m_final) steps, leading to a complexity of O(4^k) = O(4^log8(n)) = O(4^(log4(n)/log4(8))) = O(n^(1/log4(8))) = O(n^(2/3)).
I have a recursive function g3, and I cannot understand the logic behind it or what it actually does in the general case.
double g3(double n) {
    if (n <= 1)
    {
        return 2;
    }
    double temp = g3(n / 2);
    return temp * temp;
}
For 1 I got 2
For 2 I got 4
For 3 I got 16
For 4 I got 16
Can you help me understand what it does?
You could start by analyzing the cases, going from the stop clause up, and looking not only at the number but at what it represents:
g3(1) = 2 = 2^1
g3(2) = g3(1)^2 = 2^2
g3(4) = g3(2)^2 = (2^2)^2 = 2^4
g3(8) = g3(4)^2 = (2^4)^2 = 2^8
g3(16) = g3(8)^2 = (2^8)^2 = 2^16
So it is pretty clear (I hope) what happens when n = 2^k for some integer k.
Can you prove it?
Can you repeat the process to answer what happens when n != 2^k for some integer k ?
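To do that case analysis it can help to just print a small table (a throwaway sketch of my own) and compare the values for n that are not powers of two with the values for the nearest power of two:

#include <stdio.h>

static double g3(double n) {
    if (n <= 1)
    {
        return 2;
    }
    double temp = g3(n / 2);
    return temp * temp;
}

int main(void) {
    for (int n = 1; n <= 16; n++) {
        printf("g3(%2d) = %.0f\n", n, g3(n));
    }
    return 0;
}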
For this piece of code:
// n is a user input that can be any integer
s = 0
i = 0
while i < n:
    s = s + 1
    i = i + 1
return s
I would like to prove that the post condition is if n > 0 then s = sum(0, n) else s = 0, where sum(s, e) just adds 1 for each value from s to e exclusive, starting from an initial value of 0.
I thought an invariant is
if n > 0 and i < n then s = sum(0, i) else s = 0 but I can't get it to be proven in Coq or z3. Any hints?
You seem to imply that this algorithm computes the sum but it doesn't actually do that. Instead, it'll count up to n. Perhaps what you intended is:
i = 0
s = 0
while i < n:
    i = i+1
    s = s+i
Note that we increment s by i, not by 1 as in your program.
Assuming this is the intended program, then a good invariant would be:
s is the sum of all numbers up to and including i
i is at most n
In more programmatic notation:
s == i*(i+1)/2 && i <= n
To see why, remember that the invariant has to hold before and after each loop iteration, and when the loop condition is false, it needs to imply your post-condition. That's why you need the conjunct i <= n: when you exit the loop you have i >= n, so together they give i == n, and s will indeed contain the sum.
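If it helps, here is a plain C sanity check of that invariant (my own sketch, not a proof): it runs the corrected summing loop and asserts s == i*(i+1)/2 && i <= n before the loop and after every iteration, assuming a non-negative example input n.

#include <assert.h>
#include <stdio.h>

int main(void) {
    int n = 10;                                 /* example input, assumed non-negative */
    int i = 0, s = 0;
    assert(s == i * (i + 1) / 2 && i <= n);     /* invariant holds on entry */
    while (i < n) {
        i = i + 1;
        s = s + i;
        assert(s == i * (i + 1) / 2 && i <= n); /* invariant preserved by the body */
    }
    assert(i == n && s == n * (n + 1) / 2);     /* loop exit plus invariant give the post-condition */
    printf("s = %d\n", s);
    return 0;
}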
How about this solution:
// Function left unimplemented, for simplicity
function sum(s: Int, e: Int): Int
  ensures result == e - s

method foo(n: Int) returns (s: Int)
  requires 0 <= n
{
  var i: Int := 0
  s := 0

  while (i < n)
    invariant s == n - sum(i, n)
  {
    s := s + 1
    i := i + 1
  }
}
Language and tool are called Viper. You can try your example online (the web interface is somewhat slow and unstable), or use the VSCode plugin.
For a given b and N and a range of a, say (0...n),
I need to find ans(0...N-1)
where
ans[i] = the number of a's for which pow(a, b) mod N == i
What I am looking for here is a possible repetition in pow(a,b) mod N over the range of a, to reduce computation time.
Example:-
if b = 2 N = 3 and n = 5
for a in (0...4):
    ans[pow(a,b) mod N]++;
so that would be
pow(0,2) mod 3 = 0
pow(1,2) mod 3 = 1
pow(2,2) mod 3 = 1
pow(3,2) mod 3 = 0
pow(4,2) mod 3 = 1
so the final results would be:
ans[0] = 2 // number of times we have found 0 as the answer.
ans[1] = 3
...
Your algorithm has a complexity of O(n),
meaning it takes a lot of time when n gets bigger.
You could get the same result with an O(N) algorithm.
As N << n, that will reduce your computation time.
First, two math facts:
pow(a,b) modulo N == pow (a modulo N,b) modulo N
and the number of values a in [0, n) whose remainder a modulo N equals i is

    if (i < n modulo N)
        count[i] = (n div N) + 1
    else if (i < N)
        count[i] = (n div N)
    else
        count[i] = 0

So a solution to your problem is to fill your result array with the following loop:
int nModN = n % N;
int nDivN = n / N;

for (int i = 0; i < N; i++)
{
    if (i < nModN)
        ans[pow(i,b) % N] += nDivN + 1;
    else
        ans[pow(i,b) % N] += nDivN;
}
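Here is a self-contained C version of that idea (my own sketch). One thing to watch out for: C's pow works on doubles, so you cannot apply % to its result directly; I've swapped in a small integer modular exponentiation helper instead.

#include <stdio.h>
#include <stdlib.h>

/* (base^exp) mod N using square-and-multiply, to avoid floating-point pow. */
static long long pow_mod(long long base, long long exp, long long N) {
    long long result = 1 % N;
    base %= N;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % N;
        base = base * base % N;
        exp >>= 1;
    }
    return result;
}

int main(void) {
    long long n = 5, b = 2, N = 3;          /* the example from the question */
    long long *ans = calloc(N, sizeof *ans);
    long long nModN = n % N, nDivN = n / N;

    /* Each residue class i (mod N) contains nDivN or nDivN+1 values of a in [0, n),
       and every a in that class has the same value of a^b mod N. */
    for (long long i = 0; i < N; i++)
        ans[pow_mod(i, b, N)] += (i < nModN) ? nDivN + 1 : nDivN;

    for (long long r = 0; r < N; r++)
        printf("ans[%lld] = %lld\n", r, ans[r]);   /* prints ans[0] = 2, ans[1] = 3, ans[2] = 0 */
    free(ans);
    return 0;
}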
You could calculate pow for primes only, and use pow(a*b,n) == pow(a,n)*pow(b,n).
So if pow(2,2) mod 3 == 1 and pow(3,2) mod 3 == 0, then pow(6,2) mod 3 == (1 * 0) mod 3 == 0.