I'm trying to figure out why the time complexity of this code is n^(2/3). The space complexity is O(log n), but I don't know how to continue the time complexity calculation (or whether that answer is even right).
int g2 (int n, int m)
{
    if (m >= n)
    {
        for (int i = 0; i < n; ++i)
            printf("#");
        return 1;
    }
    return 1 + g2 (n / 2, 4 * m);
}

int main (int n)
{
    return g2 (n, 1);
}
As long as m < n, you perform an O(1) operation: making a recursive call. You halve n and quadruple m, so after k steps, you get
n(k) = n(0) * 0.5^k
m(k) = m(0) * 4^k
You can set them equal to each other to find that
n(0) / m(0) = 8^k
Taking the log
log(n(0)) - log(m(0)) = k log(8)
or
k = log_8(n(0)) - log_8(m(0))
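For example, with n(0) = 10^6 and m(0) = 1, this gives k = log_8(10^6) ≈ 6.64; since k is an integer number of recursion steps, the recursion stops after the 7th halving (8^7 is the first power of 8 to reach 10^6).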
On the kth recursion you perform n(k) loop iterations.
You can plug k back into n(k) = n(0) * 0.5^k to estimate the number of iterations. Let's ignore m(0) for now:
n(k) = n(0) * 0.5^log_8(n(0))
Taking again the log of both sides,
log_8(n(k)) = log_8(n(0)) + log_8(0.5) * log_8(n(0))
Since log_8(0.5) = -1/3, you get
log_8(n(k)) = log_8(n(0)) * (2/3)
Taking the exponent again:
n(k) = n(0)^(2/3)
Since any positive exponent will overwhelm the O(log(n)) recursion, your final complexity is indeed O(n^(2/3)).
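If you want to check this numerically, here is a rough harness of mine (not part of the question's code): it replaces the printf loop with an iteration counter and compares the count against n^(2/3). The count should track n^(2/3) up to a constant factor.

#include <stdio.h>
#include <math.h>

long long iterations;   // counts loop iterations across the recursion

int g2(long long n, long long m) {
    if (m >= n) {
        if (n > 0)
            iterations += n;   // the printf loop runs n times here
        return 1;
    }
    return 1 + g2(n / 2, 4 * m);
}

int main(void) {
    for (long long n = 1000; n <= 100000000; n *= 10) {
        iterations = 0;
        g2(n, 1);
        printf("n = %9lld  iterations = %8lld  n^(2/3) = %9.0f\n",
               n, iterations, pow((double)n, 2.0 / 3.0));
    }
    return 0;
}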
Let's look for a moment at what happens if m(0) > 1.
n(k) = n(0) * 0.5^(log_8(n(0)) - log_8(m(0)))
Again taking the log:
log_8(n(k)) = log_8(n(0)) - 1/3 * (log_8(n(0)) - log_8(m(0)))
log_8(n(k)) = log_8(n(0)^(2/3)) + log_8(m(0)^(1/3))
So you get
n(k) = n(0)^(2/3) * m(0)^(1/3)
Or
n(k) = (m n^2)^(1/3)
Quick note on corner cases in the starting conditions:
For m > 0:
If n <= 0, then n <= m is immediately true, so the recursion terminates and there is no loop.
For m < 0:
If n <= m, the recursion terminates immediately and there is no loop. If n > m, n will converge to zero while m diverges toward negative infinity, and the recursion will run forever.
The only interesting case is m == 0 with positive n (if n <= 0, then m >= n holds immediately and the recursion terminates). n is halved each step and will reach zero because of integer truncation, so the complexity depends on when it reaches 1:
n(0) * 0.5^k = 1
log_2(n(0)) - k = 0
So in this case, the runtime of the recursion is still O(log(n)). The loop does not run.
m starts at 1, and at each step n -> n/2 and m -> 4*m until m >= n. After k steps, n_final = n/2^k and m_final = 4^k. So the final value of k is where n/2^k = 4^k, i.e. 8^k = n, or k = log_8(n).
When this point is reached, the inner loop performs n_final (approximately equal to m_final) iterations, leading to a complexity of O(4^k) = O(4^(log_8(n))) = O(4^(log_4(n)/log_4(8))) = O(n^(1/log_4(8))) = O(n^(2/3)).
I have a test in computer science about complexity, and I have this question:
int counter = 0;
for (int i = 2; i < n; ++i) {
    for (int j = 1; j < n; j = j * i) {
        counter++;
    }
}
My solution is O(n log n), because the first for loop runs n-2 times, and the second one does log base i of n iterations, so it's (n-2) * log n, which is O(n log n).
But my teacher told us it's O(n), and when I ran it in CLion it gave me about 2*n, which is O(n). Can someone explain why it is O(n)?
Empirically, you can see that this is correct: counting the iterations for n=100 and n=1,000 gives roughly 2n, which is around the right value for the sum of the series.
If you want more intuition, you can think about the fact that for nearly all of the series, i > sqrt(n).
For example, if n = 100 then 90% of the values have i > 10, and for n = 1,000, 97% have i > 32.
From that point onwards, all iterations of the outer loop will have at most 2 iterations in the inner loop (since log(n) with base sqrt(n) is 2, by definition).
If n grows really large, you can also apply the same logic to show that from the cube root to the square root, log is between 2 and 3, etc...
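Here is a tiny standalone check of those percentages (my own sketch, not from the original answer): it counts how many outer-loop values i in [2, n) exceed sqrt(n).

#include <stdio.h>
#include <math.h>

int main(void) {
    int ns[] = { 100, 1000, 10000 };
    for (int t = 0; t < 3; ++t) {
        int n = ns[t];
        int above = 0, total = 0;
        for (int i = 2; i < n; ++i) {    // the outer loop's range
            ++total;
            if (i > sqrt((double)n))     // inner loop does at most 2 passes here
                ++above;
        }
        printf("n = %5d: %.1f%% of i are above sqrt(n)\n",
               n, 100.0 * above / total);
    }
    return 0;
}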
This would be O(n log n) if j were incremented by i each iteration rather than multiplied by it. As it is now, the inner loop's iteration count grows much more slowly than n, which is why your teacher and CLion report the time complexity as O(n).
Note that it's j=j*i, not j=j*2. That means for most values of i, the inner loop has only two passes: j = 1 and j = i. For example, with n of 33, the inner loop has exactly two passes whenever i is in [6, 33), since then i*i >= 33.
n = 33
j = 32
j = 16
j =  8 27
j =  4  9 16 25
j =  2  3  4  5  6  7  8  9 10 11 ... 28 29
j =  1  1  1  1  1  1  1  1  1  1 ...  1  1
--------------------------------------------
i =  2  3  4  5  6  7  8  9 10 11 ... 28 29
If you think of the above as a graph, it looks like the complexity of the algorithm is O( area under 1/log(n) ). I have no idea how to prove that, and calculating that integral involves the logarithmic integral function, which is unfamiliar to me. But the Wikipedia page does say this function is O( n / log n ).
Let's do it experimentally.
#include <stdio.h>
int main( void ) {
    for ( int n = 20; n <= 20000; ++n ) {
        int counter = 0;
        for ( int i = 2; i < n; ++i ) {
            for ( int j = 1; j < n ; j *= i ) {
                ++counter;
            }
        }
        if ( n % 1000 == 0 )
            printf( "%d: %.3f\n", n, counter / (double)(n-1) );
    }
}
1000: 2.047
2000: 2.033
3000: 2.027
4000: 2.023
5000: 2.021
6000: 2.019
7000: 2.017
8000: 2.016
9000: 2.015
10000: 2.014
11000: 2.013
12000: 2.013
13000: 2.012
14000: 2.012
15000: 2.011
16000: 2.011
17000: 2.011
18000: 2.010
19000: 2.010
20000: 2.010
So the count is about 2n plus a little extra, and the extra shrinks relative to n as n grows. So it's definitely not O( n log n ). The count looks like something of the form 2n + n/f(n), where f(n) ≥ 1 grows with n; the extra term could be n / log n, but that's pure speculation.
Whatever f(n) is, n / f(n) is at most n, so the whole count is O( n ).
For a given value of i, j will go like
1 i^1 i^2 i^3 ....
So the number of times the inner loop executes is about
log_i(n)
which would lead to the following total:
log_2(n) + log_3(n) + log_4(n) + ....
But... there is the stop condition j < n, which needs to be considered.
Now consider n as a number that can be written as m^2. As soon as i reaches the value m, all remaining inner-loop iterations will only be done for j equal to 1 and j equal to i (because i^2 will be greater than n). In other words, there will only be 2 executions of the inner loop.
So the total number of iterations will be:
2 * (m^2 - m) + number_of_iteration(i=2:m)
Now divide that by n which is m^2:
(2 * (m^2 - m) + number_of_iteration(i=2:m)) / m^2
gives
2 * (1 - 1/m) + number_of_iteration(i=2:m) / m^2
The first part, 2 * (1 - 1/m), clearly goes towards 2 as m goes to infinity.
The second part is (at worst):
(log_2(n) + log_3(n) + log_4(n) + ... + log_m(n)) / m^2
or
(log_2(n) + log_3(n) + log_4(n) + ... + log_m(n)) / n
As log(x)/x goes towards zero as x goes towards infinity, the above expression will also go towards zero.
So the full expression:
(2 * (m^2 - m) + number_of_iteration(i=2:m)) / m^2
will go towards 2 as m goes towards infinity.
In other words: The total number of iterations divided by n will go towards 2. Consequently we have O(n).
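To see this decomposition in numbers, here is a quick experiment of mine that splits the iteration count at m = sqrt(n): the i > m part contributes roughly 2 * (m^2 - m), while the i <= m part stays negligible relative to n.

#include <stdio.h>
#include <math.h>

int main(void) {
    for (int n = 100; n <= 1000000; n *= 100) {
        int m = (int)sqrt((double)n);
        long long small = 0, large = 0;   // iterations for i <= m and i > m
        for (int i = 2; i < n; ++i) {
            long long cnt = 0;
            for (long long j = 1; j < n; j *= i)
                ++cnt;
            if (i <= m) small += cnt; else large += cnt;
        }
        printf("n = %7d  i <= sqrt(n): %6lld  i > sqrt(n): %9lld  total/n = %.3f\n",
               n, small, large, (double)(small + large) / n);
    }
    return 0;
}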
Problem: finding the first n prime numbers.
#include <stdio.h>
#include <stdlib.h>

void firstnprimes(int *a, int n) {
    if (n < 1) {
        printf("INVALID");
        return;
    }
    int i = 0, j, k;             // i is the primes counter
    for (j = 2; i != n; j++) {   // j is a candidate number
        for (k = 0; k < i; k++) {
            if (j % a[k] == 0)   // a[k] is the k-th prime
                break;
        }
        if (k == i)              // end-of-loop was reached
            a[i++] = j;          // record the i-th prime, j
    }
}

int main() {
    int n;
    scanf_s("%d", &n);
    int *a = (int *)malloc(n * sizeof(int));
    firstnprimes(a, n);
    for (int i = 0; i < n; i++)
        printf("%d\n", a[i]);
    system("pause");
    return 0;
}
My function's inner loop runs at most i times, where i is the number of primes found so far (those below the given candidate number), and the outer loop runs (n-th prime number - 2) times.
How can I derive the complexity of this algorithm in Big O notation?
Thanks in advance.
In pseudocode your code is
firstnprimes(n) = a[:n] # array a's first n entries
where
i = 0
a = [j for j in [2..]
if is_empty( [j for p in a[:i] if (j%p == 0)] )
&& (++i) ]
(assuming the short-circuiting is_empty which returns false as soon as the list is discovered to be non-empty).
What it does is testing each candidate number from 2 and up by all its preceding primes.
Melissa O'Neill analyzes this algorithm in her widely known JFP article and derives its complexity as O( n^2 ).
Basically, each of the n primes that are produced is paired up with (is tested by) all the primes preceding it: k-1 primes for the k-th prime. The sum of the arithmetic progression 0 + 1 + ... + (n-1) is (n-1)n/2, which is O( n^2 ). She also shows that the composites do not contribute any term more significant than that to the overall sum: there are O(n log n) composites on the way to the n-th prime, but the is_empty calculation fails early for them.
Here's how it goes: with m = n log n, there will be m/2 evens, for each of which the is_empty calculation takes just 1 step; m/3 multiples of 3 with 2 steps; m/5 with 3 steps; etc.
So the total contribution of the composites, overestimated by not dealing with the multiplicities (basically, counting 15 twice, as a multiple of both 3 and 5, etc.), is:
SUM{i = 1, ..., n} (i m / p_i) // p_i is the i-th prime
= m SUM{i = 1, ..., n} (i / p_i)
= n log(n) SUM{i = 1, ..., n} (i / p_i)
< n log(n) (n / log(n)) // for n > 14,000
= n^2
The inequality can be tested at Wolfram Alpha cloud sandbox as Sum[ i/Prime[i], {i, 14000}] Log[14000.0] / 14000.0 (which is 0.99921, and diminishing for bigger n, tested up to n = 2,000,000 where it's 0.963554).
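You can also test the O( n^2 ) claim directly by instrumenting the code from the question; this is my own quick experiment (the tests counter is the only addition). The ratio tests/n^2 should stay bounded by a constant as n grows.

#include <stdio.h>
#include <stdlib.h>

static long long tests;   // counts the j % a[k] divisibility tests

void firstnprimes(int *a, int n) {
    int i = 0, j, k;
    for (j = 2; i != n; j++) {
        for (k = 0; k < i; k++) {
            ++tests;
            if (j % a[k] == 0)
                break;
        }
        if (k == i)
            a[i++] = j;
    }
}

int main(void) {
    for (int n = 1000; n <= 8000; n *= 2) {
        int *a = malloc(n * sizeof *a);
        tests = 0;
        firstnprimes(a, n);
        printf("n = %5d  tests = %11lld  tests/n^2 = %.3f\n",
               n, tests, (double)tests / ((double)n * n));
        free(a);
    }
    return 0;
}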
The prime number theorem states that asymptotically, the number of primes less than n equals n / log n. Therefore, your inner loop will run Theta(i * max) = Theta((n / log n) * n) times (assuming max = n).
Also, your outer loop runs on the order of n log n times, making the total complexity Theta((n / log n) * n * n log n) = Theta(n^3). In other words, this is not the most efficient algorithm.
Note that there are better approximations around; e.g., the n-th prime number is closer to:
n log n + n log log n - n + n (log log n - 2) / log n + ...
But since you are concerned with just big O, this approximation is good enough.
Also, there are much better algorithms for doing what you're looking to do. Look up the topic of pseudoprimes for more information.
I'm trying to find the time complexity of this function:
int bin_search(int a[], int n, int x); // binary search on an array of size n

int f(int a[], int n) {
    int i = 1, x = 1;
    while (i < n) {
        if (bin_search(a, i, x) >= 0) {
            return x;
        }
        i *= 2;
        x *= 2;
    }
    return 0;
}
The answer is (log n)^2. How come?
The best I could get is log n. At first, i is 1, so the while loop runs log n times. On the first iteration, when i=1, the binary search takes only one step because the array's size (i) is 1. Then, when i=2, two steps, and so on, until it takes log n steps on the last iteration.
So the formula I thought would fit is:
log(1) + log(2) + log(4) + ... + log(n)
The summation is over the while iterations, and the terms are because for i=1 it's log(1), for i=2 it's log(2), and so on, until it's log(n) at the last.
Where am I wrong?
Each iteration of the while loop performs a binary search on the first i elements of the array, where i doubles each time: 1, 2, 4, ..., 2^m.
You can compute the number of operations (comparisons):
log2(1) + log2(2) + log2(4) + ... + log2(2^m)
log2(2^k) equals k, so this series simplifies into:
0 + 1 + 2 + ... + m
Where m is floor(log2(n)).
The series evaluates to m * (m + 1) / 2; substituting for m we get
floor(log2(n)) * (floor(log2(n)) + 1) / 2
-> 0.5 * floor(log2(n))^2 + 0.5 * floor(log2(n))
The first term dominates the second, hence the complexity is O(log(n)^2).
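As a sanity check, here is a small counting experiment of mine. The question only declares bin_search, so a standard binary search is assumed here, with a probe counter added; the array is filled so that x is never found (the worst case). The probe total should land close to the closed form above (off-by-one details depend on what exactly counts as one comparison).

#include <stdio.h>
#include <math.h>

static long long probes;   // counts comparisons against array elements

int bin_search(const int a[], int n, int x) {  // assumed standard binary search
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        ++probes;
        if (a[mid] == x) return mid;
        if (a[mid] < x) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}

int f(const int a[], int n) {
    int i = 1, x = 1;
    while (i < n) {
        if (bin_search(a, i, x) >= 0) return x;
        i *= 2;
        x *= 2;
    }
    return 0;
}

int main(void) {
    enum { N = 1 << 20 };
    static int a[N];
    for (int i = 0; i < N; ++i) a[i] = -1;   // x >= 1 is never found
    f(a, N);
    double m = floor(log2((double)N));
    printf("probes = %lld, m*(m+1)/2 = %.0f\n", probes, m * (m + 1) / 2);
    return 0;
}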
I really need some help with this problem:
Given a positive integer N, we define xsum(N) as the sum of the divisor sums of all positive integers less than or equal to N.
For example: xsum(6) = 1 + (1 + 2) + (1 + 3) + (1 + 2 + 4) + (1 + 5) + (1 + 2 + 3 + 6) = 33.
(xsum = sum of divisors of 1 + sum of divisors of 2 + ... + sum of divisors of 6)
Given a positive integer K, you are asked to find the lowest N that satisfies the condition: xsum(N) >= K
K is a nonzero natural number that has at most 14 digits
time limit : 0.2 sec
Obviously, brute force will fail most cases with Time Limit Exceeded, but I haven't found anything better yet. Here's the code:
#include <stdio.h>

int main(void) {
    long long k, i, d, sum;
    FILE *fi = fopen("input.txt", "r");   // input file name assumed
    fscanf(fi, "%lld", &k);
    i = 2;
    sum = 1;                              // xsum(1) = 1
    while (sum < k) {
        sum = sum + i + 1;                // divisors 1 and i of i
        d = 2;
        while (d * d <= i) {
            if (i % d == 0 && d * d != i)
                sum = sum + d + i / d;    // divisor pair (d, i/d)
            else if (d * d == i)
                sum += d;                 // square-root divisor, counted once
            d++;
        }
        i++;
    }
    printf("%lld\n", i - 1);              // smallest N with xsum(N) >= k
    return 0;
}
Any better ideas?
For each number n in the range [1, N], the following applies: n is a divisor of exactly floor(N / n) numbers in the range [1, N]. Thus for each n we add a total of n * floor(N / n) to the result.
long long xsum(long long N) {
    long long result = 0;          // long long: K can have up to 14 digits
    for (long long i = 1; i <= N; i++)
        result += (N / i) * i;     // integer division: the two i don't cancel out
    return result;
}
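As a quick check against the example in the question: xsum(6) = (6/1)*1 + (6/2)*2 + (6/3)*3 + (6/4)*4 + (6/5)*5 + (6/6)*6 = 6 + 6 + 6 + 4 + 5 + 6 = 33, matching the value computed by hand.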
The idea behind this algorithm can also be used to solve the main problem (smallest N such that xsum(N) >= K) faster than brute-force search.
The search can be bounded using a rule we can derive from the above code: each term (N / i) * i is at most N, so xsum(N) <= N^2, and any N with xsum(N) >= K must therefore satisfy N >= sqrt(K). This gives a lower bound at which to start the search.
The next step is to find an upper bound. Since the growth of xsum(N) is (approximately) quadratic, we can use it to guess how far N must be increased. This optimized guessing finds the searched-for value pretty fast.
long long N(long long K) {
    // start with the lower bound of N
    long long upperN = (long long) sqrt((double) K);
    long long lowerN = upperN;
    long long tmpSum;
    // search until xsum(upperN) reaches K
    while ((tmpSum = xsum(upperN)) < K) {
        long long r = K - tmpSum;
        lowerN = upperN;
        upperN += (long long) sqrt((double) (r / 3)) + 1;
    }
    // now we have an upper and a lower bound for N;
    // the rest of the search can be done using binary search
    // (I won't implement it here)
    long long N; // search for the value
    return N;
}
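For completeness, here is a minimal sketch (my addition, not part of the original answer) of the elided binary-search step, assuming the xsum function above: it finds the smallest N in [lowerN, upperN] with xsum(N) >= K.

long long searchN(long long lowerN, long long upperN, long long K) {
    while (lowerN < upperN) {
        long long mid = lowerN + (upperN - lowerN) / 2;
        if (xsum(mid) >= K)
            upperN = mid;      // mid satisfies the condition: look lower
        else
            lowerN = mid + 1;  // mid is too small: look higher
    }
    return lowerN;             // smallest N with xsum(N) >= K
}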
Very similar complexity examples. I am trying to understand how these questions vary. Exam coming up tomorrow :( Any shortcuts for finding the complexities here?
CASE 1:
void doit(int N) {
    while (N) {
        for (int j = 0; j < N; j += 1) {}
        N = N / 2;
    }
}
CASE 2:
void doit(int N) {
    while (N) {
        for (int j = 0; j < N; j *= 4) {}
        N = N / 2;
    }
}
CASE 3:
void doit(int N) {
    while (N) {
        for (int j = 0; j < N; j *= 2) {}
        N = N / 2;
    }
}
Thank you so much!
void doit(int N) {
    while (N) {
        for (int j = 0; j < N; j += 1) {}
        N = N / 2;
    }
}
To find the O() of this, notice that we are dividing N by 2 each iteration. So (not to insult your intelligence, but for completeness) on the final non-zero iteration through the loop we will have N = 1; the time before that, N = 2a; before that, N = 4a; and so on, where 0 < a < N (those are non-inclusive bounds). So this loop will execute a total of log(N) times, meaning on the first iteration N = a * 2^floor(log(N)).
Why do we care about that? Well, it's a geometric series which has a nice closed form:
Sum = sum_{k=0}^{log(N)} a * 2^k = a * (2^(log(N) + 1) - 1) = 2aN - a = O(N).
You already have the answer to number 1 - O(N) - as given by @NickO; here is an alternative explanation.
Denote the total number of inner-loop iterations by T(N). Then:
T(N) = N + N/2 + ... + N/(2^i) + ... + 2 + 1
< 2N (sum of a geometric series)
which is in O(N).
Number 3 is O((log N)^2).
Again denote the total number of inner-loop iterations by T(N), and let the number of outer loops be h. Note that h = log_2(N).
T(N) = log(N) + log(N/2) + log(N/4) + ... + log(1)
= log(N * (N/2) * (N/4) * ... * 1) (because log(a) + log(b) = log(a*b))
= log(N^h * (1 * 1/2 * 1/4 * .... * 1/N))
= log(N^h) + log(1 * 1/2 * 1/4 * .... * 1/N) (because log(a*b) = log(a) + log(b))
< log(N^h) + log(1)
= log(N^h) (log(1) = 0)
= h * log(N) (log(a^b) = b*log(a))
= (log(N))^2 (because h=log_2(N))
Number 2 is almost identical to number 3.
(In cases 2 and 3 this assumes j starts from 1, not from 0; as @WhozCraig pointed out, if j starts at 0 the inner loop never terminates, because 0 * 4 and 0 * 2 are still 0.)
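To convince yourself numerically, here is a quick counter of mine for case 3, with j starting at 1 so the inner loop terminates; the count should grow like (log N)^2 up to a constant factor (case 2 only changes the base of the inner loop, so it behaves the same way):

#include <stdio.h>
#include <math.h>

int main(void) {
    for (long long N0 = 1000; N0 <= 100000000; N0 *= 100) {
        long long count = 0;
        for (long long N = N0; N; N /= 2)         // the while (N) loop
            for (long long j = 1; j < N; j *= 2)  // case 3's inner loop
                ++count;
        printf("N = %9lld  count = %5lld  (log2 N)^2 = %6.0f\n",
               N0, count, pow(log2((double)N0), 2.0));
    }
    return 0;
}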