Big O of this sorting algorithm [duplicate] - c

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What’s the complexity of for i: for o = i+1
I have done the following sorting algorithm for arrays of length 5:
int myarray[5] = {2,4,3,5,1};
int i;
for (i = 0; i < 5; i++)
{
printf("%d", myarray[i]);
int j;
for (j=i+1; j < 5; j++)
{
int tmp = myarray[i];
if (myarray[i] > myarray[j]) {
tmp = myarray[i];
myarray[i] = myarray[j];
myarray[j] = tmp;
}
}
}
I believe that the complexity of this sorting algorithm is O(n*n) because for each element you compare it with the rest. However, I also notice that for each time we increate in the outer loop, we don't compare with all the rest, but with the rest - i. What would be the complexity?

It's still O(n²) (or O(n * n), as you wrote). Only the highest order term matters when analysing computational complexity.

You are right:
It's O(1 + 2 + 3... + N)
But mathematically it's just:
= O(n*((n-1)/2))
but that is just:
= O(n^2)

You are right, that it is O(n2).
Here's how to calculate it. On the first iteration, you will look at n elements; on the next, n - 1, and so on. If you write two copies of that sum, and divide by two, you can pair up the terms, such that you add the first term in the first copy n to the last term of the second copy 1, and so on. You wind up with n copies of n + 1, so the sum winds up being n * (n + 1) / 2. Big-O only distinguishes asymptotic behavior; the asymptotic behavior is described by the highest order term, without regard to constant factor, which is n2.
n + (n - 1) + (n - 2) ... + 1
= 2 * (n + (n - 1) + (n - 2) ... + 1) / 2
= ((n + 1) + (n - 1 + 2) + (n - 2 + 3) + ... + (1 + n)) / 2
= ((n + 1) + (n + 1) + ... + (n + 1)) / 2
= n * (n + 1) / 2
= 1/2 * n2 + 1/2 * n
= O(n2)

This is bubble sort, and it is indeed of complexity O(n^2)
The entire run time of the algorithm can be surmised in the following summation:
n + (n-1) + (n-2) + ... + 1 = n(n+1)/2
Since only the highest order term is of interest in asymptotic analysis, the complexity is O(n^2)

The big O notation is asymptotic. It means that we overlook constant factors such as - i. The complexity of your algorithm is O(N²) (see also here).

The complexity is O(1). The O notation only makes sense for large inputs, with, where an increase or decrease is not only visible, but relevant.
If you were to extend it, it would be O(n^2), yes.

for multiple loops
n*m*..no.of loops
for above code in worst case its n*n=n^2
BigOh means the max bound.
so the maximum complexity can't be greater then this.

For
i=0 it runs for n time
i=1 it runs for n-1 time
i=2 it runs for n-2 time
....
So total Sum = (n) + (n-1) + (n-2) + .... + 1
sum = (n*n) - (1 + 2 + ...)
= n^2 -
So big O complexity = O(n^2) { upper bound; + or - gets ignored }

Related

Why does this nested loop have O(n) time complexity?

I have a test in computer sience about complexity and I have this question:
int counter = 0;
for (int i = 2; i < n; ++i) {
for (int j = 1; j < n; j = j * i) {
counter++;
}
}
My solution is O(nlogn) because the first for is n-2 and the second for is doing log in base i of n and it's n-2 * logn, that is O(nlogn)-
But my teacher told us it's n and when I tried in cLion to run it it gives me 2*n and it's O(n). Can someone explain why it is O(n)?
Empirically, you can see that this is correct (that's around the right value for the sum of the series), for n=100 and n=1,000
If you want more intuition, you can think about the fact that for nearly all the series, i > sqrt(2).
for example, if n = 100 then 90% of values have i > 10, and for n = 1,000 97% have i > 32.
From that point onwards, all iterations of the outer loop will have at most 2 iterations in the inner loop (since log(n) with base sqrt(n) is 2, by definition).
If n grows really large, you can also apply the same logic to show that from the cube root to the square root, log is between 2 and 3, etc...
This would be O(nlogn) if j was incremented by i each iteration, not multiplied by it. As it is now, the j loop increases much more slowly than n grows, which is why your teacher and CLion state the time complexity as O(n).
Note that it's j=j*i, not j=j*2. That means most of the time, the inner loop will only have one pass. For example, with n of 33, the inner loop will only have one pass when i is in [7,33).
n = 33
j = 32
j = 16
j = 8 27
j = 4 9 16 25
j = 2 3 4 5 6
j = 1 1 1 1 1 1 1 1 1 1 1 1
--------------------------------------------
i = 2 3 4 5 6 7 8 9 10 11 ... 28 29
If you think of the above as a graph, it looks like the complexity of algorithm is O( area under 1/log(n) ). I have no idea how to prove that, and calculating that integral involves the unfamiliar-to-me logarithmic integral function. But the Wikipedia page does say this function is O( n / log n ).
Let's do it experimentally.
#include <stdio.h>
int main( void ) {
for ( int n = 20; n <= 20000; ++n ) {
int counter = 0;
for ( int i = 2; i < n; ++i ) {
for ( int j = 1; j < n ; j *= i ) {
++counter;
}
}
if ( n % 1000 == 0 )
printf( "%d: %.3f\n", n, counter / (n-1) );
}
}
1000: 2.047
2000: 2.033
3000: 2.027
4000: 2.023
5000: 2.021
6000: 2.019
7000: 2.017
8000: 2.016
9000: 2.015
10000: 2.014
11000: 2.013
12000: 2.013
13000: 2.012
14000: 2.012
15000: 2.011
16000: 2.011
17000: 2.011
18000: 2.010
19000: 2.010
20000: 2.010
So it doubles plus a little. But the extra little shrinks as n grows. So it's definitely not O( n log n ). It's something of the form O( n / f(n) ), where f() produces some number ≥1. It looks like it could be O( n / log n ), but that's pure speculation.
Whatever f(n) is, O( n / f(n) ) approaches O( n ) as n approaches infinity. So we can also call this O( n ).
For some value of i, j will go like
1 i^1 i^2 i^3 ....
So the number of times the inner loop needs to execute is found like
log_i(n)
which would lead to the following:
log_2(n) + log_3(n) + log_4(n) + ....
But... there is the stop condition j < n which need to be considered.
Now consider n as a number that can be written as m^2. As soon a i reach the value m all remaining inner loop iterations will only be done for j equal 1 and j equal i (because i^2 will be greater than n). In other words - there will only be 2 executions of the inner loop.
So the total number of iterations will be:
2 * (m^2 - m) + number_of_iteration(i=2:m)
Now divide that by n which is m^2:
(2 * (m^2 - m) + number_of_iteration(i=2:m)) / m^2
gives
2 * (1 -1/m) + number_of_iteration(i=2:m) / m^2
The first part 2 * (1 -1/m) clear goes towards 2 as m goes to inifinity.
The second part is (at worst):
(log_2(n) + log_3(n) + log_4(n) + ... + log_m(n)) / m^2
or
(log_2(n) + log_3(n) + log_4(n) + ... + log_m(n)) / n
As log(x)/x goes towards zero as x goes towards infinity, the above expression will also go towards zero.
So the full expression:
(2 * (m^2 - m) + number_of_iteration(i=2:m)) / m^2
will go towards 2 as m goes towards infinity.
In other words: The total number of iterations divided by n will go towards 2. Consequently we have O(n).

What is the time complexity of following code:

What is the time complexity of following code:
int a = 0, i = N;
while (i > 0) {
a += i;
i /= 2;
}
It's complexity will be O(logn).
On every iteration, value of n will be divided by 2, i.e. n + n/2 + n/4 + n/8 ... 1.
To understand logn check this out

When is the sum of series 1+1/2+1/3+1/4+......+1/n=log n and when is the same sum equal to n, i.e. 1+1/2+1/3+1/4+......+1/n=n

I'm new to understanding asymptotic analysis, while trying to find the big O notation, in a few problems it is given as log n for the same simplification of series and n for another problem.
Here are the questions:
int fun(int n)
{
int count = 0;
for (int i= n; i> 0; i/=2)
for (int j = 0; j < i; j++)
count ++;
return count;
}
T(n)=O(n)
int fun2(int n)
{
int count = 0;
for(i = 1; i < n; i++)
for(j = 1; j <= n; j += i)
count ++;
return count;
}
T(n)=O(n log n)
I'm really confused. Why are the complexities of these seemingly similar algorithms different?
The series formed in both the cases are different
Time Complexity Analysis
In this case first i will be n and the loop for j will go till n, then i will be n/2 and loop will go till n/2 and so on , So the time complexity will be
= n + n/2 + n/4 + n/8.......
The result of this sum is 2n-1 and hence the time complexity O(n)
In this case when i is n, we will loop for j n times, next time i will be 2 and we will skip one entry at a time, which means we are iterating n/2 times, and so on. So the time complexity will be
= n + n/2 + n/3 + n/4........
= n (1 + 1/2 + 1/3 + 1/4 +....)
= O(nlogn)
The sum of 1 + 1/2 + 1/3... is O(logn). For solution see.
For the former, the inner loop runs approximately (exactly if n is a power of 2)
n + n/2 + n/4 + n/8 + ... + n/2^log2(n)
times. It can be factored into
n * (1 + 1/2 + 1/4 + 1/8 + ... + (1/2)^(log2 n))
The 2nd factor is called (a partial sum of) the geometric series which converges, meaning that as we approach the infinity it will approach a constant. Therefore it is θ(1); when you multiply this by n you get θ(n)
I've made an analysis of the latter algorithm just a couple days ago. The number of iterations for n in that algorithm are
ceil(n) + ceil(n / 2) + ceil(n/3) + ... + ceil(n/n)
It is quite close to a partial sum the harmonic series multiplied by n:
n * (1 + 1/2 + 1/3 + 1/4 + ... 1/n)
Unlike the geometric series the harmonic series does not converge, but it diverges as we add more terms. The partial sums of first n terms can be bounded above and below by ln n + C, hence the time complexity of the entire algorithm is θ(n log n).

sum's sum of divizors of numbers less than or equal to N

I really need some help at this problem:
Given a positive integer N, we define xsum(N) as sum's sum of all positive integer divisors' numbers less or equal to N.
For example: xsum(6) = 1 + (1 + 2) + (1 + 3) + (1 + 2 + 4) + (1 + 5) + (1 + 2 + 3 + 6) = 33.
(xsum - sum of divizors of 1 + sum of divizors of 2 + ... + sum of div of 6)
Given a positive integer K, you are asked to find the lowest N that satisfies the condition: xsum(N) >= K
K is a nonzero natural number that has at most 14 digits
time limit : 0.2 sec
Obviously, the brute force will fall for most cases with Time Limit Exceeded. I haven't find something better than it yet, so that's the code:
fscanf(fi,"%lld",&k);
i=2;
sum=1;
while(sum<k) {
sum=sum+i+1;
d=2;
while(d*d<=i) {
if(i%d==0 && d*d!=i)
sum=sum+d+i/d;
else
if(d*d==i)
sum+=d;
d++;
}
i++;
}
Any better ideas?
For each number n in range [1 , N] the following applies: n is divisor of exactly roundDown(N / n) numbers in range [1 , N]. Thus for each n we add a total of n * roundDown(N / n) to the result.
int xsum(int N){
int result = 0;
for(int i = 1 ; i <= N ; i++)
result += (N / i) * i;//due to the int-division the two i don't cancel out
return result;
}
The idea behind this algorithm can aswell be used to solve the main-problem (smallest N such that xsum(N) >= K) in faster time than brute-force search.
The complete search can be further optimized using some rules we can derive from the above code: K = minN * minN (minN would be the correct result if K = 2 * 3 * ...). Using this information we have a lower-bound for starting the search.
Next step would be to search for the upper bound. Since the growth of xsum(N) is (approximately) quadratic we can use this to approximate N. This optimized guessing allows to find the searched value pretty fast.
int N(int K){
//start with the minimum-bound of N
int upperN = (int) sqrt(K);
int lowerN = upperN;
int tmpSum;
//search until xsum(upperN) reaches K
while((tmpSum = xsum(upperN)) < K){
int r = K - tmpSum;
lowerN = upperN;
upperN += (int) sqrt(r / 3) + 1;
}
//Now the we have an upper and a lower bound for searching N
//the rest of the search can be done using binary-search (i won't
//implement it here)
int N;//search for the value
return N;
}

Need explaination for this code (algorithm)

The problem:
Larry is very bad at math - he usually uses a calculator, which worked well throughout college. Unforunately, he is now struck in a deserted island with his good buddy Ryan after a snowboarding accident. They're now trying to spend some time figuring out some good problems, and Ryan will eat Larry if he cannot answer, so his fate is up to you!
It's a very simple problem - given a number N, how many ways can K numbers less than N add up to N?
For example, for N = 20 and K = 2, there are 21 ways:
0+20
1+19
2+18
3+17
4+16
5+15
...
18+2
19+1
20+0
Input
Each line will contain a pair of numbers N and K. N and K will both be an integer from 1 to 100, inclusive. The input will terminate on 2 0's.
Output
Since Larry is only interested in the last few digits of the answer, for each pair of numbers N and K, print a single number mod 1,000,000 on a single line.
Sample Input
20 2
20 2
0 0
Sample Output
21
21
The solution code:
#include<iostream>
#include<stdlib.h>
#include<stdio.h>
using namespace std;
#define maxn 100
typedef long ss;
ss T[maxn+2][maxn+2];
void Gen() {
ss i, j;
for(i = 0; i<= maxn; i++)
T[1][i] = 1;
for(i = 2; i<= 100; i++) {
T[i][0] = 1;
for(j = 1; j <= 100; j++)
T[i][j] = (T[i][j-1] + T[i-1][j]) % 1000000;
}
}
int main() {
//freopen("in.txt", "r", stdin);
ss n, m;
Gen();
while(cin>>n>>m) {
if(!n && !m) break;
cout<<T[m][n]<<endl;
}
return 0;
}
How has this calculation been derived?
How has it come T[i][j] = (T[i][j-1] + T[i-1][j]) ?
Note: I only use n and k (lower case) to refer to some anonymous variable. I will always use N and K (upper case) to refer to N and K as defined in the question (sum and the number of portions).
Let C(n, k) be the result of n choose k, then the solution to the problem is C(N + K - 1, K - 1), with the assumption that those K numbers are non-negative (or there will be infinitely many solution even for N = 0 and K = 2).
Since the K numbers are non-negative, and the sum N is fixed, we can think of the problem as: how many ways to divide candy among K people. We can divide the candies, by lying them into a line, and put (K - 1) separator between the candies. The (K - 1) separators will divide the candies up to K portions of candies. Looking at another perspective, it is also like choosing (K - 1) positions among (N + K - 1) positions to put in the separators, then the rest of the positions are candies. So, this explains why the number of ways is N + (K - 1) choose (K - 1).
Then the problem reduce to how to find the least significant digits of C(n, k). (Since maximum of N and K is 100 as defined in maxn, we don't have to worry if the algorithm goes up to O(n3)).
The calculation uses this combinatorial identity C(n, k) = C(n - 1, k) + C(n, k - 1) (Pascal's rule). The clever thing about the implementation is that it doesn't store C(n, k) (table of result of combination, which is a jagged array), but it stores C(N, K) instead. The identity is actually present in the T[i][j] = (T[i][j-1] + T[i-1][j]):
The first dimension is actually K, the number of portions. And the second dimension is the sum N. T[K][N] will directly store the result, and according to the mathematical result derived above, is (least significant digits of) C(N + K - 1, K - 1).
Re-writing the T[i][j] = (T[i][j-1] + T[i-1][j]) back to equivalent mathematical result:
C(i + j - 1, i - 1) = C(i + j - 2, i - 1) + C(i + j - 2, i - 2), which is correct according to the identity.
The program will fill the array row by row:
The row K = 0 is already initialized to 0, using the fact that static array is initialized to 0.
It fills the row K = 1 with 1 (there is only 1 way to divide N into 1 portion).
For the rest of the rows, it sets the case N = 0 to 1 (there is only 1 way to divide 0 into K parts - all parts are 0).
Then the rest are filled with the expression T[i][j] = (T[i][j-1] + T[i-1][j]), which will refer to the previous row, and the previous element of the same row, both of which has been filled up in earlier iterations.
Let C(x, y) to be the result of x choose y, then the value of T[i][j] equals: C(i - 1 + j, j).
You can proove this by induction.
Base cases:
T[1][j] = C(1 - 1 + j, j) = C(j, j) = 1
T[i][0] = C(i - 1, 0) = 1
For the induction step, use the formula (for 0<=y<=x):
C(x,y) = C(x - 1, y - 1) + C(x - 1, y)
Therefore:
C(i - 1 + j, j) = C(i-1+j - 1, j - 1) + C(i-1+j - 1, j) = C(i-1+(j-1), (j-1)) + C((i-1)-1+j, j)
Or in other words:
T[i][j] = T[i,j-1] + T[i-1,j]
Now, as nhahtdh mentioned before, the value you are looking for is C(N + K - 1, K - 1)
which equals:
T[N+1][K-1] = C(N+1-1+K-1, K-1)
(modulo 1000000)
This is a famous problem - you can check solution here
How many ways to drop N identical balls to K boxes.
The following algorithm is a dynamic-programming solution to your problem:
Define D[i,j] to be the number of ways i numbers less than j, can sum up to j.
0 <= i < = N
1 <= j <= K
Where D[j,1] = 1 for every j.
And where j > 1 you get:
D[i,j] = D[i,j-1] + D[i-1,j-1] +...+ D[0,j-1]
The problem is known as "the integer partition problem". Basically there exists a recursive computation of the k-partition of n, but your solution is just the dynamic programming version of it (non-recursive and computing bottom-up for short).

Resources