Could you please help me understand the time complexity of a divide and conquer algorithm?
Let's take this one as an example:
http://www.geeksforgeeks.org/archives/4583 (Method 2)
It gives T(n) = (3/2)n - 2 and I don't understand why.
I am sorry if I gave you an extra page to open, but I really want to understand this at least at a good high level so that I can find the complexity of such algorithms on my own. Your answer is highly appreciated.
Can't open that link for some reason, but I'll still give it a try.
When you use the divide and conquer strategy, what you do is you break up the problem into many smaller problems and then you combine the solutions for the small problems to get the solution for the main problem.
How to solve the smaller problems: By breaking them up further. This process of breaking up continues until you reach a level where the problem is small enough to be handled directly.
How to compute time complexity:
Assume the time taken by your algo is T(n). Notice that the time taken is a function of the problem size, i.e. n.
Now, notice what you are doing. You break up the problem into, let's say, k parts, each of size n/k (they may not be equal in size, in which case you'll have to add the time taken by them individually). Now, you'll solve these k parts. The time taken by each part would be T(n/k), because the problem size is reduced to n/k now. And you are solving k of these, so it takes k * T(n/k) time.
After solving these smaller problems, you'll combine their solutions. This will also take some time. And that time will be a function of your problem size again. (It could also be constant). Let that time be O(f(n)).
So, total time taken by your algorithm will be:
T(n) = (k * T(n/k)) + O(f(n))
You've got a recurrence relation now which you can solve to get T(n).
As this link indicates:
T(n) = T(floor(n/2)) + T(ceil(n/2)) + 2
T(2) = 1
T(1) = 0
For T(2), it is a base case with a single comparison before returning. For T(1), it is a base case without any comparison.
For T(n): you recursively call the method for the two halves of the array, and then compare the two (min, max) tuples to find the real min and max, which gives you the above T(n) equation.
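For concreteness, here is a minimal C sketch of that recursion (my own code, not taken from the linked page); the comparison counts noted in the comments are exactly the terms of the recurrence:

#include <stdio.h>

struct minmax { int min; int max; };

/* Find the min and max of a[lo..hi] (inclusive) by divide and conquer.
   Comparisons: 0 for one element, 1 for two elements, and otherwise
   T(n) = T(floor(n/2)) + T(ceil(n/2)) + 2. */
struct minmax find_min_max(int a[], int lo, int hi) {
    struct minmax result, left, right;
    if (lo == hi) {                        /* one element: no comparison */
        result.min = result.max = a[lo];
        return result;
    }
    if (hi == lo + 1) {                    /* two elements: one comparison */
        if (a[lo] < a[hi]) { result.min = a[lo]; result.max = a[hi]; }
        else               { result.min = a[hi]; result.max = a[lo]; }
        return result;
    }
    int mid = lo + (hi - lo) / 2;
    left  = find_min_max(a, lo, mid);      /* T(ceil(n/2)) */
    right = find_min_max(a, mid + 1, hi);  /* T(floor(n/2)) */
    /* combining the two halves costs exactly 2 comparisons */
    result.min = (left.min < right.min) ? left.min : right.min;
    result.max = (left.max > right.max) ? left.max : right.max;
    return result;
}

int main(void) {
    int a[] = {1000, 11, 445, 1, 330, 3000};
    struct minmax mm = find_min_max(a, 0, 5);
    printf("min = %d, max = %d\n", mm.min, mm.max);   /* min = 1, max = 3000 */
    return 0;
}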
If n is a power of 2, then we can write T(n) as:
T(n) = 2T(n/2) + 2
This is fairly self-explanatory.
T(n) = (3/2)n - 2
You can prove this with induction:
Base case: for n = 2: T(2) = 1 = (3/2)*2 - 2.
Induction hypothesis: assume T(k) = (3/2)k - 2 for each k < n. Then:
T(n) = 2T(n/2) + 2 =(*) 2*((3/2)*(n/2) - 2) + 2 = 3*(n/2) - 4 + 2 = (3/2)*n - 2
(*) by the induction hypothesis, which applies because n/2 < n.
Since the induction goes through, we can conclude: T(n) = (3/2)n - 2.
For linear search it makes sense that the run time is big O of N, since in the worst case it checks each element of the array one step at a time. As for bubble sort, my understanding is that its runtime is O(n^2); this makes sense to me because you iterate over the elements of the array and each time compare two values until the end of said array.
But merge sort always splits the data in half, so I'm confused about why its run time is n log n. Additionally, I want to check my understanding of insertion sort's runtime of big O of n^2. Since insertion sort looks for the smallest number and then compares it to every single number in the array, it would be n^2 because it loops through the array contents on every iteration.
If I could be given some advice about merge sort, and a general understanding of run times, that'd be appreciated. I am an absolute newbie and wanted to throw that disclaimer out there.
Let's assume that sorting an array of N elements takes T(N) time. In merge sort we know that we need to sort two arrays of N/2 elements (that is 2*T(N/2)) and then merge them (in O(N) time complexity, that is c*N for some constant c).
So, T(N) = 2T(N/2) + c*N.
We could stop here, as it is basically the "equation" you are asking about. But let's go a bit further.
To simplify things, we can show that T(N) = kN log N as follows (for some constant k):
Let's substitute T on both sides of the equation we have derived:
kN log N = 2 * k*(N/2) log (N/2) + c*N
and expand the right hand side (assuming log with base 2):
= k*N*(log N - log 2) + c*N = k*N*(log N - 1) + c*N = k*N*log N + (c - k)*N
That is, for c = k the equality holds, and it shows that T(N) is of the form k*N*log N, that is, O(N log N).
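Another way to see the same result (just a sketch, assuming N is a power of 2 and T(1) is a constant) is to unroll the recurrence level by level:

T(N) = 2T(N/2) + c*N
     = 4T(N/4) + c*N + c*N
     = 8T(N/8) + c*N + c*N + c*N
     ...
     = N*T(1) + c*N*log N

Each of the log N levels of the recursion contributes c*N work in total, which again gives O(N log N).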
I need to calculate the time complexity of the f3 function.
My problem is that I can't work out how many times I can apply the sqrt() function to n until it gets lower than 1: n^(0.5^k) < 1.
I can assume that the time complexity of sqrt() is O(1).
Any ideas how I can get the value of k out of n^(0.5^k) < 1? If I manage that, then I think evaluating the sum of the series n/2, (n^0.5)/2, (n^(0.5^2))/2, ... would be easier.
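The code for f3 and g3 isn't reproduced here, so the following C sketch is only a guess at their shape, reconstructed to be consistent with the analysis below (g3 doubles a counter up to n; f3 repeatedly replaces n with sqrt(n) and calls g3 on each pass):

#include <math.h>

/* Hypothetical reconstruction -- the original f3/g3 are not shown above. */
int g3(int n) {
    int count = 0;
    for (int i = 1; i < n; i *= 2)    /* i = 1, 2, 4, ... < n  ->  about log2(n) iterations */
        count++;
    return count;
}

int f3(int n) {
    int total = 0;
    while (n > 1) {
        total += g3(n);               /* O(log n) work per pass */
        n = (int)sqrt((double)n);     /* the problem size shrinks to sqrt(n) each pass */
    }
    return total;
}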
I will show the lower and upper bound.
First we compute the cost of g3.
Take for example, n = 2^16.
How many iterations do we make in the for loop?
i=2^0, i=2^1, i=2^2, i=2^3... < 2^16
More or less, that would be 16 steps. So the cost of g3 is O(log(n)).
Now let's try to compute f3. Since it uses g3 inside the loop, the cost goes as follows:
log(n) + log(n^(1/2)) + log(n^(1/4)) + log(n^(1/8)) + ...
That's for sure greater than log(n), so we could take log(n) as the lower bound.
Now, in order to compute the upper bound we have to think, how many iterations does the loop do?
Take again 2^16 as an example:
2^16, (2^16)^(1/2), (2^16)^(1/4), (2^16)^(1/8), (2^16)^(1/16),
That turns out to be:
2^16, 2^8, 2^4, 2^2, 2^1
And in the next iteration we would stop because sqrt(2) rounds to 1.
So in general, if n=2^2^k, we make k iterations. That's log(log(n)). That means we could say log(n)*log(log(n)) as the upper bound.
There is probably a tighter bound, but this should be pretty accurate.
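In fact, a sharper estimate follows from the same series: since log(n^(1/2^j)) = log(n)/2^j, the sum above is geometric,

log(n) + log(n)/2 + log(n)/4 + ... <= 2*log(n)

so the total work is Theta(log(n)), matching the lower bound.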
We use the regular quicksort algorithm. The pivot chosen is the median, but in order to find the median it takes Theta(n^(2006/2005)) time in the worst case.
Why is the worst case of the algorithm Theta(n^(2006/2005)) and not Theta(n^(2006/2005) * log n)?
First, you need to understand that each "iteration" does NOT take N^(2006/2005), where N is the size of the ORIGINAL array. In fact, since this is a superlinear function, finding the median in the 2 halves of the array is cheaper than finding it in the big array.
To prove it formally, we first define the recursive complexity formula (for simplicity we assume finding the median takes exactly n^(2006/2005); it is easy to modify this for an upper bound of C*n^(2006/2005)):
T(n) = n^(2006/2005) + 2T(n/2)
Now, we can show by induction that
T(n) <= D * n^(2006/2005)
for a suitable constant D. Note that 2*(n/2)^(2006/2005) = 2^(-1/2005) * n^(2006/2005), and 2^(-1/2005) < 1, so we can pick D = 1/(1 - 2^(-1/2005)), which satisfies 1 + D*2^(-1/2005) = D.
The base clause here is trivial for small enough values of n, since D is a large constant.
Assume for each k < n that T(k) <= D*k^(2006/2005) holds. Then:
T(n) = n^(2006/2005) + 2T(n/2) <= (i.h.)
<= n^(2006/2005) + 2*D*(n/2)^(2006/2005) =
= n^(2006/2005) + D * 2^(-1/2005) * n^(2006/2005) =
= (1 + D*2^(-1/2005)) * n^(2006/2005) =
= (*) D * n^(2006/2005)
The (*) equality holds by the choice of D. The constant D is huge, but it is still a constant, so T(n) = O(n^(2006/2005)); together with the trivial T(n) >= n^(2006/2005), this gives Theta(n^(2006/2005)).
Also note that this does not contradict the fact that comparison sorting is Omega(n log n), since n^(1+epsilon) > n*log n for every epsilon > 0 and large enough values of n.
I am facing some problems in my algorithms course.
How do I compute the time complexity of the following algorithm?
I tried putting a constant number instead of n and working out the complexity from that,
but I find myself getting very confused with Big O questions.
x = 0;
for (i = 0; i < n*n; i++)
    for (j = 0; j < i; j++)
        x = x + i;
I want to know the steps to solve the problem so I can solve such problems on my own.
The best thing to do is to take a pen and paper, run it for several values of n, and try to spot a pattern. Then, you do the following:
For n = 0, neither loop executes.
For n = 1, the outer loop executes 1 time.
For n = 2, the outer loop executes 4 times.
For n = 3, the outer loop executes 9 times.
"How many times does the outer loop execute?"
"n^2"
"How many times does the inner loop execute for a single value of i?"
"At most n^2 (when i is close to n^2)"
So you conclude that the time complexity is O(n^4).
The time complexity of your code in asymptotic big-O notation is O(n^4). The actual number of operations (executions of 'x = x + i') will be close to (n^4)/2. Let me break it down for you: the first loop executes exactly n^2 times, and for each of its iterations, the nested loop is iterated over i times, so in the worst case the second (inner) loop executes close to n^2 times. In total, it becomes O(n^4).
If the variables x, i and j are unsigned integral types, the answer is O(1), since both loops can execute at most a limited number of times. For example sqrt(UINT_MAX) times if the variables are unsigned ints.
If the variables are signed integral types, the code produces undefined behavior for n large enough to produce an overflow, making the question unanswerable.
If you treat the variables as idealised, then you can count exactly the number of times the x=x+i statement is executed.
Namely,
0 + 1 + 2 + 3 + ... + (n^2 - 1)
This is (n^2 - 1) * (n^2) / 2 or (n^4)/2 - (n^2)/2, which is O(n^4).
The formal steps to infer the time complexity of your algorithm above: the inner statement runs i times for each value of i, so the total count is the sum of i for i from 0 to n^2 - 1, which is (n^2 - 1)*(n^2)/2 = Theta(n^4).
Let A be an array of size N.
We call a pair of indexes (i, j) an "inversion" if i < j and A[i] > A[j].
I need to find an algorithm that receives an array of size N (with unique numbers) and returns the number of inversions in O(n*log(n)) time.
You can use the merge sort algorithm.
In the merge algorithm's loop, the left and right halves are both sorted ascendingly, and we want to merge them into a single sorted array. Note that all the elements in the right side have higher indexes than those in the left side.
Assume array[leftIndex] > array[rightIndex]. This means that all elements in the left part following the element with index leftIndex are also larger than the current element on the right side (because the left side is sorted ascendingly). So the current element on the right side generates numberOfElementsInTheLeftSide - leftIndex + 1 inversions (counting indexes from 1), so add this to your global inversion count.
Once the algorithm finishes executing you have your answer, and merge sort is O(n log n) in the worst case.
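Here is a minimal C sketch of that idea (my own code, not from the answer above; it uses 0-based, half-open ranges, so the per-element contribution is written as mid - i instead of the 1-based formula given above):

#include <stdio.h>
#include <stdlib.h>

/* Merge the sorted halves a[lo..mid) and a[mid..hi), counting inversions. */
static long long merge_count(int *a, int *tmp, int lo, int mid, int hi) {
    long long inv = 0;
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) {
            tmp[k++] = a[i++];
        } else {
            /* a[i] > a[j]: every remaining element of the left half
               (there are mid - i of them) is greater than a[j]. */
            inv += mid - i;
            tmp[k++] = a[j++];
        }
    }
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    for (k = lo; k < hi; k++) a[k] = tmp[k];
    return inv;
}

/* Standard merge sort that also returns the number of inversions. */
static long long sort_count(int *a, int *tmp, int lo, int hi) {
    if (hi - lo < 2) return 0;
    int mid = lo + (hi - lo) / 2;
    return sort_count(a, tmp, lo, mid)
         + sort_count(a, tmp, mid, hi)
         + merge_count(a, tmp, lo, mid, hi);
}

int main(void) {
    int a[] = {3, 4, 1, 2};
    int n = 4;
    int *tmp = malloc(n * sizeof *tmp);
    printf("%lld\n", sort_count(a, tmp, 0, n));   /* prints 4: (3,1), (3,2), (4,1), (4,2) */
    free(tmp);
    return 0;
}

The only addition to plain merge sort is the inv += mid - i line, so the whole thing stays O(n log n).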
There is a paper published in SODA 2010 by Chan and Patrascu entitled Counting Inversions, Offline Orthogonal Range Counting, and Related Problems that gives an algorithm taking O(n sqrt(log(n))) time. This is currently the best known algorithm, and improves the long-standing O(n log(n) / log(log(n))) algorithm. From the abstract:
We give an O(n sqrt(lg n))-time algorithm for counting the number of inversions in a permutation on n elements. This improves a long-standing previous bound of O(n lg n / lg lg n) that followed from Dietz's data structure [WADS'89], and answers a question of Andersson and Petersson [SODA'95]. As Dietz's result is known to be optimal for the related dynamic rank problem, our result demonstrates a significant improvement in the offline setting. Our new technique is quite simple: we perform a "vertical partitioning" of a trie (akin to van Emde Boas trees), and use ideas from external memory. However, the technique finds numerous applications: for example, we obtain in d dimensions, an algorithm to answer n offline orthogonal range counting queries in time O(n lg^(d-2+1/d) n); an improved construction time for online data structures for orthogonal range counting; an improved update time for the partial sums problem; faster Word RAM algorithms for finding the maximum depth in an arrangement of axis-aligned rectangles, and for the slope selection problem. As a bonus, we also give a simple (1 + ε)-approximation algorithm for counting inversions that runs in linear time, improving the previous O(n lg lg n) bound by Andersson and Petersson.
I think the awesomest way to do this (and that's just because I love the data structure) is to use a binary indexed tree. Mind you, if all you need is a solution, merge sort would work just as well (I just think this concept totally rocks!). The basic idea is this: build a data structure which updates values in O(log n) and answers the query "how many numbers less than x have already occurred in the array so far?" Given this, you can easily answer how many are greater than x, which contributes to inversions with x as the second number in the pair. For example, consider the list {3, 4, 1, 2}.
When processing 3, there are no other numbers so far, so inversions with 3 on the right side = 0.
When processing 4, the number of numbers less than 4 so far = 1, thus the number of greater numbers (and hence inversions) = 0.
Now, when processing 1, the number of numbers less than 1 = 0, thus the number of greater numbers = 2, which contributes two inversions: (3,1) and (4,1). The same logic applies to 2, which finds 1 number less than it and hence 2 greater than it.
Now, the only question is to understand how these updates and queries happen in O(log n). The URL mentioned above is one of the best tutorials I've read on the subject.
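Here is a minimal C sketch of the approach (my own code; it assumes the values are positive integers no larger than MAXV; for arbitrary values you would first compress them to ranks):

#include <stdio.h>

#define MAXV 100                    /* values assumed to lie in 1..MAXV for this sketch */

static int tree[MAXV + 1];          /* Fenwick (binary indexed) tree over the values */

/* record one more occurrence of value v -- O(log MAXV) */
static void update(int v) {
    for (; v <= MAXV; v += v & (-v))
        tree[v]++;
}

/* how many recorded values are <= v -- O(log MAXV) */
static int query(int v) {
    int s = 0;
    for (; v > 0; v -= v & (-v))
        s += tree[v];
    return s;
}

int main(void) {
    int a[] = {3, 4, 1, 2};
    int n = 4;
    long long inversions = 0;
    for (int k = 0; k < n; k++) {
        /* numbers seen so far that are greater than a[k]
           form inversions with a[k] as the right element */
        inversions += k - query(a[k]);
        update(a[k]);
    }
    printf("%lld\n", inversions);   /* prints 4 */
    return 0;
}

Both update and query touch at most log2(MAXV) tree nodes, so the whole pass is O(n log MAXV).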
These are the original MERGE and MERGE-SORT algorithms
from Cormen, Leiserson, Rivest, Stein Introduction to Algorithms:
MERGE(A,p,q,r)
1  n1 = q - p + 1
2  n2 = r - q
3  let L[1..n1 + 1] and R[1..n2 + 1] be new arrays
4  for i = 1 to n1
5      L[i] = A[p + i - 1]
6  for j = 1 to n2
7      R[j] = A[q + j]
8  L[n1 + 1] = infinity
9  R[n2 + 1] = infinity
10 i = 1
11 j = 1
12 for k = p to r
13     if L[i] <= R[j]
14         A[k] = L[i]
15         i = i + 1
16     else A[k] = R[j]
17         j = j + 1
and
MERGE-SORT(A,p,r)
1 if p < r
2     q = floor((p + r)/2)
3     MERGE-SORT(A,p,q)
4     MERGE-SORT(A,q + 1,r)
5     MERGE(A,p,q,r)
At lines 8 and 9 in MERGE, infinity is the so-called sentinel card, which has such a value that all array elements are smaller than it.
To get the number of inversions, one can introduce a global counter, let's say ninv, initialized to zero before calling MERGE-SORT, and then modify the MERGE algorithm by adding one line in the else branch after line 16, namely
ninv = ninv + n1 - i + 1
because when L[i] > R[j], the remaining real elements of the left half, L[i..n1], are all greater than R[j] (and when i already points at the sentinel, this adds 0). Then, after MERGE-SORT is finished, ninv will hold the number of inversions.
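With that change, sketched in the same pseudocode style, the else branch of MERGE becomes:

16 else A[k] = R[j]
17     ninv = ninv + n1 - i + 1
18     j = j + 1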