Finding the time complexity of code - arrays

Given is an infinite sorted array containing only numbers 0 and 1. Find the transition point efficiently.
For example: 00000000000111111111111111
Output: 11, which is the index where the transition occurs
I have coded a solution for this ignoring some edge cases.
int findTransition(int start)
{
    int i;
    if (a[start] == 1) return start;
    /* exponential search: keep doubling the offset until we land on a 1 */
    for (i = 1; ; i *= 2)
    {
        // assume that this condition will be true for some index
        if (a[start + i] == 1) break;
    }
    if (i == 1) return start + 1;
    /* the transition lies in (start + i/2, start + i]; restart from start + i/2 */
    return findTransition(start + (i / 2));
}
I am not really sure about the time complexity of this solution. Can someone please help me figure it out?
Is it O(log(N))?

Let n be the position of the transition point.
This block
for (i = 1; ; i *= 2)
{
    // assume that this condition will be true for some index
    if (a[start + i] == 1) break;
}
runs for about log2(n) iterations.
So we have
T(n) = log2(n) + T(n/2)
T(n) = log2(n) + log2(n/2) + T(n/4) = log2(n) + (log2(n) - 1) + (log2(n) - 2) + ...
T(n) = log2(n) * (log2(n) + 1) / 2
So the worst-case complexity is O(log(n)^2).
Note: you can run a usual binary search over the bracketed range instead of the recursive call; then the total work is log2(n) + log2(n/2), which is O(log(n)) guaranteed. A sketch of that variant follows.
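For completeness, here is a minimal sketch of that O(log n) variant in C: exponential search to bracket the transition, then a plain binary search inside the bracket. The global array a[] and the sample values are only assumptions for illustration (the original problem has an infinite array).

#include <stdio.h>

/* Sample data standing in for the "infinite" array: 11 zeros, then ones. */
int a[] = {0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1};

int findTransition(int start)
{
    if (a[start] == 1) return start;

    /* Exponential search: double the step until we land on a 1. */
    int i = 1;
    while (a[start + i] == 0)
        i *= 2;

    /* The transition lies in (start + i/2, start + i]; binary search it. */
    int lo = start + i / 2 + 1, hi = start + i;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == 1) hi = mid;
        else             lo = mid + 1;
    }
    return lo;
}

int main(void)
{
    printf("%d\n", findTransition(0));  /* prints 11 for the sample array */
    return 0;
}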

Related

Time complexity of finding index equal to array (sorted) value

This kind of recursion is similar to a binary search, but I'm not sure how exactly to solve the recurrence using back substitution.
To find the index where it is equal to the array value (in a sorted array), the code would basically look like this:
find(array, low, high) {
    if high < low
        return -1
    mid = (low + high) / 2
    midval = array[mid]
    if midval == mid
        return mid
    int left = find(array, low, mid - 1)
    if left >= 0
        return left
    int right = find(array, mid + 1, high)
    return right
}
So the recurrence relation would look like this:
T(1) = b
T(n) = 2T(n/2) + c
     = 4T(n/4) + c(1 + 2)
     = 8T(n/8) + c(1 + 2 + 4)
     = 16T(n/16) + c(1 + 2 + 4 + 8)
     = 2^k T(n/2^k) + (2^k - 1)c
     = 2^(log n) T(1) + (2^(log n) - 1)c
     = 2^(log n)(b + c) - c = n(b + c) - c
I know the time complexity is supposed to be something like O(log n) or O(n log n), but I'm not sure how to get there from this using back substitution.
With a sorted array, finding an element with a naive implementation is at worst O(n). Hence a better approach should have a worst-case complexity lower than O(n), so it cannot be O(n log n).
In a typical binary search, one takes advantage of the array being sorted, so one does not need to search both halves on each recursive call: one goes either left or right. So instead of T(n) = 2T(n/2) + c one has T(n) = T(n/2) + c.
Now your problem is different from a binary search, because you want to find a position in the array whose value matches its index. So, unlike binary search, in some recursive calls you might have to go both right and left.
So in your case the worst-case scenario is actually O(n), since 2^(log2 n) is n. Unless there is a cleverer way of improving your code, I would just go for a normal search: simpler and more readable code with the same worst case of O(n).
You search from the beginning of the array; if the value x matches the index, you return it. Otherwise, if x is greater than the current index, you can jump straight to index x (i.e., to array[x]): because the array is sorted, the positions you skip cannot have an index matching their value, as shown in the sketch below.
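Here is a minimal sketch of that skip-ahead search, assuming the array holds sorted integers (the jump is only valid because an integer value larger than the current index rules out every index up to that value); the function name findFixedPoint and the sample data are just for illustration.

#include <stdio.h>

/* Returns an index i with array[i] == i, or -1 if there is none. */
int findFixedPoint(const int *array, int n)
{
    int i = 0;
    while (i < n) {
        if (array[i] == i)
            return i;
        if (array[i] > i)
            i = array[i];   /* no index in [i, array[i]) can match, so jump */
        else
            i++;            /* array[i] < i: just step forward */
    }
    return -1;
}

int main(void)
{
    int a[] = {-5, -1, 0, 3, 7, 9, 12};
    printf("%d\n", findFixedPoint(a, 7));  /* prints 3 */
    return 0;
}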

Optimal Algorithm for finding peak element in an array

So far I haven't found any algorithm that solves this task: "An element is
considered as a peak if and only if (A[i]>A[i+1])&&(A[i]>A[i-1]), not
taking into account edges of the array(1D)."
I know that the common approach for this problem is "divide & conquer", but that is for the case where the edges of the array are also considered as peaks.
The O(..) complexity I need to get for this exercise is O(log(n)).
It is clear to me why the edge-inclusive version is O(log(n)), but without the edges the complexity changes to O(n), because the recursive function may have to be run on both sides of the middle element, which makes it O(n) in the worst case (when the peak is near an edge). In that case, why not use a simple linear scan like this:
public static int GetPeak(int[] A)
{
    if (A.length <= 2) // too short for the peak definition to apply
    {
        return -1;
    }
    else
    {
        int Element = Integer.MAX_VALUE; // the element determined to be a peak
        // the first and last elements can't be peaks
        for (int i = 1; i < A.length - 1; i++)
        {
            if (A[i] > A[i + 1] && A[i] > A[i - 1])
            {
                Element = A[i];
                break;
            }
            else
            {
                Element = -1;
            }
        }
        return Element;
    }
}
The common algorithm is described here: http://courses.csail.mit.edu/6.006/spring11/lectures/lec02.pdf, but as I said it does not apply under the terms of this exercise.
The function should return one peak, or -1 if there is none.
Also, my apologies if the post is worded incorrectly due to the language barrier (I am not a native English speaker).
I think what you're looking for is a dynamic programming approach, utilizing divide-and-conquer. Essentially, you would have a default value for your peak which you would overwrite when you found one. If you could check at the beginning of your method and only run operations if you hadn't found a peak yet, then your O() notation would look something like O(pn), where p is the probability that any given element of your array is a peak; that is a variable term that depends on how your data is structured (or not). For instance, if your array only has values between 1 and 5, distributed uniformly and independently, then the probability works out to 0.24, so you would expect the algorithm to run in O(0.24n). Note that this is still equivalent to O(n). However, if you require the values in your array to be unique, then the probability is equal to:
p = 2 * sum( [ choose(x - 1, 2) for x in 3:n ] ) / choose(n, 3)
p = 2 * sum( [ ((x - 1)! / (2 * (x - 3)!)) for x in 3:n ] ) / (n! / (n - 3)!)
p = sum( [ (x - 1) * (x - 2) for x in 3:n ] ) / (n * (n - 1) * (n - 2))
p = ((n * (n + 1) * (2 * n + 1)) / 6 - (n * (n + 1)) + 2 * n - 8) / (n * (n - 1) * (n - 2))
p = ((1 / 3) * n^3 - 5.5 * n^2 + 6.5 * n - 8) / (n * (n - 1) * (n - 2))
This looks like a lot, but if we take the limit as n approaches infinity we end up with a value for p near 1/3.
So if any element of the array has roughly a 1/3 chance of being a peak, then at the bottom level of your recursion you have a 1/3 probability of finding one, and the expected number of comparisons before you do is around 3, i.e. constant time. However, you still have to get to the bottom level of the recursion, and that requires O(log(n)) time. So a divide-and-conquer approach should run in O(log(n)) time in the average case, with O(n log(n)) in the worst case.
If you cannot make any assumptions about your data (monotonicity of the sequence, number of peaks), and if edges cannot count as peaks, then you cannot hope for a better average performance than O(n). Your data is randomly distributed, any value can be a peak, and there is no correlation between the values, so you have to examine them one by one.
Accepting edges as potential candidates for peaks changes everything: you know there will always be at least one peak, and a good enough strategy is to always search in the direction of increasing values until you start to go down or reach an edge (this is the one in the document you provided). With binary search that strategy is O(log(n)), as sketched below.
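For reference, here is a minimal sketch of that edge-inclusive divide-and-conquer peak finder (the function name and sample data are my own additions); it is O(log(n)) only because, under that definition, one half of the range can always be discarded.

#include <stdio.h>

/* Divide-and-conquer peak finder in which array edges ARE allowed to count
 * as peaks, so a peak always exists and one half can be discarded per step.
 * With the stricter definition in this question (edges excluded) no such
 * guarantee holds. */
int findPeakWithEdges(const int *A, int n)
{
    int lo = 0, hi = n - 1;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (A[mid] < A[mid + 1])
            lo = mid + 1;   /* a peak must exist on the rising side */
        else
            hi = mid;       /* A[mid] >= A[mid+1]: a peak exists in [lo, mid] */
    }
    return lo;              /* index of some peak (possibly an edge) */
}

int main(void)
{
    int A[] = {1, 3, 5, 4, 2};
    printf("%d\n", findPeakWithEdges(A, 5));  /* prints 2 (value 5) */
    return 0;
}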

if else recursion worst time complexity

I am having some trouble figuring out the worst-case time complexity of the code below.
(This is not a homework, see https://leetcode.com/problems/integer-replacement/description/.)
int recursion(int n) {
    if (n == 1)
        return 0;
    if (n % 2 == 0) {
        return recursion(n / 2) + 1;
    } else {
        return min(recursion(n / 2 + 1) + 1, recursion(n - 1)) + 1;
    }
}
The only thing I know is that when N == 2^k (k > 0), the worst-case time complexity is O(log N).
However, I am unclear about the case where N is not 2^k, because an even number divided by 2 can still be odd. Some people say it is still O(log N), but I am not convinced.
I know the code is not the best solution; I just want to analyze its time complexity. I tried a recursion tree and aggregate analysis, but they did not seem to help.
If n is even, we know that T(n) = T(n/2) + 1, and if n is odd we know that T(n) = T(n/2 + 1) + T(n-1) + 1. In the latter case, since n is odd, n-1 must be even. If n/2 + 1 is even, T(n) = T(n/4) + T(n/2) + 3, and if n/2 + 1 is odd, T(n) = 2*T(n/4) + T(n/2) + 3.
From the above discussion, in the worst case T(n) is defined in terms of T(n/2) and T(n/4). By the Akra-Bazzi theorem we can say T(n) = O(n^((log(1+sqrt(5)) - log(2)) / log(2))) ~ O(n^0.69) from the first case, and T(n) = O(n) from the second case (where n/2 + 1 is odd).
However, for a tighter bound we would have to scrutinize the analysis further.
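If you want to sanity-check the asymptotics empirically, a small sketch like the following can help (the call counter and the sample inputs are additions of mine): it counts the recursive calls for a few odd inputs, and the counts should grow far more slowly than linearly in n.

#include <stdio.h>

static long calls;   /* number of recursive calls made so far */

int recursion(int n)
{
    calls++;
    if (n == 1)
        return 0;
    if (n % 2 == 0)
        return recursion(n / 2) + 1;
    int a = recursion(n / 2 + 1) + 1;
    int b = recursion(n - 1);
    return (a < b ? a : b) + 1;   /* same as min(...) + 1 in the question */
}

int main(void)
{
    for (int n = 1 << 10; n <= 1 << 20; n <<= 2) {
        calls = 0;
        recursion(n - 1);                 /* n - 1 keeps the input odd */
        printf("n = %8d  calls = %ld\n", n - 1, calls);
    }
    return 0;
}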

time complexity of randomized array insertion

So I had to insert N elements in random order into a size-N array, but I am not sure about the time complexity of the program.
The program is basically:
for (i = 0 -> n-1) {
    index = random(0, n);   // n is exclusive
    while (array[index] != null)
        index = random(0, n);
    array[index] = n
}
Here is my assumption: a normal insertion of N numbers of course takes exactly N steps, but how much will the collisions at random positions cost? For each insertion, the collision rate increases like 0, 1/n, 2/n, ..., (n-1)/n, so I expected the number of insertion attempts to grow like 1, 2, 3, ..., n-1, which is O(n) per insertion, giving a total time complexity of O(n^2). Is that the average cost? That seems really bad; am I right?
And what would happen if I did a linear search for an empty slot instead of repeatedly generating random numbers? Its worst case is obviously O(n^2), but I don't know how to analyze its average case, which depends on the average input distribution?
First consider the inner loop. When do we expect to have our first success (find an open position) when there are i values already in the array? For this we use the geometric distribution:
Pr(X = k) = (1-p)^{k-1} p
Where p is the probability of success for an attempt.
Here p is the probability that the array index is not already filled.
There are i filled positions so p = (1 - (i/n)) = ((n - i)/n).
From the wiki, the expectation for the geometric distribution is 1/p = 1 / ((n-i)/n) = n/(n-i).
Therefore, we should expect to make (n / (n - i)) attempts in the inner loop when there are i items in the array.
To fill the array, we insert a new value when the array has i = 0..n-1 items in it. The number of attempts we expect to make overall is the sum:
sum_{i=0,n-1} n/(n-i)
    = n * sum_{i=0,n-1} 1/(n-i)
    = n * (1/n + 1/(n-1) + ... + 1/1)
    = n * (1/1 + ... + 1/(n-1) + 1/n)
    = n * sum_{i=1,n} 1/i
which is n times the nth harmonic number; the harmonic number is approximately ln(n) + gamma, where gamma is a constant. So overall, the expected number of attempts is approximately n * (ln(n) + gamma), which is O(n log n). Remember that this is only the expectation; there is no true upper bound, since the inner loop is random and may in principle never find an open spot.
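A quick way to sanity-check this is to simulate the loop and count the random draws. In the rough C sketch below (the sample size, the seed, and the use of rand() are arbitrary choices of mine) the count should track n * H_n.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    int n = 10000;
    char *used = calloc(n, 1);   /* 0 = slot still empty */
    long attempts = 0;

    srand(42);
    for (int i = 0; i < n; i++) {
        int index = rand() % n;  /* slight modulo bias; fine for a sketch */
        attempts++;
        while (used[index]) {    /* keep drawing until we hit an open slot */
            index = rand() % n;
            attempts++;
        }
        used[index] = 1;
    }
    /* expected value is n * H_n, roughly n * (ln n + 0.577) */
    printf("attempts = %ld, n*H_n ~ %.0f\n", attempts, n * (log(n) + 0.577));
    free(used);
    return 0;
}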
The expected number of insertion attempts at step i is
sum_{t=0}^infinity (i/n)^t * ((n-i)/n) * t
    = ((n-i)/n) * (i/n) * (1 - i/n)^{-2}
    = i/(n-i)
Summing over i you get
sum_{i=0}^{n-1} i/(n-i)
    >= sum_{i=n/2}^{n-1} i/(n-i)
    >= (n/2) * sum_{x=1}^{n/2} 1/x
    >= (n/2) * log(n) + O(n)
and
sum_{i=0}^{n-1} i/(n-i)
    <= n * sum_{x=1}^{n} 1/x
    <= n * log(n) + O(n)
So you get n*log(n), up to constant factors, as the asymptotic complexity. Which is not as bad as you feared.
About doing a linear search, I don't know how you would do it while keeping the array random. If you really want an efficient algorithm to shuffle your array, you should check out Fisher-Yates shuffle.
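For reference, here is a minimal Fisher-Yates sketch (rand() % (i + 1) is slightly biased, but fine for illustration): it fills the array with 0..n-1 and then shuffles it in place with exactly n-1 swaps, so the whole thing is O(n).

#include <stdio.h>
#include <stdlib.h>

/* Fill array with 0..n-1, then shuffle in place (Fisher-Yates). */
void fisherYates(int *array, int n)
{
    for (int i = 0; i < n; i++)
        array[i] = i;
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);     /* index in [0, i] */
        int tmp = array[i];
        array[i] = array[j];
        array[j] = tmp;
    }
}

int main(void)
{
    int a[10];
    srand(42);
    fisherYates(a, 10);
    for (int i = 0; i < 10; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}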

Space complexity of a given recursive program

Consider the following C-function:
double foo(int n) {
    int i;
    double sum;
    if (n == 0) return 1.0;
    else {
        sum = 0.0;
        for (i = 0; i < n; i++)
            sum += foo(i);
        return sum;
    }
}
The space complexity of the above function is
1) O(1)
2) O(n)
3) O(n!)
4) O(n^n)
In the above question, according to me, the answer should be (2), but the given answer is option (3). Although it is a recursive function, the stack will never hold more than O(n) frames. Can anyone explain why the answer is (3) and where I am going wrong?
If you needed the time complexity, then it is certainly not O(N!) as many suggest, but far less than that: it is O(2^N).
Proof:
T(N) = T(N-1) + T(N-2) + T(N-3) + T(N-4) + ... + T(1)
Moreover, by the same formula,
T(N-1) = T(N-2) + T(N-3) + ... + T(1)
hence T(N) = T(N-1) + T(N-1) = 2*T(N-1).
Solving the above gives T(N) = O(2^N).
If instead you need the space complexity: for a recursive function, space is the maximum stack space occupied at any moment, and in this case that cannot exceed O(N).
In any case the answer is not O(N!), because that many computations are not done at all, so how could the stack occupy that much space?
Note: try running the function for n = 20. If the textbook answer were right it would take on the order of 20! operations, which is larger than anything you could ever run, but I think it will finish in about 2^20 steps without any stack overflow.
Space complexity is O(N): at any given time the space used is limited to
N * (the size of one function call frame, which is a constant).
Think of it like this:
To calculate foo(n), the program has to calculate foo(0) + foo(1) + foo(2) + ... + foo(n-1).
Similarly, for foo(n-1) the program has to recursively calculate foo(0) + foo(1) + ... + foo(n-2).
But these calls are made one after another, so the only thing on the stack at any moment is a single chain of nested calls, foo(n) -> foo(n-1) -> ... -> foo(0), which is O(n) frames.
Hope this is clear.
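To make the distinction concrete, here is a small instrumented sketch (the counters and the driver loop are additions of mine): the total number of calls should grow like 2^n, while the deepest recursion, i.e. the stack usage, stays proportional to n.

#include <stdio.h>

static long calls;              /* total number of calls made */
static int depth, maxDepth;     /* current and maximum recursion depth */

double foo(int n)
{
    calls++;
    if (++depth > maxDepth) maxDepth = depth;

    double sum = 1.0;           /* value returned for n == 0 */
    if (n != 0) {
        sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += foo(i);
    }
    depth--;
    return sum;
}

int main(void)
{
    for (int n = 5; n <= 20; n += 5) {
        calls = 0;
        maxDepth = 0;
        foo(n);
        printf("n = %2d  calls = %8ld  max depth = %d\n", n, calls, maxDepth);
    }
    return 0;
}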
