I made a function that loops through the array and prints any two values of the array that add up to a value K. The outer for loop is O(n), but I'm confused about whether the inner loop is O(log n) or O(n). Can you help please? Thank you!!
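#include <stdio.h>

/* Prints every pair array[i] + array[j] == key with i < j.
   Returns 1 if at least one pair was printed, 0 otherwise. */
int canMakeSum(int *array, int n, int key){
    int i, j, found = 0;
    for(i = 0; i < n; i++){
        for(j = (i+1); j < n; j++){
            if(array[i]+array[j] == key){
                printf("%d + %d = %d\n", array[i], array[j], key);
                found = 1;
            }
        }
    }
    return found;
}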
As others have already shown, the inner loop is still O(n); it averages about n/2 iterations, since its iteration counts (n-1 down to 0) are spread evenly over the passes of the outer loop.
Yes, you can solve the problem in O(n log n).
First, sort the array; this is n log n. Now, you have a linear (O(n)) process to find all combinations.
lo = 0
hi = n-1
while lo < hi {
    sum = array[lo] + array[hi]
    if sum == k {
        print "Success", array[lo], array[hi]
        lo += 1
        hi -= 1
    }
    else if sum < k {    // need to increase total
        lo += 1
    }
    else {               // need to decrease total
        hi -= 1
    }
}
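The pseudocode above can be sketched in C as follows. This is a minimal sketch, not the answerer's own code: the names print_pairs_sorted and cmp_int are hypothetical, and the function sorts the caller's array in place with the standard qsort. It also assumes the behaviour of the pseudocode as written, so when values repeat it reports each element at most once rather than every index pair the brute-force version would print.

```c
#include <stdio.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts array in place (O(n log n)), then walks two indices inward (O(n)).
   Prints each matching pair and returns how many pairs were printed.
   print_pairs_sorted is a hypothetical name for this sketch. */
int print_pairs_sorted(int *array, int n, int key)
{
    int lo = 0;
    int hi = n - 1;
    int found = 0;

    qsort(array, (size_t)n, sizeof array[0], cmp_int);

    while (lo < hi) {
        long sum = (long)array[lo] + array[hi];
        if (sum == key) {
            printf("%d + %d = %d\n", array[lo], array[hi], key);
            found++;
            lo++;
            hi--;
        } else if (sum < key) {
            lo++;               /* need to increase the total */
        } else {
            hi--;               /* need to decrease the total */
        }
    }
    return found;
}
```

Each pass of the while loop moves lo or hi by at least one step, so the scan itself does at most n - 1 passes; the sort dominates the running time.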
Because the inner loop depends on the value of the outer loop, you can't find the complexity of the whole program without analyzing the two together. The inner loop runs n - i - 1 times.
To compute the complexity of the program, sum n - i - 1 from i = 0 to i = n - 1. Hence, the total complexity is T(n) = (n - 1) + (n-2) + ... + 1 + 0 = (n-1)n/2 = \Theta(n^2) (as the statement in the inner loop has constant complexity (\Theta(1))).
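The sum can be checked empirically. Below is a minimal sketch, assuming two hypothetical helpers: count_inner_iterations mirrors the two loops of canMakeSum but only counts how often the inner body runs, and closed_form evaluates (n-1)n/2.

```c
/* Counts how many times the inner loop body of canMakeSum runs
   for an array of length n. Hypothetical helper for illustration. */
long count_inner_iterations(int n)
{
    long count = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            count++;            /* stands in for the O(1) comparison */
        }
    }
    return count;
}

/* Closed form of the sum: (n - 1) * n / 2. */
long closed_form(int n)
{
    return (long)(n - 1) * n / 2;
}
```

For n = 5 both give 10 (4 + 3 + 2 + 1 + 0), and the two agree for any non-negative n.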
Although the inner loop decreases in the number of items it scans for each iteration in the outer loop, it would still be O(n). The overall time complexity is O(n^2).
Imagine you have an array of 25000 elements. Starting at i = 0 and j = 1, the number of elements that j will iterate through (worst case, no matches to key) is 24999. That is a small difference from the total number of elements, so it is 'like' going through n elements.
Related
The question is to find a pair of integers (a,b) from a set M of unsigned integers such that a-b is a multiple of n, where n is a given positive integer less than the length (m) of set M.
Here is the snippet I have written.
I am not too sure about the time complexity of this algorithm w.r.t. the length of M and the value of n. In the exclude function, the worst case is O(m). It is called within a for loop over m, so O(m^2). In addition, the X initialization scales with n, so O(n) here. In total: O(m^2) + O(n), ignoring the other O(1)s. Is this correct?
Also, should I take r = x % n as O(1)?
Any coding-related advice on the code here is welcome!!! Big thx!
//array X is initialized with size n, all -1. Here the code is omitted.
for (int i = 0; i < m; i++)
{
if (currentLength > 1)
{
index = rand() % currentLength;
x = setM[index];
exclude(setM, index, &currentLength);
r = x % n;
if (X[r] == -1)
{
X[r] = x;
}
else
{
printf("The pair: (%i, %i)\n", X[r], x);
break;
}
}
else
{
break;
}
currentLength -= 1;
}
// to exclude an element based on index, then shift all elements behind by 1 slot
void exclude(int* array, int index, int* length_ptr)
{
if (index != *length_ptr - 1)
{
for (int i = index; i < *length_ptr - 1; i++)
{
array[i] = array[i + 1];
}
}
}
Also, should I take r = x % n as O(1)?
Yes, it's O(1)
I am not too sure about the time complexity of this algorithm ... In total: O(m^2) + O(n)?
Well, kind of, but there is more to it than that. The thing is that m and n are not independent.
Consider the case n = 2 and let m be increasing. Your formula would give O(m^2) but is that correct? No. Since there will only be 2 possible results from % n (i.e. 0 and 1) the loop for (int i = 0; i < m; i++) can only run 3 times before we have a match. No matter how much you increase m there can never be more than 3 loops. In each of these loops the exclude function may move near m elements in worst case. In general the for (int i = 0; i < m; i++) can never do more than n+1 loops.
So for m being larger than n you rather have O(n*m) + O(n). When keeping n constant this turns into just O(m). So your algorithm is just O(m) with respect to m.
Now consider the case with a constant m and a large increasing n. In this case your formula gives O(m^2) + O(n). Since m is constant O(m^2) is also constant so your algorithm is just O(n) with respect to n.
Now if you increase both m and n your formula gives O(m^2) + O(n). But since m and n are both increased, O(m^2) will eventually dominate O(n), so we can ignore O(n). In other words, your algorithm is O(m^2) with respect to both.
To recap:
O(m) for constant n and increasing m
O(n) for constant m and increasing n
O(m^2) for increasing n and increasing m
Any coding related advices on the codes here are welcome
Well, this index = rand() % currentLength; is just a bad idea!
You should always test the last element in the array, i.e. index = currentLength - 1;
Why? Simply because that will turn exclude into O(1). In fact you won't even need it! The exclude will happen automatically when doing currentLength -= 1;
This change will improve the complexity to:
O(1) for constant n and increasing m
O(n) for constant m and increasing n
O(m)+O(n) for increasing n and increasing m
The O(m)+O(n) can be said to be just O(m) (or just O(n)) as you prefer. The main thing is that it is linear.
Besides that you don't need currentLength. Change the main loop to be
for (int i = m-1; i >= 0; --i)
and use i as index. This simplifies your code to:
for (int i = m-1; i >= 0; --i)
{
r = setM[i] % n;
if (X[r] == -1)
{
X[r] = setM[i];
}
else
{
printf("The pair: (%i, %i)\n", X[r], setM[i]);
break;
}
}
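For completeness, here is a self-contained sketch of the simplified loop, including the table initialization that the question omits. The function name find_pair, its out-parameters, and the seen table are hypothetical. One deliberate difference from the snippet above: instead of storing values in X with -1 as the "empty" marker (which is awkward for unsigned data), the table stores index + 1, with 0 meaning empty. The sketch assumes n >= 1.

```c
#include <stdlib.h>

/* Looks for two elements of setM (length m) whose difference is a
   multiple of n, i.e. that share the same remainder modulo n.
   On success stores them in *a and *b and returns 1.
   Returns 0 if no pair exists (or if the table cannot be allocated).
   Assumes n >= 1. find_pair is a hypothetical name for this sketch. */
int find_pair(const unsigned *setM, int m, unsigned n,
              unsigned *a, unsigned *b)
{
    /* seen[r] == 0 means no element with remainder r yet;
       otherwise it holds (index of that element) + 1. */
    int *seen = calloc(n, sizeof *seen);   /* O(n) initialization */
    int found = 0;

    if (seen == NULL) {
        return 0;
    }

    /* At most n + 1 passes can run before two remainders collide. */
    for (int i = m - 1; i >= 0 && !found; --i) {
        unsigned r = setM[i] % n;
        if (seen[r] == 0) {
            seen[r] = i + 1;
        } else {
            *a = setM[seen[r] - 1];
            *b = setM[i];
            found = 1;
        }
    }

    free(seen);
    return found;
}
```

With setM = {10, 3, 7, 5} and n = 3, the scan from the end records remainders 2, 1, 0 and then hits remainder 1 again at the value 10, so it reports the pair (7, 10).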
You need to find two numbers x and y such that x%n==y%n.
It's easy. Use a hash table keyed by x % n. Add consecutive numbers from the set until you find a duplicate key; the two numbers sharing that key are the desired pair. Complexity is O(M).
int dup_chk(int a[], int length)
{
int i = length;
while (i > 0)
{
i--;
int j = i -1;
while (j >= 0)
{
if (a[i] == a[j])
{
return 1;
}
j--;
}
}
return 0;
}
So what I think I know is the following:
line 1 is just 1.
First while loop is N+1.
i--; is N times since its inside the first while loop.
j = i -1; is also N.
Second while loop is (N+1)N = N^2+N since its a while loop within a while loop
if statement: ???
j--; is N(N) = N^2
return 0; is 1
I'm really new to calculating the time complexity of algorithms so I'm not even sure if what I think I know is completely right.
But what is messing with me is the if statement, I do not know how to calculate that (and what if there is an else after it as well?)
EDIT: The grand total is equal to 3/2N^2 + 5/2N+3
I understand that this function is O(N^2) but don't quite get how the grand total was calculated.
Usually such accurate analysis of time complexity is not required. It suffices to know it in terms of Big-O. However, I did some calculations for my own curiosity.
If your concern is just a worst case analysis to obtain the time complexity, consider an array with only unique elements. In such a scenario:
The return 1 statement never executes. The inner while loop executes N(N-1)/2 times (summation i-1 from 1 to N), and three things happen - the while condition is checked (and evaluates to true), the if condition is checked (and evaluates to false) and the variable j is decremented. Therefore, the number of operations is 3N(N-1)/2.
The outer while loop executes N times, and there are three statements apart from the condition check - i is decremented, j is assigned, and the inner while condition fails N times. That is 4N more operations.
Outside all loops, there are three more statements. Initialisation of i, the while condition fails once, and then the return statement. Add 3 more to our tally.
3/2N^2 - 3/2N + 4N + 3.
That's 3/2N^2 + 5/2N + 3. There is your 'grand total'.
To repeat myself, this calculation is completely unnecessary for all practical purposes.
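If you want to convince yourself of the tally anyway, one way is to instrument the function. The sketch below is an assumption-laden illustration, not part of the original code: count_dup_chk_ops and grand_total are hypothetical names, and the function returns the number of counted operations instead of the 0/1 result. It counts exactly the steps listed above: each condition check, each decrement or assignment, the if comparison, and the return.

```c
/* dup_chk from the question with an operation counter added.
   Returns the number of counted operations, not the 0/1 result. */
long count_dup_chk_ops(const int a[], int length)
{
    long ops = 0;
    int i = length;
    ops++;                          /* initialisation of i */
    while (ops++, i > 0)            /* outer condition check (true or false) */
    {
        i--;
        ops++;                      /* decrement of i */
        int j = i - 1;
        ops++;                      /* assignment of j */
        while (ops++, j >= 0)       /* inner condition check (true or false) */
        {
            ops++;                  /* the if comparison */
            if (a[i] == a[j])
            {
                return ops + 1;     /* early exit: count the return 1 */
            }
            j--;
            ops++;                  /* decrement of j */
        }
    }
    return ops + 1;                 /* count the final return 0 */
}

/* The closed form from the tally: 3/2 N^2 + 5/2 N + 3. */
long grand_total(long n)
{
    return (3 * n * n + 5 * n + 6) / 2;
}
```

For an array of 4 distinct values both functions give 37. An array whose last two elements are equal exits after only 7 counted operations, which is why the worst case is the one analysed.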
Maybe this can help you understand what goes wrong in your code. I have added some printouts that make it easier to understand what happens in your code. I think this should be sufficient to find your error.
int dup_chk(int a[], int length)
{
int j = 0;
int i = length;
char stringa[30];
printf("Before first while loop j = %d and i = %d \n", j, i);
while (i > 0)
{
i--;
j = i - 1;
printf("\tIn first while loop j = %d and i = %d\n", j, i);
while (j >= 0)
{
printf("\t\tIn second while loop j = %d and i = %d\n", j, i);
if (a[i] == a[j])
{
printf("\t\tIn if statment j = %d and i = %d\n", j, i);
return 1;
}
j--;
printf("\t\tEnd of second while loop j = %d and i = %d\n", j, i);
}
}
printf("After first while loop j = %d and i = %d \n", j, i);
printf("Press any key to finish the program and close the window\n");
return 0;
}
I would also recommend stepping through your code in a debugger to better understand what goes on.
The if check is executed as many times as the inner while loop iterates.
The return 1 is by definition only executed once max. It appears you assume there are no duplicates in the input (ie. worst case), in which case the return 1 statement never executes.
You'll eventually get a feel for what parts of the code you can ignore, so you won't need to calculate this "grand total", and just realize there are two nested loops that each traverse the array - ie. O(N^2).
int dup_chk(int a[], int length)
{
int i = length;
while (i > 0) // Outer loop
{
i--;
int j = i -1;
while (j >= 0) // Inner loop
{
if (a[i] == a[j])
{
return 1;
}
j--;
}
}
return 0;
}
The above program is exactly your code with two comments I took the liberty to add.
Let's consider the worst case scenario (because that's what everyone cares / is worried about). If you look carefully, you will observe that for every value of i, the Inner loop executes i - 1 times, which is at most n - 1. Thus if your Outer loop executes n times, the Inner loop will execute at most n * (n - 1) times in total (the exact count is n * (n - 1) / 2, but the constant factor does not matter here).
n * (n - 1) yields n^2 - n in general algebra. Now, n^2 increases in leaps and bounds (as compared to n) as you go on increasing the value of n. Asymptotic notation lets us consider the factor which will have the greatest impact on the number of steps to be executed. Thus, we can ignore n and say that this program has a worst case running time of O(n^2).
That's the beauty and simplicity of the Big-O notation. - Quoting Jonathan Leffler from the comments above.
Thorough evaluation:
This program has a special feature: it terminates if a pair (a[I], a[J]) of equal values is found. Assume that we know I and J (we will see later what if there is no such pair).
The outer loop is executed for all I <= i < L, hence L-I times. Each time, the inner loop is executed for all 0 <= j < i, hence i times, except for the last pass (i = I): we have J <= j < I hence I-J iterations.
We assume that the "cost" of a loop is of the form a N + b, where a is the cost of a single iteration and b some constant overhead.
Now for the inner loop, which is run L-I times with decreasing numbers of iterations, using the "triangular numbers" formula, the cost is
a (L-1 + L-2 + ... I+1 + I-J) + b (L - I) = a ((L-1)L/2 - I(I+1)/2 + I-J) + b (L-I)
to which we add the cost of the outer loop to get
a ((L-1)L/2 - I(I+1)/2 + I-J) + b (L-I) + c
(where b is a different constant than above).
In general, this function is quadratic in L, but if a pair is found quickly (say I = L-3), it becomes linear; in the best case (I = L-1,J = L-2), it is even the constant a + b + c.
The worst case occurs when the pair is found last (I = 1, J = 0), which is virtually equivalent to no pair found. Then we have
a (L-1)L/2 + b (L - 1) + c
obviously O(L²).
I understand how to use summation to extract the time complexity (and Big O) from linear for-loops, but how would you use it for loops whose counter is multiplied on each iteration, to get O(log n)? For example, the code below is O(n log n), but I don't know why.
for (i = 0; i < n; i++)
for (j = 1; j < n; j*7)
/*some O(1) operations*/
Also, why is a while loop O(logn) and a do-while loop O(n^2).
At each iteration of the inner loop you perform j = j * 7 (I assume this is what you meant).
That is, at each iteration j becomes 7j.
After m iterations, j = j0*7*7*7*7*...*7*7 = j0*(7 ^ m), where j0 = 1 is the starting value.
The loop stops once j reaches n, so let m be the number of iterations needed for that:
n = j0*7*7*7*...7 = j0*(7 ^ m) = 7 ^ m
Let's take a log of both sides:
log(n) = log(7 ^ m) = m*log(7), so m = log(n)/log(7) = O(log(n))
So, as we can see, the inner loop runs O(log(n)) times. The outer loop repeats it n times, which gives O(n log(n)) overall.
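The iteration count can be observed directly. This is a minimal sketch, assuming the intended update is j *= 7; the helper names count_mul7_iterations and count_nested_iterations are hypothetical, and n is assumed small enough that j * 7 does not overflow a long.

```c
/* Counts the passes of: for (j = 1; j < n; j *= 7).
   Assumes n is small enough that j * 7 cannot overflow. */
int count_mul7_iterations(long n)
{
    int count = 0;
    for (long j = 1; j < n; j *= 7) {
        count++;
    }
    return count;
}

/* Counts the O(1) body executions of the full nested loop:
   the outer loop runs n times, each time repeating the inner loop. */
long count_nested_iterations(long n)
{
    long count = 0;
    for (long i = 0; i < n; i++) {
        for (long j = 1; j < n; j *= 7) {
            count++;
        }
    }
    return count;
}
```

For n = 343 = 7^3 the inner loop visits j = 1, 7, 49 and stops, so it runs 3 times; multiplying n by 7 adds only one more pass, which is the logarithmic growth derived above.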
#include <stdio.h>
int main() {
int N = 8; /* for example */
int sum = 0;
for (int i = 1; i <= N; i++)
for (int j = 1; j <= i*i; j++)
sum++;
printf("Sum = %d\n", sum);
return 0;
}
for each n value (i variable), j values will be n^2. So the complexity will be n . n^2 = n^3. Is that correct?
If problem becomes:
#include <stdio.h>
int main() {
int N = 8; /* for example */
int sum = 0;
for (int i = 1; i <= N; i++)
for (int j = 1; j <= i*i; j++)
for (int k = 1; k <= j*j; k++)
sum++;
printf("Sum = %d\n", sum);
return 0;
}
Then you use existing n^3 . n^2 = n^5 ? Is that correct?
We have i and j < i*i and k < j*j which is x^1 * x^2 * (x^2)^2 = x^3 * x^4 = x^7 by my count.
In particular, since 1 <= i <= N we have O(N) for the i loop. Since 1 <= j <= i^2 <= N^2 we have O(N^2) for the second loop. Extending the logic, we have 1 <= k <= j^2 <= (i^2)^2 <= N^4 for the third loop.
Inner to Outer loops, we execute up to N^4 times for each j loop, and up to N^2 times for each i loop, and up to N times over the i loop, making the total be of order N^4 * N^2 * N = N^7 = O(N^7).
I think the complexity is actually O(n^7).
The first loop executes N steps.
The second loop executes N^2 steps.
In the third loop, j*j can reach N^4, so it has O(N^4) complexity.
Overall, N * N^2 * N^4 = O(N^7)
For i = 1 the inner loop runs 1^2 times, for i = 2 it runs 2^2 times, ..., and for i = N it runs N^2 times. The total is 1^2 + 2^2 + 3^2 + ... + N^2 = N(N+1)(2N+1)/6, which is of order O(N^3).
In the second case, for i = N the first inner loop iterates up to N^2 times, and for each of those iterations the innermost loop iterates up to (N^2)^2 = N^4 times. Hence the complexity is of order N * N^2 * N^4, i.e., O(N^7).
Yes for the first example: the i loop runs N times, and the inner j loop runs i*i times, which is O(N^2). So the whole thing is O(N^3).
In the second example there is an additional O(N^4) loop (the loop to j*j), so it is O(N^3) * O(N^4) = O(N^7) overall, not O(N^5).
For a more formal proof, work out how many times sum++ is executed in terms of N, and look at the highest polynomial order of N. In the first example it will be a(N^3)+b(N^2)+c(N)+d (for some values of a, b, c and d), so the answer is 3.
NB: Edited re example 2 to say it's O(N^4): misread i*i for j*j.
Consider the number of times all loops will be called.
int main() {
int N = 8; /* for example */
int sum = 0;
for (int i = 1; i <= N; i++) /* Called N times */
for (int j = 1; j <= i*i; j++) /* Called N*N times for i=0..N times */
for (int k = 1; k <= j*j; k++) /* Called N^2*N^2 times for j=0..N^2 times and i=0..N times */
sum++;
printf("Sum = %d\n", sum);
return 0;
}
Thus the sum++ statement is called O(N^4)*O(N^2)*O(N) times = O(N^7), and this is the overall complexity of the program.
The incorrect way to solve this (although common, and often gives the correct answer) is to approximate the average number of iterations of an inner loop with its worst-case. Here, the inner loop loops at worst O(N^4), the middle loop loops at worst O(N^2) times and the outer loop loops O(N) times, giving the (by chance correct) solution of O(N^7) by multiplying these together.
The right way is to work from the inside out, being careful to be explicit about what's being approximated.
The total number of executions, T, of the increment instruction equals the final value of sum in your code. Just writing it out:
T = sum(i=1..N)sum(j=1..i^2)sum(k=1..j^2)1.
The innermost sum is just j^2, giving:
T = sum(i=1..N)sum(j=1..i^2)j^2
The sum indexed by j is a sum of squares of consecutive integers. We can calculate that exactly: sum(j=1..n)j^2 is n*(n+1)*(2n+1)/6. Setting n=i^2, we get
T = sum(i=1..N)i^2*(i^2+1)*(2i^2+1)/6
We could continue to compute the exact answer, by using the formula for sums of 6th, 4th and 2nd powers of consecutive integers, but it's a pain, and for complexity we only care about the highest power of i. So we can approximate.
T = sum(i=1..N)(i^6/3 + o(i^5))
We can now use that sum(i=1..N)i^p = Theta(N^{p+1}) to get the final result:
T = Theta(N^7)
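The derivation can be cross-checked numerically for small N. A minimal sketch, assuming two hypothetical helpers: count_brute runs the triple loop from the question and counts the increments, while count_formula replaces the two inner loops with the sum-of-squares closed form n*(n+1)*(2n+1)/6 at n = i^2.

```c
/* Brute-force count of sum++ executions in the triple loop. */
long long count_brute(int N)
{
    long long sum = 0;
    for (long long i = 1; i <= N; i++)
        for (long long j = 1; j <= i * i; j++)
            for (long long k = 1; k <= j * j; k++)
                sum++;
    return sum;
}

/* Same count, with the two inner loops replaced by the closed form
   for the sum of squares 1^2 + 2^2 + ... + n^2, taken at n = i^2. */
long long count_formula(int N)
{
    long long total = 0;
    for (long long i = 1; i <= N; i++) {
        long long n = i * i;
        total += n * (n + 1) * (2 * n + 1) / 6;
    }
    return total;
}
```

For N = 2 both return 31 (1 for i = 1, plus 1 + 4 + 9 + 16 = 30 for i = 2). Keep N small for count_brute, since its running time itself grows like N^7.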
I am trying to understand the subtle difference in the complexity of each of the examples below.
Example A
int sum = 0;
for (int i = 1; i < N; i *= 2)
for (int j = 0; j < N; j++)
sum++;
My Analysis:
The first for loop goes for lg n times.
The inner loop is independent of outer loop and executes N times every time outer loop executes.
So the complexity must be:
n+n+n... lg n times
Therefore the complexity is n lg n.
Is this correct?
Example B
int sum = 0;
for (int i = 1; i < N; i *= 2)
for(int j = 0; j < i; j++)
sum++;
My Analysis:
The first for loop goes for lg n times.
The inner loop execution depends on outer loop.
So how do I calculate the complexity when the number of times the inner loop executes depends on the outer loop?
Example C
int sum = 0;
for (int n = N; n > 0; n /= 2)
for (int i = 0; i < n; i++)
sum++;
I think example C and example B must have the same complexity because the number of times the inner loop executes depends on the outer loop.
Is this correct?
In examples B and C, the inner loop executes 1 + 2 + ... + n/2 + n times. There happen to be lg n terms in this sequence, and that does mean that the inner loop's initialisation (int j = 0 in B, int i = 0 in C) executes lg n times; however, the sum for the statement(s) in the inner loop is at most 2n. So we get O(n + lg n) = O(n)
(a) Your analysis is correct
(b) The outer loop runs log(N) times. The inner loop runs 1, 2, 4, 8, ... times over those log(N) passes, which is a geometric series whose sum is 2^log(N) - 1 (logs to base 2), i.e. just under twice its largest term.
E.g.: 1 + 2 + 4 = 7 (approx. 2*4), 1 + 2 + 4 + 8 = 15 (approx. 2*8).
Hence the total complexity is O(2^log(N)) = O(N)
(c) This is same as (b) in reverse order
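The three answers can be compared by wrapping each example in a counting function. This is a minimal sketch; the names count_a, count_b and count_c are hypothetical, and N is assumed small enough that doubling i does not overflow a long.

```c
/* Example A: outer loop doubles i, inner loop always runs N times. */
long count_a(long N)
{
    long sum = 0;
    for (long i = 1; i < N; i *= 2)
        for (long j = 0; j < N; j++)
            sum++;
    return sum;
}

/* Example B: inner loop runs i times, with i = 1, 2, 4, ... */
long count_b(long N)
{
    long sum = 0;
    for (long i = 1; i < N; i *= 2)
        for (long j = 0; j < i; j++)
            sum++;
    return sum;
}

/* Example C: inner loop runs n times, with n = N, N/2, N/4, ... */
long count_c(long N)
{
    long sum = 0;
    for (long n = N; n > 0; n /= 2)
        for (long i = 0; i < n; i++)
            sum++;
    return sum;
}
```

For N = 16, count_a gives 64 (4 passes of 16), count_b gives 15 (1 + 2 + 4 + 8) and count_c gives 31 (16 + 8 + 4 + 2 + 1): A grows like N lg N, while B and C stay below 2N.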
Find the time complexity
i = 1;
k = 1;
while (k < n)
{
stmt;
k = k + i;
i++;
}