Big O of duplicate check function - c

I would like to know exactly how to compute the big O of the second while loop, given that its number of iterations keeps shrinking as the outer loop progresses.
int duplicate_check(int a[], int n)
{
    int i = n;
    while (i > 0)
    {
        i--;
        int j = i - 1;
        while (j >= 0)
        {
            if (a[i] == a[j])
            {
                return 1;
            }
            j--;
        }
    }
    return 0;
}

Still O(n^2), even though each inner pass is shorter than the last.
The value you are computing is the sum of (n - k) for k = 0 to n.
This equates to (n^2 + n) / 2, which, since big O ignores constant factors and lower-order terms, is O(n^2).
Note that you can solve this problem more efficiently by sorting the array, O(n log n), and then scanning for two consecutive equal elements, O(n), so the total is O(n log n).
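If it helps, here is a minimal sketch of that sort-then-scan approach; the comparator and the function name are my own illustration, not part of the question:

#include <stdlib.h>

/* comparator for qsort: ascending order for ints */
static int cmp_int(const void *p, const void *q)
{
    int a = *(const int *)p;
    int b = *(const int *)q;
    return (a > b) - (a < b);
}

/* returns 1 if a duplicate exists; sorts a[] as a side effect */
int duplicate_check_fast(int a[], int n)
{
    qsort(a, n, sizeof a[0], cmp_int);   /* O(n log n) */
    for (int i = 1; i < n; i++)          /* O(n) adjacent scan */
        if (a[i] == a[i - 1])
            return 1;
    return 0;
}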

Big O is a theoretical growth estimate, not an exact operation count.
Like twain249 said, the time complexity is O(n^2) regardless.

Big O gives the worst-case time complexity of an algorithm, i.e. the maximum time it can ever take. It is an upper bound: whatever the input, the running time will always stay under that bound.
In your case the worst case is when i iterates all the way down to 0, so the complexity works out like this:
for i = n, j runs n-1 times; for i = n-1, j runs n-2 times; and so on.
Adding it all up: (n-1) + (n-2) + (n-3) + ... + (n-n) = n(n-1)/2 = n^2/2 - n/2.
After ignoring the lower-order term n/2 and the constant factor 1/2, this becomes n^2.
So it is O(n^2); that is how it is computed.
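If you want to double-check that sum empirically, here is a quick sketch (my own addition) that just counts how often the inner comparison runs in the no-duplicate worst case:

/* counts the comparisons duplicate_check performs when no duplicate
   is found; e.g. count_comparisons(10) returns 45 = 10*9/2 */
long count_comparisons(int n)
{
    long count = 0;
    for (int i = n - 1; i >= 0; i--)      /* same i values as the outer while */
        for (int j = i - 1; j >= 0; j--)  /* same j values as the inner while */
            count++;                      /* one a[i] == a[j] comparison */
    return count;                         /* equals n*(n-1)/2 */
}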

Related

time complexity of nested loops - always just a multiplication of each of them separately?

When looking at this code, for example:
for (int i = 1; i < n; i *= 2)
    for (int j = 0; j < i; j += 2)
    {
        // some constant-time operations
    }
Is it as simple as saying that because the outer loop is logarithmic and the inner loop is linear, the combined result is O(n log n)?
Here is the analysis of the example in the question. For simplicity I will neglect the increment of 2 in the inner loop and treat it as 1; in terms of complexity it does not matter, because the inner loop is linear in i and a constant factor of 2 drops out.
Notice that the outer loop produces the values of i that are the powers of 2 capped by n, that is:
1, 2, 4, 8, ..., 2^(log2 n)
and each of these numbers is also the number of times the "constant time operation" in the inner loop runs for that i.
So all we have to do is sum up the above series. It is a geometric series:
2^0 + 2^1 + 2^2 + ... + 2^(log2 n)
with the well-known closed form a * (1 - r^(m+1)) / (1 - r), where a is the first term, r is the common ratio, and m is the index of the last term.
Here a = 1, r = 2, and m = log2 n. Substituting, the sum equals
(1 - 2^((log2 n) + 1)) / (1 - 2) = (1 - 2n) / (1 - 2) = 2n - 1
which is a linear, O(n), complexity.
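To sanity-check that linear total, one can simply count the inner iterations; the helper below is my own sketch (again with step 1 instead of 2 in the inner loop):

/* counts the constant-time operations of the nested doubling loop;
   the result always lies between n-1 and 2n-1, i.e. it is linear in n */
long count_inner(int n)
{
    long count = 0;
    for (int i = 1; i < n; i *= 2)
        for (int j = 0; j < i; j++)
            count++;
    return count;
}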
Generally, we take the big-O time complexity to be the number of times the innermost loop is executed (assuming the innermost loop consists of statements with O(1) time complexity).
Consider your example. The outer loop executes O(log N) times, and the inner loop executes O(N) times. If something O(N) is being executed O(log N) times, then yes, the final time complexity is just the two multiplied: O(N log N).
Generally, this holds true for most nested loops: you can take their big-O time complexity to be the product of the time complexities of the individual loops.
However, there are exceptions to this rule when a break statement is involved. If the loop has the possibility of breaking out early, the time complexity will be different.
Take a look at this example I just came up with:
for (int i = 1; i <= n; ++i) {
    int x = i;
    while (1) {
        x = x / 2;
        if (x == 0) break;
    }
}
Well, the innermost loop is O(infinity), so can we say that the total time complexity is O(N) * O(infinity) = O(infinity)? No. In this case we know the innermost loop will always break in O(log N), giving a total O(N log N) time complexity.

How to check Big O notation of a sorting algorithm?

I don't know how to work out its complexity, or how to tell whether it is faster than other sorting algorithms.
I find it difficult because I'm a little bit bad at math.
#include <stdio.h>
#include <stdlib.h>

void func(int arr[])
{
    int temp;
    int numofarrays = 9999;
    for (int runtime = 1; runtime <= numofarrays / 2; runtime++)
    {
        for (int i = 0; i < numofarrays; i++)
        {
            if (arr[i] > arr[i + 1])
            {
                temp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = temp;
            }
            if (arr[numofarrays - i - 1] > arr[numofarrays - i])
            {
                temp = arr[numofarrays - i - 1];
                arr[numofarrays - i - 1] = arr[numofarrays - i];
                arr[numofarrays - i] = temp;
            }
        }
    }
    for (int i = 0; i <= 9999; i++)
    {
        printf("%i\n", arr[i]);
    }
}

int main(void)
{
    int arr[10000];
    for (int i = 0; i <= 9999; i++)
    {
        arr[i] = rand() % 10;
    }
    func(arr);
}
Big O notation describes how the number of steps of your code grows as the input size goes to infinity. Because we take that limit, we neglect coefficients and look only at the largest term.
for (int runtime = 1; runtime <= numofarrays / 2; runtime++)
{
    for (int i = 0; i < numofarrays; i++)
    {
Here the largest term of the first loop is n, and the largest term of the second is also n. But since loop 2 makes n turns for each turn of the first loop, we multiply n by n.
The result is O(n^2).
Here:
for (int runtime = 1; runtime <= numofarrays / 2; runtime++)
{
    for (int i = 0; i < numofarrays; i++)
    {
you have two nested for loops that both depend on the array size, so the complexity is O(N^2).
Further you ask:
How to know if it is faster than other sorting algorithms?
Well, big-O does not directly tell you about execution time. For some values of N, an O(N) implementation may be slower than an O(N^2) implementation.
Big-O tells you how the execution time increases as N increases. So you know that an O(N^2) implementation will be slower than an O(N) implementation once N gets above some specific value, but you can't know what that value is! It could be 1, 5, 10, 100, 1000, ...
Example: if the O(N) implementation takes 100*N steps and the O(N^2) implementation takes N^2 steps, the O(N^2) version is actually faster for every N below 100.
The empirical approach is to time (t) your algorithm for a given input size n. Then repeat the experiment for larger and larger n, say n = 2^k for k = 0 to 20, and plot (n, t) on a graph. If you get a straight line, it's a linear algorithm. If it grows faster, try plotting ln(t) instead and see if you get a line; that would indicate an exponential algorithm, and so on.
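As a concrete illustration of that timing experiment, here is a rough harness sketch (the sort under test and the size range are my own choices, not from the question):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* a plain O(n^2) bubble sort to act as the algorithm under test */
static void sort_under_test(int *a, int n)
{
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - 1 - i; j++)
            if (a[j] > a[j + 1])
            {
                int t = a[j];
                a[j] = a[j + 1];
                a[j + 1] = t;
            }
}

int main(void)
{
    for (int n = 1000; n <= 32000; n *= 2)
    {
        int *a = malloc(n * sizeof *a);
        for (int i = 0; i < n; i++)
            a[i] = rand();
        clock_t start = clock();
        sort_under_test(a, n);
        double t = (double)(clock() - start) / CLOCKS_PER_SEC;
        /* for an O(n^2) algorithm, doubling n should roughly quadruple t */
        printf("n = %6d  t = %.4f s\n", n, t);
        free(a);
    }
    return 0;
}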
The analytical approach looks scary and sometimes math-heavy, but it doesn't have to be. You decide what your expensive operation is, and then you count how many of those you do. Loops, including recursive functions, are the main candidates, because their cost grows as they run more times:
Let's say we want to count the number of swaps.
The inner loop swaps up to numofarrays times, which is O(n).
The outer loop runs numofarrays/2 times, which is also O(n); we drop the factor 1/2 since it doesn't matter as n gets large.
This means we do O(n) * O(n) = O(n^2) swaps.
Your print loop does no swaps, so we consider it "free"; but if we determine that prints are as expensive as swaps, then we want to count those too. Your algorithm does numofarrays print operations, which is O(n).
O(n^2) + O(n) is just O(n^2), as O(n^2) dominates O(n) for sufficiently large n.
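If you want to see those counts directly, here are the question's loops with a swap counter added (my own instrumentation; the sorting behavior is unchanged):

/* the double-ended pass from the question, counting swaps; the
   comparison count is (numofarrays/2) * numofarrays * 2, i.e. O(n^2),
   and the swap count is bounded by it */
long func_counting(int arr[], int numofarrays)
{
    long swaps = 0;
    for (int runtime = 1; runtime <= numofarrays / 2; runtime++)
    {
        for (int i = 0; i < numofarrays; i++)
        {
            if (arr[i] > arr[i + 1])
            {
                int temp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = temp;
                swaps++;
            }
            if (arr[numofarrays - i - 1] > arr[numofarrays - i])
            {
                int temp = arr[numofarrays - i - 1];
                arr[numofarrays - i - 1] = arr[numofarrays - i];
                arr[numofarrays - i] = temp;
                swaps++;
            }
        }
    }
    return swaps;
}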

Is the time complexity of this code correct?

Calculate the time complexity of this code fragment if the function "process" has a complexity of O(log n).
void funct(int a[], int n)
{
    int i = 0;
    while (i < n) {
        process(a, n);
        if (a[i] % 2 == 0)
            i = i * 2 + 1;
        else
            i = i + 1;
    }
}
I tried to calculate the best and worst case for the time complexity.
The worst case is when the "else" branch is taken every time, so it should just be:
Worst case: T(n) = O(n log n)
I have some problems with the best case. I tried this way, but I don't know if it is correct.
Since in the "if" branch i gets updated to 2i + 1, after k iterations it should be
i = 2^k - 1
2^k < n + 1
so k < log_2(n + 1)
Is it correct to say that the while loop gets executed (log_2(n+1) - 2)/2 times, because this is the last possible value for which i < n?
If so, is the time complexity O(log n * log n) in the best case?
The best case is if the sampled values in a are all even. In that case, the complexity is O(log(n)*log(n)), since the loop trip count is O(log(n)).
The worst case is if the sampled values in a are all odd. In that case, the complexity is O(n*log(n)), since the loop trip count is O(n).
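One way to see those two trip counts, which then get multiplied by the O(log n) cost of process, is to simulate just the index updates. This sketch is my own addition; process is dropped since only the trip count matters here:

#include <stdio.h>

/* best case (every sampled a[i] even): i jumps to 2*i + 1, so after
   k trips i = 2^k - 1, giving about log2(n) trips */
long trips_all_even(long n)
{
    long i = 0, trips = 0;
    while (i < n)
    {
        i = i * 2 + 1;
        trips++;
    }
    return trips;
}

/* worst case (every sampled a[i] odd): i advances by 1, so exactly n trips */
long trips_all_odd(long n)
{
    long i = 0, trips = 0;
    while (i < n)
    {
        i = i + 1;
        trips++;
    }
    return trips;
}

int main(void)
{
    /* prints 20 and 1000000: O(log n) versus O(n) trip counts */
    printf("%ld %ld\n", trips_all_even(1000000), trips_all_odd(1000000));
    return 0;
}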

Checking duplicates in an array, and removing them is worst-case complexity O(n^2) or O(n^3)?

I am trying to evaluate this algorithm:
checking equality is O(n^2)
removing an element is O(n)
So I think the entire algorithm will be O(n^3) in the worst case.
for (i = 0; i < ne - 1; i++)
{
    for (j = i + 1; j < ne; j++)
    {
        if (strcmp(array[i].id, array[j].id) == 0)
        {
            cont++;
            for (k = j; k < ne - 1; k++)
                array[k] = array[k + 1];
            ne--;
        }
    }
}
Although you are correct that the cost of comparison is O(n^2) and the cost of deleting an element is O(n), the inter-relationship between the two actions results in the entire algorithm being O(n^2). Since O(n^2) is contained in O(n^3), it is not incorrect to say that the algorithm is O(n^3), but that is not a tight bound.
To see why, consider the cost attributable to a given element of the array. It will either be compared (as array[i]) with every following element, or it will be removed, involving a shift of all following elements. But not both: once it has been removed, it will never again be the element used in the outer loop.
In either case, the cost of the element is proportional to the number of following elements, so the total cost of the algorithm is at worst n(n-1)/2, which is O(n^2). (If elements are deleted, the actual cost will be less; the worst case occurs when there are no duplicates.)
As @Amit notes, if the cost of performing a comparison or a move is not O(1), that has to be taken into account, giving O(n^2 m) where m is the cost of a comparison or an assignment. But it would be normal to consider that cost fixed.
As I noted in a comment, the algorithm as presented is incorrect. The correct algorithm would be:
for (i = 0; i < n - 1; ++i) {
    for (j = i + 1; j < n; ) {
        if (IsEqual(a[i], a[j])) {
            for (k = j; k < n - 1; ++k)
                a[k] = a[k + 1];
            --n;
        } else {
            ++j;
        }
    }
}
Normally, a better solution is to sort the array, which will mean that equal elements will be adjacent, and then do a single O(n) pass to compress the result; that is O(n log n) (from the sort) but does not preserve order. (You can preserve order with an auxiliary array, though.)
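Here is a sketch of that sort-and-compress idea, shown on plain ints for brevity; adapting the comparator to the question's id strings is assumed, not shown:

#include <stdlib.h>

static int cmp_int(const void *p, const void *q)
{
    int a = *(const int *)p;
    int b = *(const int *)q;
    return (a > b) - (a < b);
}

/* sorts a[] and then compresses out duplicates in one pass;
   returns the new element count (original order is not preserved) */
int dedup(int a[], int n)
{
    if (n == 0)
        return 0;
    qsort(a, n, sizeof a[0], cmp_int);   /* O(n log n) */
    int w = 1;                           /* write position */
    for (int r = 1; r < n; r++)          /* O(n) read pass */
        if (a[r] != a[w - 1])
            a[w++] = a[r];
    return w;
}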
Actually it would be O(n^3 m), where m is the length of the id string (or the average length of the id strings).
The i loop is O(n), and so is the j loop. You then compare m characters on each iteration, and the k loop is another O(n). It might be possible to show that the amortized complexity is lower, though, since every iteration of the k loop reduces n, but that requires deeper analysis.

Time complexity of this function

I am pretty sure about my answer, but today I had a discussion with my friend, who said I was wrong.
I think the complexity of this function is O(n^2) in the average and worst case and O(n) in the best case. Right?
Now what happens when k is not the length of the array? k is the number of elements you want to sort (rather than the whole array).
Is it O(nk) in the average and worst case, and O(n) in the best case?
Here is my code:
#include <stdio.h>

void bubblesort(int *arr, int k)
{
    // k is the number of items of the array you want to sort
    // e.g. arr[] = { 4,15,7,1,19 } with k as 3 will give
    // { 4,7,15,1,19 }; only the first k elements are sorted
    int i = k, j = 0;
    char test = 1;
    while (i && test)
    {
        test = 0;
        --i;
        for (j = 0; j < i; ++j)
        {
            if (arr[j] > arr[j + 1])
            {
                // swap
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                test = 1;
            }
        } // end for loop
    } // end while loop
}

int main()
{
    int i = 0;
    int arr[] = { 89, 11, 15, 13, 12, 10, 55 };
    int n = sizeof(arr) / sizeof(arr[0]);
    bubblesort(arr, n - 3);
    for (i = 0; i < n; i++)
    {
        printf("%d ", arr[i]);
    }
    return 0;
}
P.S. This is not homework; it just looks like one. The function we were discussing is very similar to bubble sort. In any case, I have added the homework tag.
Please help me confirm whether I was right. Thank you for your help.
Complexity is normally given as a function of n (or N), like O(n), O(n*n), ...
Regarding your code, the complexity is as you stated: O(n) in the best case and O(n*n) in the worst case.
What might have led to the misunderstanding in your case is that you have a variable n (the length of the array) and a variable k (the length of the part of the array to sort). The complexity of your sort does not depend on the length of the array but on the length of the part you want to sort. So with respect to your variables the complexity is O(k) or O(k*k). But since complexity notation is normally given in terms of n, you would say O(n) or O(n*n), where n is the length of the part to sort.
Is it O(nk) in the average and worst case, and O(n) in the best case?
No, it's O(k^2) worst case and O(k) best case. Sorting the first k elements of an array of size n is exactly the same as sorting an array of k elements.
That's O(k^2) in the worst case: the outer while goes from k down to 1 (possibly stopping earlier for specific data, but we're talking worst case here), and the inner for goes from 0 up to i (which in turn goes up to k), so multiplied they give k^2.
If you care about the best case, that's O(k), because the outer while loop executes only once (a single pass over the k elements) and then stops.
