Suppose an array is initially empty with a size of 5, and it expands by 5 every time all slots are filled.
I understand that if we are only considering any sequence of n append() operations, the amortized cost per operation would be O(n), because the total cost would be:
5 + (5 + 1*5) + (5 + 2*5) + ... + (5 + (floor(n/5) - 1)*5) = O(n^2),
where floor(n/5) is the number of array expansions.
However, what if the sequence of n operations contains pop() as well? Assume pop() doesn't change the array size.
My approach obviously wouldn't work then, and I have read CLRS but am still quite stuck. Any help would be appreciated.
The answer, somewhat disappointingly, is that if your sequence contains s push or pop operations, then the amortized cost of each operation is O(s).
To cherry-pick a line from another question's very good answer:
Amortized analysis gives the average performance (over time) of each operation in the worst case.
The worst case is pretty clear: repeated pushes, in which case your original analysis stands.
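To make the accounting concrete, here is a small simulation (my own sketch; it just counts element copies) of an array that grows by 5 whenever it is full and never shrinks on pop(). With nothing but pushes, the copy count grows roughly quadratically in the number of pushes, which is where the O(s) amortized bound comes from.

# Hypothetical sketch: count element copies under the "+5 growth, no shrink" policy.
def copy_cost(ops):
    """ops: iterable of 'push'/'pop' strings; returns total element copies."""
    size, capacity, copies = 0, 5, 0
    for op in ops:
        if op == 'push':
            if size == capacity:
                copies += size     # copy everything into the bigger array
                capacity += 5      # grow by a constant amount, not by doubling
            size += 1
        elif op == 'pop' and size > 0:
            size -= 1              # pop() never changes the capacity
    return copies

# Worst case is all pushes: the copy count grows roughly quadratically in s.
for s in (100, 200, 400):
    print(s, copy_cost(['push'] * s))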
I am learning about Big-O and got confused by the example below:
arr = [1, 2, 3, 4, 5]

# simple print
print(arr[0])
print(arr[1])
print(arr[2])
print(arr[3])
print(arr[4])

# loop - Big-O: O(n)
for i in range(len(arr)):
    print(arr[i])
Would the simple print also give me O(n) or O(1)?
Big-O is all about loops and growth: how the amount of work scales as the input grows.
If you have a fixed n then printing them is O(1): it will always take the same amount of time to print n=5 items.
But for some variable n, it takes more time the larger n gets, so it becomes O(n).
If you are talking about storage, then an array of n items is O(n), even for fixed n=5.
Context matters. This context was me being stupid. O(5) is O(1).
Your print statements are constant-time (O(1)), because you are accessing values by index, and indexed access into an array is typically a constant-time operation.
Your loop is O(n), as you guessed, so your code in total is also O(n), since each iteration of the loop performs a constant-time action.
As a disclaimer, this can get more complex depending on your computation model and assumptions. For instance, not all computation models assume constant-time random access, and you could argue that printing a number also depends on the size of that number and thus takes O(log n). But I think that within the scope of your question, you don't need to worry about this.
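Here's a tiny sketch of the distinction (illustrative only; it counts array accesses instead of timing prints): the five hard-coded accesses do the same amount of work no matter how big the array is, while the loop does work proportional to n.

# Illustrative only: count array accesses instead of actually printing.
def fixed_prints(arr):
    ops = 0
    for i in (0, 1, 2, 3, 4):   # five hard-coded accesses: O(1), independent of len(arr)
        _ = arr[i]
        ops += 1
    return ops

def loop_prints(arr):
    ops = 0
    for i in range(len(arr)):   # one access per element: O(n)
        _ = arr[i]
        ops += 1
    return ops

for n in (5, 50, 500):
    arr = list(range(n))
    print(n, fixed_prints(arr), loop_prints(arr))   # 5 stays 5; the loop count grows with n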
I sometimes get confused by the time complexity analysis of code that includes arrays.
For example:
ans = [0] * n
for x in range(1, n):
    ans[x] = ans[x-1] + 1
I thought the for-loop had a time complexity of O(n^2), because it accesses elements of an array with n elements and it repeats that n times.
However, I've seen some explanations saying it takes just O(n); thus, my question is: when we analyze the time complexity of a program that accesses elements in an array (not necessarily the first or the last element), should we include the time to access those array elements, or is it often ignored?
Indexed access is usually a constant-time operation, due to the availability of random access memory in most practical cases. If you were to run this e.g. in Python and measure the time it takes for different values of n, you will find that this is the case.
Therefore, your code only performs one loop from 1 to n and all other operations are constant-time, so you get a time complexity of O(n).
Your thinking is otherwise right: if this were a linked list and you had to iterate through it to reach your value, then it would be O(n^2).
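As a quick check, you can time the loop for growing n, as suggested above (a rough sketch; it assumes CPython and the absolute numbers will vary by machine). The times grow roughly linearly, not quadratically.

# Rough timing sketch: the loop's running time scales linearly with n.
import time

def run(n):
    ans = [0] * n
    start = time.perf_counter()
    for x in range(1, n):
        ans[x] = ans[x - 1] + 1   # one O(1) indexed read and one O(1) indexed write
    return time.perf_counter() - start

for n in (10**5, 10**6, 10**7):
    print(n, round(run(n), 4))    # roughly 10x more time for 10x more elements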
I know this is a general question, but I really do need to clear my doubt as I am studying time complexities. I did try to look it up before posting here but found mixed answers.
My question is: when inserting an item into an unsorted array that is not full, the complexity would be O(1); however, if it is full, it would be O(n), since we would need to copy all the items to a new array. So would we say that the best-case complexity of insertion into an array is O(1) and the worst case is O(n), or should we say that both the best and the worst case are O(n)?
Indeed, worst-case insertion is O(n) if you have to copy the whole array into a larger array. But you must remember, it is the amortized cost we care about.
Think about it this way: how often do I have to copy the whole array? Once every n insertions.
So for n-1 insertions I pay O(1) each, and for the final insertion I pay O(n).
In total that is ~2n for n insertions, i.e. O(1) per operation on average.
Maybe it is easier for you to think about it as O(2).
Now, to maintain that, each time the array is filled, twice that size must be allocated; so for each item you insert, you pay an extra 1 for the time you might need to copy the corresponding item in the first half of the array.
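Here is a quick sketch of that doubling argument (my own code, just counting element copies): the total number of copies over n appends stays below about 2n, so the amortized cost per insertion is O(1).

# Sketch of the doubling argument: count element copies over n appends.
def copies_for_appends(n, initial_capacity=1):
    size, capacity, copies = 0, initial_capacity, 0
    for _ in range(n):
        if size == capacity:
            copies += size      # move every existing element into the new array
            capacity *= 2       # double the capacity, so full copies get rarer
        size += 1
    return copies

for n in (10, 100, 1000, 10000):
    print(n, copies_for_appends(n))   # always below ~2n, hence O(1) amortized per insertion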
The question is pretty much what the title says, with a slight variation. If I remember correctly, finding an entry in an array of size n has an average-case complexity of O(n).
I assume that is also the case if there is a fixed number of elements in the vector, of which we want to find one.
But what if the number of such entries, of which we still only want to find one, is in some way related to the size of the vector, i.e. grows with it?
I have such a case at hand, but I don't know the exact relation between the array size and the number of searched-for entries; it might be linear, it might be logarithmic. Is the average case still O(n)?
I would be grateful for any insights.
edit: an example
array size: 100
array content: at each position, a number from 1 to 10, chosen completely at random.
what we seek: the first occurrence of "1"
From a naive point of view, we should on average find such an entry after about 10 lookups with any kind of linear search (which we have to do, as the content is not sorted).
As constant factors are usually omitted in big-O, does that mean we still say it takes O(n) time, even though on average only about n/10 lookups are needed?
It is O(n) anyway.
Think about finding 1 here:
[9,9,9,9,9,1]
If you're doing a linear search through the array, then the average time complexity of finding one of M elements in an array with N elements will be O(I) where I is the average index of the first of the sought elements. If the array is randomly ordered, then I will be O(N/M) on average, and so the time complexity will also be O(N/M) on average and O(N-M) in the worst case.
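A quick experiment backs this up (illustrative sketch; the names are my own): for a randomly shuffled array with N elements, M of which match, the index of the first match averages out to roughly N/M.

# Illustrative experiment: average index of the first matching value.
import random

def avg_first_index(N, M, trials=2000):
    total = 0
    for _ in range(trials):
        arr = [1] * M + [0] * (N - M)   # M sought values among N entries
        random.shuffle(arr)
        total += arr.index(1)           # linear scan for the first match
    return total / trials

for N, M in ((100, 10), (1000, 10), (1000, 100)):
    print(N, M, round(avg_first_index(N, M), 1))   # roughly N/M on average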
I am of two minds about this question.
First, if you consider an unsorted array (which seems to be the case here), the average-case asymptotic complexity will surely be O(n).
Let's take an example.
We have n elements in the array, or better to say the Vector. The average case is a linear search, going element by element, which takes about n/2 comparisons on average, i.e. O(n). If elements are added, the nature of the complexity won't change, but the effect is clear: it is still about n/2 comparisons on average, which is directly half of n. With m sought elements now in the array, the effect will be on the order of n-m, i.e. roughly (n-m)/2 comparisons, as a result of the additional elements in the Vector.
So we find that as the size of the array (or rather the Vector) grows, the nature of the complexity does not change, although the number of comparisons required grows, since it is about n/2 in the average case.
Second, if the array or vector is sorted, then a binary search has a worst case on the order of log(n+1) comparisons, again dependent on n. The average number of comparisons also grows logarithmically, but the complexity order O(log n) won't change!
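For completeness, a minimal binary search that counts how many probes it makes (my own sketch): the count grows like log2(n), matching the discussion above.

# Minimal binary search over a sorted list, counting probes (illustrative).
def binary_search(arr, target):
    lo, hi, probes = 0, len(arr) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if arr[mid] == target:
            return mid, probes
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes

for n in (1000, 1000000):
    arr = list(range(n))
    print(n, binary_search(arr, -1)[1])   # about log2(n) + 1 probes in the worst case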
I am trying to sort an array which has the following property:
it increases up to some point, then it starts decreasing, then increases and then decreases again, and so on. Is there any algorithm which can sort this in less than n log(n) time by making use of it being partially ordered?
array example = 14, 19, 34, 56, 36, 22, 20, 7, 45, 56, 50, 32, 31, 45 ......... up to n
Thanks in advance
Any sequence of numbers will go up and down and up and down again, etc., unless it is already fully sorted (it may start with a down, of course). You could run through the sequence noting the points where it changes direction, then merge-sort the resulting runs (reading the descending ones in reverse).
In general the complexity is N log N because we don't know how sorted it is at this point. If it is moderately well sorted, i.e. there are fewer changes of direction, it will take fewer comparisons.
You could find the change / partition points, and perform a merge sort between pairs of partitions. This would take advantage of the existing ordering, as normally the merge sort starts with pairs of elements.
Edit: Just trying to figure out the complexity here. Merge sort is n log(n), where the log(n) relates to the number of merge passes: first every pair of elements, then every pair of pairs, etc., until you reach the size of the array. In this case you have n elements split into p partitions, where p < n, so I'd put the complexity at about n log(p) rather than n log(n): merge each pair of partitions (one pass over the n elements), and repeat with half the number of partitions after each pass.
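A sketch of that idea (my own helper names, not a reference implementation): find the direction-change points, reverse the descending runs, then merge pairs of runs until only one remains. The number of merge passes depends on the number of runs p, not on n.

# Natural merge sort sketch: split into monotonic runs, then merge them pairwise.
def split_runs(a):
    """Split a into maximal monotonic runs, each normalized to ascending order."""
    runs, i, n = [], 0, len(a)
    while i < n:
        j = i
        if j + 1 < n and a[j + 1] < a[j]:            # descending run
            while j + 1 < n and a[j + 1] <= a[j]:
                j += 1
            runs.append(a[i:j + 1][::-1])            # reverse so the run is ascending
        else:                                        # ascending run (possibly length 1)
            while j + 1 < n and a[j + 1] >= a[j]:
                j += 1
            runs.append(a[i:j + 1])
        i = j + 1
    return runs

def merge_two(x, y):
    """Standard merge of two ascending lists."""
    out, i, j = [], 0, 0
    while i < len(x) and j < len(y):
        if x[i] <= y[j]:
            out.append(x[i]); i += 1
        else:
            out.append(y[j]); j += 1
    out.extend(x[i:]); out.extend(y[j:])
    return out

def natural_merge_sort(a):
    runs = split_runs(a)
    while len(runs) > 1:                             # about log(p) passes over p runs
        runs = [merge_two(runs[k], runs[k + 1]) if k + 1 < len(runs) else runs[k]
                for k in range(0, len(runs), 2)]
    return runs[0] if runs else []

print(natural_merge_sort([14, 19, 34, 56, 36, 22, 20, 7, 45, 56, 50, 32, 31, 45]))

On the example array from the question, this finds 5 runs and needs 3 merge passes.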
See Topological sorting
If you know for a fact that the data are "almost sorted" and the set size is reasonably small (say, an array that can be indexed by a 16-bit integer), then Shellsort is probably your best bet. Yes, it has a basic time complexity of O(n^2) (which can be reduced, by the sequence used for gap sizing, to a current best worst case of O(n log^2 n)), but the performance improves with the sortedness of the input set, to a best case of O(n) on an already-sorted set. Using Sedgewick's sequence for the gap sizes will give the best performance on those occasions when the input is not as sorted as you expected it to be.
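A minimal Shellsort sketch (the gap list is hard-coded with the first few terms commonly attributed to one of Sedgewick's sequences; treat it as illustrative): each gapped insertion-sort pass has very little work to do when the input is nearly sorted.

# Shellsort sketch with a hard-coded gap list (illustrative, not tuned).
def shell_sort(a):
    gaps = [3905, 2161, 929, 505, 209, 109, 41, 19, 5, 1]
    for gap in gaps:
        if gap >= len(a):
            continue
        # Gapped insertion sort: nearly-sorted input means very few shifts.
        for i in range(gap, len(a)):
            value, j = a[i], i
            while j >= gap and a[j - gap] > value:
                a[j] = a[j - gap]
                j -= gap
            a[j] = value
    return a

print(shell_sort([14, 19, 34, 56, 36, 22, 20, 7, 45, 56, 50, 32, 31, 45]))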
Strand Sort might be close to what you're looking for. O(n sqrt(n)) in the average case, O(n) best case (list already sorted), O(n^2) worst case (list sorted in reverse order).
Share and enjoy.
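For reference, a small strand sort sketch (my own code, written for clarity rather than speed): it repeatedly pulls an ascending "strand" out of the remaining input and merges it into the result, so a nearly-sorted list yields only a few long strands.

# Strand sort sketch: extract ascending strands and merge them into the result.
def strand_sort(a):
    a = list(a)
    result = []
    while a:
        strand = [a.pop(0)]
        i = 0
        while i < len(a):
            if a[i] >= strand[-1]:
                strand.append(a.pop(i))   # element extends the current strand
            else:
                i += 1
        # Merge the strand into the sorted result.
        merged, ri, si = [], 0, 0
        while ri < len(result) and si < len(strand):
            if result[ri] <= strand[si]:
                merged.append(result[ri]); ri += 1
            else:
                merged.append(strand[si]); si += 1
        merged.extend(result[ri:]); merged.extend(strand[si:])
        result = merged
    return result

print(strand_sort([14, 19, 34, 56, 36, 22, 20, 7, 45, 56, 50, 32, 31, 45]))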