I read somewhere that deleting an element is faster in an unsorted array, but I am not sure that's correct.
As I understand it, if we want to delete a particular element, then in a sorted array it takes O(log N) time to find it and finally delete it, but in an unsorted array it may take O(N) time in the worst case to find it by linear search before we can delete it.
So how is this possible?
Summarising all existing answers (and adding a few points of my own):
There are two processes we need to consider when we delete an element from any array.
Searching for the element in question
Deletion of that element
Note: In the explanation below, n is the total number of elements in the array.
Searching
In a sorted array, we can use Binary Search to find the array element to be deleted. The time complexity of Binary Search is O(log n)
In an unsorted array, we must use Linear Search to find the array element to be deleted. The time complexity of Linear Search is O(n)
Deletion
The removal of an element from any array is done in O(1) time.
But the deletion process also includes what must be done after the removal of the element!
In a sorted array, all the elements to the right of the deleted element must be shifted one index to the left, to fill the space left behind by the deleted element and to maintain sorted order. Therefore, the worst-case time complexity is O(n)
In an unsorted array, we can fill the space left by the deleted element, with the last element in the array, since the array is unsorted anyway. The time complexity for this process is O(1)
CONCLUSION :
Sorted Array :
Searching : O(log n)
Removal : O(n)
Unsorted Array :
Searching : O(n)
Removal : O(1)
Deletion of an element is faster in a sorted array than in an unsorted array.
This is because you can binary search over a sorted array to find the element specified.
With an unsorted array, you have to check every single element one by one (linear search) to find the element to delete.
The delete operation itself is the same time complexity for both.
O(log N) takes less time to execute than O(N).
Deleting an element from an array is an O(n) operation if you physically remove that element from the array, because all the elements to the right of the deleted element must be shifted. It's only an O(1) operation if the element is at the end of the array, so we can just pop it.
Now, in an unsorted array you can just swap the found element with the element at the end of the array in O(1) and pop from the end one element, also in constant time.
But in a sorted array, if you want to keep it sorted, you can't do that. You have to physically remove the element to keep the array sorted.
So, to make it clear: removing an element from a sorted array while keeping it sorted is O(n). If you don't care about it staying sorted, you can remove it in O(1). The search is logarithmic, so the whole operation becomes logarithmic. But you can only do this once, because afterwards the array isn't sorted anymore and you can't find your element in O(log n) time.
A sorted array has an order that helps us find a given element in O(log N) time. In an unsorted array there is no order to guide the search, so any element could be the one we are looking for, which forces us to check every element in the array.
Note that the lower bound for finding an element in an unsorted array is Ω(N), because we may need to check all the elements.
Also, because an array is a static structure, deletion and insertion are costly for us. So to delete from an unsorted array, you first have to find the element and then remove it, which costs O(N).
For deletion in a sorted array, you can find the element in O(log N) and then remove it, for a total cost of O(log N). Note that this approach does not shift the remaining elements of the array, because shifting is the expensive part.
Related
Is it possible to remove duplicates from an unsorted array in O(n) time, O(1) space complexity, using the Floyd's tortoise and hare algorithm?
Consider the array [3,1,3,4,2]. After removing duplicates, the function "remove_dups" must return [3,1,4,2]. Also, the function should work on negative integers in the array.
Yes, it is possible. The idea is to treat the array items as linked list nodes: each index points to the value stored at that index, treated as the index of the next node.
You will see that in the case of a duplicate there is a loop: two indexes will point at the same value, and they will form a cycle.
Example:
1->2->3->4->5->6->3
So we can find the entry point of the cycle in the linked list, and that will be our duplicate element.
Source : https://www.geeksforgeeks.org/find-duplicates-constant-array-elements-0-n-1-o1-space/
I am trying index mapping in hashing to search for an element in an array. A linear search would take O(n) to find an element in an array of size n. With hashing, what we're doing is basically reducing the search time to O(1) by creating a 2D matrix of zeros (say hash[1000][2]) and setting hash[a[i]][0] to 1 if a[i] is non-negative, or hash[-a[i]][1] to 1 if a[i] is negative. Here a[i] is the array in which we are supposed to search for an element.
for (i = 0; i < n; i++)
{
    if (a[i] >= 0)
        hash[a[i]][0] = 1;
    else
        hash[-a[i]][1] = 1;
}
How much time does the above code take to execute?
Even with hashing, aren't we at a time complexity of O(n), just like linear search? Isn't the time consumed assigning n 1's in a 2D array of zeros equal to the time taken to search for an element linearly?
A typical thing to do here would be to maintain an array sorted by hash key, and then use a binary search to locate elements, giving worst-case time complexity of O(log n).
You can maintain the sort order by using the same binary search for inserting new elements as you use for finding existing elements.
That last point is important: as noted in comments, sorting before each search degrades search time to the point where brute-force linear search is faster for small datasets. But that overhead can be eliminated; the array never needs to be sorted if you maintain sort order when inserting new elements.
Given a sorted array in descending order, what will be the time complexity of deleting the minimum element from this array?
My take: the minimum element will be at the last position, so O(n) to find it? Or should I apply binary search since the array is sorted? Or is it simply O(1) to reach the end?
It really depends on what you mean by "delete from the array." You have an array, sorted in descending order, and you want to delete the minimum element.
Let's say that the array is [5, 4, 3, 2, 1]. You know how large the array is, so finding the minimum element is O(1). You just index a[a.length-1].
But what does it mean to delete the last element? You can easily replace the value there with a sentinel value to say that the position is no longer used, but then the next time you tried to get the minimal element, you'd have to visit a[a.length-1], and then do a reverse scan to find the first used element. That approaches O(n) as you remove more items.
Or do you keep a counter that tells you how many values in the array are actually used? So you'd have a variable, count, to tell you which is the last element. And when you wanted to get the minimal element you'd have:
smallest = a[count];
count = count-1;
That's still O(1), but it leaves unused items in the array.
But if you want to ensure that the array length always reflects the number of items in it, then you have to re-allocate the array to be smaller. That is O(n).
So the answer to your question is "It depends."
Since the array is sorted in descending order, the minimum element is always guaranteed to be in the last location of the array, assuming there are no duplicates. Deletion can be done in O(1).
If you need to do a series of deletions on the array, then you may want to adjust the deleted indices and point to the correct end location of the array. This can be done in constant time.
Hence to sum it up, the total time complexity would be O(1)
I understand that binary search cannot be done for an unordered array.
I also understand that the complexity of a binary search in an ordered array is O(log(n)).
Can I ask:
What is the complexity of binary search insertion into an ordered array? I saw in a textbook that the complexity is O(n). Why isn't it O(1), since the insertion position can be found directly, unlike with linear search?
Since binary search can't be done on an unordered list, why is it possible to do insertion with a complexity of O(N)?
Insertion complexity depends on the data structure used:
linear array
In this case you need to move all the items from the insertion index onward by one, to make room for the inserted item. This is O(n).
linked list
In this case you just change the prev/next pointers of the neighbouring items, so this is O(1).
Now, for an ordered list, if you want to use binary search then (as you noticed) you can only use an array. Binary-search insertion of an item a0 into an ordered array a[n] means this:
find where to place a0
This is the bin search part so for example find index ix such that:
a[ix-1]<=a0 AND a[ix]>a0 // for ascending order
This can be done by bin search in O(log(n))
insert the item
so you first need to move all the items i >= ix by one to make room, and then place the item:
for (int i=n;i>ix;i--) a[i]=a[i-1]; a[ix]=a0; n++;
As you can see this is O(n).
put all together
so O(n + log(n)) = O(n), and that is why.
BTW search on a not strictly ordered dataset is also possible (although it is not called binary search anymore); see
How approximation search works
I have an array of structs called struct Test testArray[25].
The Test struct contains a member called int size.
What is the fastest way to get another array of Test structs containing all the structs from the original except the 5 largest, based on the member size? WITHOUT modifying the original array.
NOTE: The number of items in the array can be much larger; I was just using this size for testing, and the values could be dynamic. I just wanted a smaller set for testing.
I was thinking of making a copy of the original testArray and then sorting that array. Then return an array of Test structs that did not contain the top 5 or bottom 5 (depending on asc or desc).
OR
Iterating through the testArray looking for the largest 5 and then making a copy of the original array excluding the largest 5. This way seems like it would iterate through the array too many times, comparing against the array of the 5 largest found so far.
Follow up question:
Here is what I am doing now; let me know what you think.
Considering that the number of largest elements I am interested in is going to remain the same, I iterate through the array, find the largest element, and swap it to the front of the array. Then I skip the first element, look for the largest after that, and swap it into the second index, and so on, until I have the first 5 largest. Then I stop sorting and just copy everything from the sixth index to the end into a new array.
This way, no matter what, I only iterate through the array 5 times, and I do not have to sort the whole thing.
Partial Sorting with a linear-time selection algorithm will do this in O(n) time, where sorting would be O(n log n).
To quote the Partial Sorting page:
The linear-time selection algorithm described above can be used to find the k smallest or the k largest elements in worst-case linear time O(n). To find the k smallest elements, find the kth smallest element using the linear-time median-of-medians selection algorithm. After that, partition the array with the kth smallest element as pivot. The k smallest elements will be the first k elements.
You can find the k largest items in O(n), although making a copy of the array or an array of pointers to each element (smarter) will cost you some time as well, but you have to do that regardless.
If you'd like me to give a complete explanation of the algorithm involved, just comment.
Update:
Regarding your follow up question, which basically suggests iterating over the list five times... that will work. But it iterates over the list more times than you need to. Finding the k largest elements in one pass (using an O(n) selection algorithm) is much better than that. That way you iterate once to make your new array, and once more to do the selection (if you use median-of-medians, you will not need to iterate a third time to remove the five largest items as you can just split the working array into two parts based on where the 5th largest item is), rather than iterating once to make your new array and then an additional five times.
As stated, sorting is O(n log n + 5) and iterating is O(5n + 5). In the general case, finding the m largest numbers is O(n log n + m) using the sort algorithm and O(mn + m) using the iteration algorithm. Which algorithm is better depends on the values of m and n. For a value of five, n log n stays below 5n only up to 2 to the 5th numbers, i.e. a measly 32, so sorting can only win for arrays that small; and since a sort's operations are more intensive than the iteration's, the array has to be quite a bit smaller still before sorting is actually faster.
You can do better in theory by keeping a sorted array of the m largest numbers seen so far and using binary search to maintain its order, which gives you O(n log m), but that again depends on the values of n and m.
Maybe an array isn't the best structure for what you want, especially since you need to sort it every time a new value is added. Maybe a linked list is better, with a sort on insert (which is O(N) in the worst case and O(1) in the best); then just discard the last five elements. Also, consider that switching a pointer is considerably faster than reallocating the entire array just to fit another element in there.
Why not an AVL tree? Lookup time is O(log2 N), but you have to weigh the cost of rebalancing the tree, and whether the time spent coding it is worth it.
Using a min-heap data structure with the heap size set to 5, you can traverse the array and insert an element into the heap whenever it is greater than the heap's minimum element.
getMin takes O(1) time and insertion takes O(log(k)) time, where k is the number of elements in the heap (in our case 5). So in the worst case we have O(n*log(k)) complexity to find the 5 largest elements. Another O(n) pass produces the excluded list.