Non-optimized sorting algorithm, which one to use? - arrays

The last element of a sorted array is replaced with a random value that does not occur in the array. Which classical, i.e. non-optimized, sorting algorithm should be used to sort this array as efficiently as possible?

Since the array is sorted except for the last element, no full sort is needed.
Simply take the last element from the array and insert it at the right location. Finding that location takes O(log n) time with binary search.
P.S.
As pointed out by @Henry, the actual insertion into an array (at least in most programming languages) takes another O(n) time, because it most likely means shifting all following elements one position to the right to free the slot where we want to insert our element.
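A minimal Python sketch of this idea, assuming the input is a list that is sorted in ascending order except for its last element. The function name reinsert_last is made up for illustration. The binary search is O(log n), but the insert still shifts elements, so the whole step is O(n), which is the same work as the final pass of a classical insertion sort.

```python
import bisect

def reinsert_last(a):
    """Re-sort, in place, a list that is sorted except for its last element."""
    if len(a) < 2:
        return a
    x = a.pop()                     # O(1): take the out-of-place last element
    pos = bisect.bisect_left(a, x)  # O(log n): binary search in the sorted prefix
    a.insert(pos, x)                # O(n): shifts the following elements right
    return a

data = [1, 3, 5, 7, 9, 4]
reinsert_last(data)  # data is now [1, 3, 4, 5, 7, 9]
```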

Related

Find duplicates in an array in linear time

Problem: You are given an array of n+1 integers from the range 1..n. At least one number has a duplicate. All array values can be the same. Print all duplicates in linear time and constant space. The array can't be modified.
The obvious solution would be to create a bit array with default value false, then for each element check whether bitarray[array[i]] is already set, and set it if not. That requires additional space, so it's no good. My other thought: reorder the array by hash and check whether the current element and the element array[hash % n] are equal. This is also no good, since we can't modify the original array. Now I think it looks like an impossible task. Is there even a solution to this?
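For reference, the "obvious" bit-array approach described above might look like the following Python sketch. It is linear in time but uses O(n) extra space, so it does not meet the constant-space requirement; it only makes the baseline concrete. The function name and the second flag array (used to report each duplicate once) are assumptions added for illustration.

```python
def duplicates_with_extra_space(values, n):
    """Return each duplicated value once, in order of first repeat.

    Assumes every value lies in the range 1..n.
    """
    seen = [False] * (n + 1)      # O(n) extra space, which the problem forbids
    reported = [False] * (n + 1)  # avoids reporting the same value twice
    dups = []
    for v in values:
        if seen[v] and not reported[v]:
            dups.append(v)
            reported[v] = True
        seen[v] = True
    return dups
```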

Time Complexity in Sorted Array

Given a sorted array in descending order, what is the time complexity of deleting the minimum element from this array?
================================================
My take: the minimum element will be at the last position, so is it O(n) to find it? Or should I apply binary search since the array is sorted, or is it simply O(1) to reach the end?
It really depends on what you mean by "delete from the array." You have an array, sorted in descending order, and you want to delete the minimum element.
Let's say that the array is [5, 4, 3, 2, 1]. You know how large the array is, so finding the minimum element is O(1). You just index a[a.length-1].
But what does it mean to delete the last element? You can easily replace the value there with a sentinel value to say that the position is no longer used, but then the next time you tried to get the minimal element, you'd have to visit a[a.length-1], and then do a reverse scan to find the first used element. That approaches O(n) as you remove more items.
Or do you keep a counter that tells you how many values in the array are actually used? Then you'd have a variable, count, holding the number of used elements, so the last used element is a[count-1]. When you wanted to remove the minimal element you'd have:
count = count-1;
smallest = a[count];
That's still O(1), but it leaves unused items in the array.
But if you want to ensure that the array length always reflects the number of items in it, then you have to re-allocate the array to be smaller. That is O(n).
So the answer to your question is "It depends."
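The counter-based variant can be sketched in Python as follows. The class name DescendingArray and the method pop_min are invented for this illustration, and the input is assumed to be already sorted in descending order. Removal is O(1) because only the counter changes; the backing storage keeps its full length.

```python
class DescendingArray:
    """Array sorted in descending order, with a count of the slots in use."""

    def __init__(self, items):
        self.a = list(items)      # assumed already sorted in descending order
        self.count = len(self.a)  # number of slots currently in use

    def pop_min(self):
        if self.count == 0:
            raise IndexError("pop from empty array")
        self.count -= 1            # O(1): shrink the used range by one
        return self.a[self.count]  # the old last used slot held the minimum

d = DescendingArray([5, 4, 3, 2, 1])
smallest = d.pop_min()  # 1; the backing list still has five slots
```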
Since the array is sorted in descending order, the minimum element is always guaranteed to be in the last location of the array, assuming there are no duplicates in the array. Deletion can be done in O(1).
If you need to do a series of deletions on the array, then you may want to keep track of the current end of the array and update it after each deletion. This can be done in constant time.
Hence, to sum up, the total time complexity is O(1).

Deletion in dynamic array

Can anyone explain the time complexity of deletion at the end of a dynamic array?
I think the answer is O(1).
But the book says it is O(n).
Since we are talking about dynamic arrays, that is, arrays with the ability to add/remove elements to/from them, there are two possible ways to implement one:
1. You allocate enough memory to hold all the current and future elements, and you keep track of the index of the last used element. With this setup, removing the last element is O(1), since you just decrement the last index. However, removing a non-last element has linear complexity, since you need to copy each later element one position back before you decrement the last index. Also, you might have difficulty identifying the maximum possible size at allocation time, possibly leading to overflow issues or wasted memory.
2. You can implement it using a linked list. This way you do not know the address of the last element, so you need to iterate the list up to the penultimate item, free the memory of the last item, and set the next pointer of the penultimate item to nil. Since the book mentions a complexity of O(n) for removing the last element, we can safely assume that by dynamic array the book meant this second option.

Solution for checking repetitive elements in the matrix[1000][1000]

I need to write a function that checks for repeated elements in matrix[1000][1000]. It should return True if any repeated elements are found, and False if not.
I think the solution needs two steps:
1. Sort all elements of the matrix from smallest to largest, using merge sort for example (during sorting we can already check elements for equality).
2. Compare each element with the next one, from the first to the last element of the matrix.
Is the efficiency of this solution good enough?
This is the element distinctness problem. It does not seem to matter that your array is 2D: you can treat it as a "regular" array and use one of the "regular" solutions to element distinctness, which are:
Sort the array, then iterate over the elements and check whether there is an index i such that arr[i] == arr[i+1].
This solution takes O(n log n) time, where n is the total number of elements, assuming an efficient sort, with little extra space needed.
Store the elements in a hash set, and when inserting each element, check whether it already exists in the set.
This solution takes O(n) time on average and O(n^2) time in the worst case, and needs O(n) extra space.
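Both approaches can be sketched in Python as follows, assuming the matrix is a list of rows whose elements are comparable and hashable. The function names are made up for illustration. Note that the sort-based version builds a sorted copy here, so it uses O(n) extra space in this form; sorting in place would avoid that.

```python
def has_duplicates_sorted(matrix):
    """Sort a flattened copy, then compare neighbours: O(n log n) time."""
    flat = sorted(x for row in matrix for x in row)
    return any(flat[i] == flat[i + 1] for i in range(len(flat) - 1))

def has_duplicates_hashed(matrix):
    """Insert into a hash set, stopping at the first repeat: O(n) on average."""
    seen = set()
    for row in matrix:
        for x in row:
            if x in seen:
                return True
            seen.add(x)
    return False
```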

Quickest way to find 5 largest values in an array of structs

I have an array of structs called struct Test testArray[25].
The Test struct contains a member called int size.
What is the fastest way to get another array of Test structs that contains everything from the original except the 5 largest, based on the member size, WITHOUT modifying the original array?
NOTE: The number of items in the array can be much larger, and the values could be dynamic. I was just using a smaller subset for testing.
I was thinking of making a copy of the original testArray and then sorting that copy, then returning an array of Test structs that does not contain the top 5 or bottom 5 (depending on ascending or descending order).
OR
Iterating through the testArray looking for the largest 5, and then making a copy of the original array excluding the largest 5. This way seems like it would iterate through the array too many times, comparing against the array of the 5 largest found so far.
Follow-up question:
Here is what I am doing now; let me know what you think.
Since the number of largest elements I am interested in is going to remain the same, I am iterating through the array, finding the largest element, and swapping it to the front of the array. Then I skip the first element, look for the largest after that, and swap it into the second index, and so on, until I have the 5 largest at the front. Then I stop sorting and just copy everything from the sixth element to the end into a new array.
This way, no matter what, I only iterate through the array 5 times, and I do not have to sort the whole thing.
Partial Sorting with a linear time selection algorithm will do this in O(n) time, where sorting would be O(nlogn).
To quote the Partial Sorting page:
The linear-time selection algorithm described above can be used to find the k smallest or the k largest elements in worst-case linear time O(n). To find the k smallest elements, find the kth smallest element using the linear-time median-of-medians selection algorithm. After that, partition the array with the kth smallest element as pivot. The k smallest elements will be the first k elements.
You can find the k largest items in O(n), although making a copy of the array or an array of pointers to each element (smarter) will cost you some time as well, but you have to do that regardless.
If you'd like me to give a complete explanation of the algorithm involved, just comment.
Update:
Regarding your follow up question, which basically suggests iterating over the list five times... that will work. But it iterates over the list more times than you need to. Finding the k largest elements in one pass (using an O(n) selection algorithm) is much better than that. That way you iterate once to make your new array, and once more to do the selection (if you use median-of-medians, you will not need to iterate a third time to remove the five largest items as you can just split the working array into two parts based on where the 5th largest item is), rather than iterating once to make your new array and then an additional five times.
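A rough Python sketch of the selection idea, with one substitution stated plainly: it uses randomized quickselect (expected linear time when keys are mostly distinct) rather than the worst-case linear median-of-medians algorithm. It works on a list of indices so the original sequence is never modified. The names exclude_k_largest and key are illustrative assumptions.

```python
import random

def exclude_k_largest(items, k, key=lambda x: x):
    """Return a new list without the k items that have the largest keys."""
    n = len(items)
    if k <= 0:
        return list(items)
    if k >= n:
        return []
    idx = list(range(n))  # rearrange indices, not the items themselves
    lo, hi, target = 0, n - 1, k - 1
    while lo < hi:
        p = random.randint(lo, hi)  # random pivot, moved to the end
        idx[p], idx[hi] = idx[hi], idx[p]
        pivot = key(items[idx[hi]])
        store = lo
        for i in range(lo, hi):
            if key(items[idx[i]]) > pivot:  # larger keys go to the front
                idx[i], idx[store] = idx[store], idx[i]
                store += 1
        idx[store], idx[hi] = idx[hi], idx[store]
        if store == target:  # idx[:k] now holds the k largest
            break
        if store < target:
            lo = store + 1
        else:
            hi = store - 1
    drop = set(idx[:k])
    return [item for i, item in enumerate(items) if i not in drop]

sizes = [3, 9, 1, 7, 8, 2, 6, 5]
rest = exclude_k_largest(sizes, 3)  # [3, 1, 2, 6, 5]; sizes is unchanged
```

For an array of structs, the same function could be called with key set to a function that returns the size member.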
As stated, sorting is O(n log n + 5) and iterating is O(5n + 5). In the general case, finding the m largest numbers is O(n log n + m) using the sort algorithm and O(mn + m) using the iteration algorithm. Which algorithm is better depends on the values of m and n. For m = 5, iterating is better for up to 2 to the 5th elements, i.e. a measly 32. However, in terms of operations sorting is more intensive than iterating, so n will have to be quite a bit larger before sorting is faster.
You can do better theoretically by keeping a sorted array of the largest numbers so far and using binary search to maintain the order; that gives you O(n log m), but again it depends on the values of n and m.
Maybe an array isn't the best structure for what you want, especially since you need to sort it every time a new value is added. Maybe a linked list is better, with a sort on insert (which is O(N) in the worst case and O(1) in the best); then just discard the last five elements. Also, you have to consider that just switching a pointer is considerably faster than reallocating the entire array just to get another element in there.
Why not an AVL Tree? Traverse time is O(log2N), but you have to consider the time of rebalancing the tree, and if the time spent coding that is worth it.
Using a min-heap with its size capped at 5, you can traverse the array and insert an element into the heap, replacing the heap's minimum, whenever the minimum element of the heap is less than that element.
getMin takes O(1) time and insertion takes O(log(k)) time, where k is the size of the heap (in our case 5). So in the worst case the complexity is O(n*log(k)) to find the 5 largest elements. Another O(n) is needed to build the list that excludes them.
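The bounded min-heap approach can be sketched with Python's standard heapq module. The function name and the key parameter are assumptions for illustration. The heap stores (key, index) pairs so that the original sequence is left untouched and ties are broken by position.

```python
import heapq

def exclude_k_largest_heap(items, k, key=lambda x: x):
    """Return a new list without the k items that have the largest keys."""
    if k <= 0:
        return list(items)
    heap = []  # min-heap of (key, index): the k largest entries seen so far
    for i, item in enumerate(items):
        entry = (key(item), i)
        if len(heap) < k:
            heapq.heappush(heap, entry)     # O(log k)
        elif entry > heap[0]:               # heap[0] is the current minimum
            heapq.heapreplace(heap, entry)  # pop the minimum, push entry
    drop = {i for _, i in heap}
    return [item for i, item in enumerate(items) if i not in drop]  # O(n) pass

sizes = [3, 9, 1, 7, 8, 2, 6, 5]
rest = exclude_k_largest_heap(sizes, 3)  # [3, 1, 2, 6, 5]
```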
