I am struggling to find a proof for the following statement: there is no comparison-based algorithm that, given an array of size n, emits an array of the same elements in which all the elements found at indexes divisible by 3 now appear in sorted order at those indexes, in linear time.
For example, take the array: 8 6 1 3 0 9 4
After executing the algorithm the array will look like: 3 6 1 4 0 9 8
Originally the elements 8, 3, 4 appear in the array at indexes that are multiples of 3; after executing the algorithm they still appear at indexes that are multiples of 3, but now in sorted order, in this case 3, 4, 8.
I need to prove that such an algorithm does not exist. I tried assuming the statement was true in the hope of reaching a contradiction, but it did not work out for me. Thanks for any help.
There are two arguments that prove such an algorithm cannot exist. First, as pointed out in comments, if such an algorithm did exist, then you could do the following:
Run the algorithm to sort the elements at indexes ≡ 0 (mod 3)
Shift the array by one position and run it again, sorting the elements originally at indexes ≡ 1 (mod 3)
Shift by one more position and run it a third time for the elements originally at indexes ≡ 2 (mod 3)
Merge the three sorted subsequences to create a fully sorted array.
Each of the first three steps would run in O(n) time (each is one run of the hypothetical linear algorithm), and the merge step would also run in O(n) time. That would give you an O(n) comparison sorting algorithm. But comparison sorting is proven to require Ω(n log n) comparisons, so no such algorithm can exist.
The second argument: given an array of n items, you want to sort n/3 of them in O(n) time. As mentioned, comparison sorting m items requires Ω(m log m) comparisons, so sorting n/3 items requires Ω((n/3) log(n/3)) = Ω(n log n) comparisons. Since n log n grows strictly faster than n, for every constant c there is an n beyond which sorting those n/3 items takes more than cn comparisons, so no linear-time bound is possible.
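To make the first reduction concrete, here is a Python sketch in which `sorted()` stands in for the hypothetical linear-time routine (which, of course, does not exist); everything outside that stand-in really does run in O(n):

```python
import heapq

def hypothetical_linear_sort(items):
    # Stand-in for the (nonexistent) linear-time routine. We use
    # sorted() here only to show the shape of the reduction.
    return sorted(items)

def sort_via_reduction(a):
    # Split the array into its three residue classes mod 3.
    classes = [a[0::3], a[1::3], a[2::3]]
    # If each class could be sorted in linear time...
    sorted_classes = [hypothetical_linear_sort(c) for c in classes]
    # ...a 3-way merge finishes in O(n), yielding an O(n) comparison
    # sort -- contradicting the Omega(n log n) lower bound.
    return list(heapq.merge(*sorted_classes))

print(sort_via_reduction([8, 6, 1, 3, 0, 9, 4]))  # [0, 1, 3, 4, 6, 8, 9]
```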
Related
Given an array of n integers, all numbers are unique except one of them.
The repeating number repeats n/2 times if n is even
The repeating number repeats (n-1)/2 or (n+1)/2 times if n is odd
The repeating number is not adjacent to itself in the array
Write a program to find the repeating number without using extra space.
This is how I tried to solve the problem.
If n is even, then the repeating element occurs n/2 times, and the repeated occurrences cannot be adjacent. So if, say, there are 6 elements, one element occurs 3 times. The occurrences must be either at indices 0, 2 and 4 or at 1, 3 and 5. So if I just check whether any element repeats at indices 0 and 2, and then at indices 1 and 3, I can get the repeating element.
If n is odd, then there are 2 choices.
If the number repeats (n+1)/2 times, then we can just check indices 0 and 2. For example, say there are 7 elements and one of them occurs 4 times; then the repeated occurrences have to be at indices 0, 2, 4 and 6.
However I cannot find a way to find the (n-1)/2 repeating elements when n is odd. I have thought of using xor and sums but can't find a way.
Let us call the element that repeats as the "majority".
The Boyer–Moore majority vote algorithm can help here. The algorithm finds an element that occurs in more than half of the positions of the input, if any such element exists.
But in your case the situation is interesting. The majority may not occur more than half the times. All elements are unique except the repeating one and repeating numbers are not adjacent. Also, majority element exists for sure.
So,
Run the majority vote algorithm on the numbers at even indexes in the array. Make a second pass through the input array to verify that the element reported by the algorithm really is the repeating one.
If in the above step we don't get the majority element, you can repeat the above procedure for numbers at odd index in the array. You can do this second step a bit more smartly because we know for sure that majority element exists. So, any number that repeats would be the result.
In the implementation of above, there is a good scope for small optimizations.
I don't think I need to explain the majority vote algorithm here; if you want me to, let me know. Without knowing this algorithm, we could probably manage with some counting logic of our own (which would most likely end up equivalent to the majority algorithm), but since it is a standard algorithm, we can simply leverage it.
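A minimal Python sketch of the approach. It covers the cases where the repeating number fills an entire parity class (n even, or (n+1)/2 repeats when n is odd); the (n-1)/2 case the question struggles with may need the extra care hinted at above. Since every other element is unique, the verification pass just checks that the candidate occurs more than once:

```python
def majority_candidate(seq):
    # Boyer-Moore majority vote: if some element occurs in more than
    # half of seq's positions, it must be the returned candidate.
    candidate, count = None, 0
    for x in seq:
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    return candidate

def find_repeating(a):
    # When the repeating number fills all even (or all odd) positions,
    # it is a strict majority among that parity class.
    for start in (0, 1):
        cand = majority_candidate(a[start::2])
        if cand is not None and a.count(cand) > 1:  # verification pass
            return cand
    return None

print(find_repeating([4, 1, 4, 2, 4, 3, 4]))  # 4
```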
Suppose you have an array of integers (e.g. [1 5 3 4 6]). The elements are rearranged according to the following rule: every element can hop forward (towards the left), sliding over the elements in the indices it hops across. The process starts with the element at the second index (i.e. 5). It has a choice to hop over element 1 or to stay in its own position. If it does choose to hop, element 1 slides down to index 2. Let us assume it does choose to hop; our resulting array will then be [5 1 3 4 6]. Element 3 can now hop over one or two positions, and the process repeats. If 3 hops over one position the array will be [5 3 1 4 6], and if it hops over two positions it will be [3 5 1 4 6].
It is very easy to show that every permutation of the elements can be produced this way. Also, any final configuration is reached by a unique sequence of hops.
The question is: given a source array and a final array, find the total number of hops required to arrive at the final array from the source. An O(N^2) implementation is easy to find, but I believe this can be done in O(N) or O(N log N). If it is not possible to do better than O(N^2), it would also be great to know that.
For example if the final array is [3,5,1,4,6] and the source array [1,5,3,4,6], the answer will be 3.
My O(N^2) algorithm works like this: loop over the positions of the source array from the end, since we know that element is the last to move. Here it is 6, and we check its position in the final array. We calculate the number of hops necessary, then rearrange the final array to put that element back in its original position in the source array. The rearranging step goes over all the elements in the array, and the outer loop goes over all positions, hence O(N^2). Using a hashmap or map can help with the searching, but the map needs to be updated after every step, which keeps it O(N^2).
P.S. I am trying to model correlation between two permutations in a Bayesian way and this is a sub-problem of that. Any ideas on modifying the generative process to make the problem simpler is also helpful.
Edit: I have found my answer. This is exactly what Kendall Tau distance does. There is an easy merge sort based algorithm to find this out in O(NlogN).
Consider the target array as an ordering. A target array [2 5 4 1 3] can be seen as [1 2 3 4 5], just by relabeling. You only have to know the mapping to be able to compare elements in constant time. On this instance, to compare 4 and 5 you check: index[4]=2 > index[5]=1 (in the target array) and so 4 > 5 (meaning: 4 must be to the right of 5 at the end).
So what you really have is just a vanilla sorting problem. The ordering is just different from the usual numerical ordering. The only thing that changes is your comparison function. The rest is basically the same. Sorting can be achieved in O(nlgn), or even O(n) (radix sort). That said, you have some additional constraints: you must sort in-place, and you can only swap two adjacent elements.
A strong and simple candidate would be selection sort, which will do just that in O(n^2) time. On each iteration, you identify the "leftmost" remaining element of the "unplaced" portion and swap it leftward until it lands at the end of the "placed" portion. This can be improved to O(nlgn) with an appropriate data structure (a priority queue for identifying the "leftmost" remaining element in O(lgn) time). Since nlgn is a lower bound for comparison-based sorting, I really don't think you can do better than that.
Edit: So you're not interested in the sequence of swaps at all, only the minimum number of swaps required. This is exactly the number of inversions in the array (adapted to your particular needs: "non natural ordering" comparison function, but it doesn't change the maths). See this answer for a proof of that assertion.
One way to find the number of inversions is to adapt the Merge Sort algorithm. Since you have to actually sort the array to compute it, it turns out to be still O(nlgn) time. For an implementation, see this answer or this (again, remember that you'll have to adapt).
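Since the linked answers are not reproduced here, this is a sketch of the adaptation: relabel the source array through the target's ordering, then count inversions while merge-sorting, in O(n log n):

```python
def count_inversions(seq):
    # Merge sort that also counts pairs (i, j) with i < j and
    # seq[i] > seq[j]. Returns (sorted copy, inversion count).
    if len(seq) <= 1:
        return seq, 0
    mid = len(seq) // 2
    left, linv = count_inversions(seq[:mid])
    right, rinv = count_inversions(seq[mid:])
    merged, inv = [], linv + rinv
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            inv += len(left) - i  # left[i:] all invert with right[j]
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv

def hops(source, target):
    # The position of each value in the target defines the ordering.
    rank = {v: i for i, v in enumerate(target)}
    _, inv = count_inversions([rank[v] for v in source])
    return inv

print(hops([1, 5, 3, 4, 6], [3, 5, 1, 4, 6]))  # 3
```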
From your answer I assume number of hops is total number of swaps of adjacent elements needed to transform original array to final array.
I suggest using something like insertion sort, but without the insertion part - the data in the arrays will not be altered.
You can maintain a queue t of stalled hoppers as a balanced binary search tree with counters (the number of elements in each subtree).
You can add element to t, remove element from t, balance t and find element position in t in O(log C) time, where C is the number of elements in t.
A few words on finding the position of an element: it is a binary search that accumulates the counts of the skipped left subtrees (plus 1 for each middle element, if you keep elements in internal nodes).
A few words on balancing/addition/removal: you traverse upward from the removed/added element and update the counters along the way. The overall number of operations still holds at O(log C) for insert+balance and remove+balance.
Let t be that (balanced-search-tree) queue, let p be the current index in the original array, q the index in the final array, a the original array and f the final array.
Now we have 1 loop starting from left side (say, p=0, q=0):
If a[p] == f[q], then original array element hops over the whole queue. Add t.count to the answer, increment p, increment q.
If a[p] != f[q] and f[q] is not in t, then insert a[p] into t and increment p.
If a[p] != f[q] and f[q] is in t, then add f[q]'s position in queue to answer, remove f[q] from t and increment q.
Conveniently, this process moves p and q to the ends of their arrays at the same time if the arrays really are permutations of one another. Nevertheless, you should probably bounds-check p and q to detect incorrect data, as we have no faster way to prove the data is correct.
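A Python sketch of the loop above, with a plain list standing in for the counted BST (so the queue operations here cost O(C) instead of O(log C); swapping in the tree gives O(n log n) overall):

```python
def count_hops(a, f):
    # Assumes a and f are permutations of the same distinct elements.
    t = []        # queue of stalled hoppers (stand-in for the BST)
    answer = 0
    p = q = 0
    while p < len(a) or q < len(f):
        if p < len(a) and q < len(f) and a[p] == f[q]:
            answer += len(t)        # element hops over the whole queue
            p += 1
            q += 1
        elif q < len(f) and f[q] in t:
            answer += t.index(f[q])  # its 0-based position in the queue
            t.remove(f[q])
            q += 1
        else:
            t.append(a[p])          # a[p] stalls and joins the queue
            p += 1
    return answer

print(count_hops([1, 5, 3, 4, 6], [3, 5, 1, 4, 6]))  # 3
```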
This question already has answers here:
O(n) algorithm to find the median of n² implicit numbers
(3 answers)
Closed 7 years ago.
Is there a way to find the median of an unsorted array:
1- without sorting it.
2- without using the select algorithm, nor the median of medians
I found a lot of other questions similar to mine, but most, if not all, of the solutions discuss the SelectProblem and the MedianOfMedians.
You can certainly find the median of an array without sorting it. What is not easy is doing that efficiently.
For example, you could just iterate over the elements of the array; for each element, count how many elements are less than it and how many are equal to it, until you find a value whose counts place it in the middle. That will be O(n^2) time but only O(1) space.
Or you could use a min heap whose size is just over half the size of the array. (That is, if the array has 2k or 2k+1 elements, the heap should have k+1 elements.) Build the heap from the first k+1 array elements using the standard heap-building algorithm (which is O(N)). Then, for each remaining element x, if x is greater than the heap's minimum, replace the minimum element with x and sift it down (which is O(log N)). At the end, the median is either the heap's minimum element (if the original array's size was odd) or the average of the two smallest elements in the heap. So that's a total of O(n log n) time, and O(n) space if you cannot rearrange array elements. (If you can rearrange array elements, you can do this in-place.)
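A sketch of the heap approach in Python; `heapq.heapreplace` performs exactly the "replace the minimum and sift down" step:

```python
import heapq

def median_via_heap(a):
    # Keep a min-heap holding the k+1 largest elements seen so far,
    # where the array has 2k or 2k+1 elements. O(n log n) time.
    k = len(a) // 2
    heap = a[:k + 1]
    heapq.heapify(heap)            # O(k) heap construction
    for x in a[k + 1:]:
        if x > heap[0]:
            heapq.heapreplace(heap, x)  # replace min, sift down
    if len(a) % 2 == 1:
        return heap[0]             # the (k+1)-th largest = the median
    lo, hi = heapq.nsmallest(2, heap)
    return (lo + hi) / 2           # average of the two middle elements

print(median_via_heap([7, 1, 5, 3, 9]))      # 5
print(median_via_heap([7, 1, 5, 3, 9, 11]))  # 6.0
```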
There is a randomized algorithm able to accomplish this task in O(n) steps (average case scenario), but it does involve sorting some subsets of the array. And, because of its random nature, there is no guarantee it will actually ever finish (though this unfortunate event should happen with vanishing probability).
I will leave the main idea here. For a more detailed description and for the proof of why this algorithm works, check here.
Let A be your array and let n = |A|. Let's assume all elements of A are distinct. The algorithm goes like this:
Randomly select t = n^(3/4) elements from A.
Let T be the "set" of the selected elements. Sort T.
Set pl = T[t/2-sqrt(n)] and pr = T[t/2+sqrt(n)].
Iterate through the elements of A and determine how many elements are less than pl (denoted by l) and how many are greater than pr (denoted by r). If l > n/2 or r > n/2, go back to step 1.
Let M be the set of elements in A in between pl and pr. M can be determined in step 4, just in case we reach step 5. If the size of M is no more than 4t, sort M. Otherwise, go back to step 1.
Return m = M[n/2-l] as the median element.
The main idea behind the algorithm is to obtain two elements (pl and pr) that enclose the median element (i.e. pl < m < pr) and are very close to each other in the ordered version of the array, and to do this without actually sorting the array. With high probability, the six steps only need to execute once (i.e. you will get a pl and pr with these "good" properties from the first and only pass through steps 1-5, with no going back to step 1). Once you find two such elements, you can simply sort the elements in between them and find the median element of A.
Step 2 and step 5 do involve some sorting (which might be against the "rules" you've mysteriously established :p). If sorting a sub-array is on the table, you should use some sorting method that does it in O(s log s) steps, where s is the size of the array being sorted. Since T and M are significantly smaller than A, those sorting steps take "less than" O(n) steps. If it is also against the rules to sort a sub-array, consider that in both cases the sorting is not really needed: you only need a way to determine pl, pr and m, which is just another selection problem (with the respective indices). While sorting T and M does accomplish this, you could use any other selection method (perhaps something rici suggested earlier).
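A Python sketch of steps 1-6, assuming n is odd, the elements are distinct, and n is large enough (roughly 16 or more) that the indices t/2 ± √n stay inside T:

```python
import math
import random

def randomized_median(a, rng=random):
    # Sketch of the randomized median algorithm described above.
    n = len(a)
    t = int(n ** 0.75)                # sample size n^(3/4)
    s = int(math.sqrt(n))
    while True:                       # retries happen with low probability
        T = sorted(rng.sample(a, t))  # steps 1-2: sample and sort
        pl, pr = T[t // 2 - s], T[t // 2 + s]           # step 3
        l = sum(1 for x in a if x < pl)                 # step 4
        r = sum(1 for x in a if x > pr)
        M = [x for x in a if pl <= x <= pr]             # step 5
        if l > n // 2 or r > n // 2 or len(M) > 4 * t:
            continue                  # unlucky sample: back to step 1
        M.sort()
        return M[n // 2 - l]          # step 6: median has overall rank n//2

rng = random.Random(42)
data = list(range(101))
rng.shuffle(data)
print(randomized_median(data, rng))  # 50
```

Whenever the checks in the loop pass, pl ≤ median ≤ pr is guaranteed, so the returned value is exact; only the running time is randomized.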
A non-destructive routine selip() is described at http://www.aip.de/groups/soe/local/numres/bookfpdf/f8-5.pdf. It makes multiple passes through the data, at each stage making a random choice of items within the current range of values and then counting the number of items to establish the ranks of the random selection.
Is there an easy way to calculate time complexity for code that does not involve searching or sorting?
For example: an array of size n is initialized with 0. Write code which inserts the value 3k at position 3k in the array, where k = 0, 1, ...
Can you make this a bit more clear?
The way I am understanding this right now: are you asking how, for example, given the array
[1 2 2 1 4 5 6 3 2 4 ...]
multiply the value at k by 3?
If that is so, that would be a matter of indexing into the array.
If you are trying to find the value 3 in the array, there are multiple ways to go about it.
You could simply traverse the array, which takes at most O(n) time; but if it is sorted, you can just do a binary search.
edit:
By time complexity, the first one would be O(1) and the binary search would be O(log n).
I want to sort the elements in an array by their frequency.
Input: 2 5 2 8 5 6 8 8
Output: 8 8 8 2 2 5 5 6
Now one solution to this would be:
Sort the elements using Quick sort or Merge sort. O(nlogn)
Construct a 2D array of element and count by scanning the sorted array. O(n)
Sort the constructed 2D array according to count. O(nlogn)
Among the other probable methods that I have read, one uses a Binary Search Tree and the other uses Hashing.
Could anyone suggest a better algorithm? I know the complexity can't be reduced, but I want to avoid so many traversals.
You can perform one pass over the array without sorting it, counting in a separate structure how many times each element appears. This could be done in a separate array, if you know the range of the elements you'll find, or in a hash table, if you don't. In either case this process is O(n). Then you can sort the second structure (where you have the counts), using the count associated with each element as the sort key. This second process is, as you said, O(nlogn) if you choose a proper algorithm.
For this second phase I would recommend heap sort, by means of a priority queue. You can tell the queue to order the elements by the count attribute (the one calculated in step one), and then just add the elements one by one. When you finish adding, the queue will already be sorted, and the algorithm has the desired complexity. To retrieve your elements in order you just have to start popping.
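A compact Python version of the two phases, using `collections.Counter` as the counting structure (one O(n) counting pass, then one sort of the distinct values by count; ties are broken by first appearance, which matches the example's output):

```python
from collections import Counter

def sort_by_frequency(a):
    # Phase 1: O(n) counting pass.
    counts = Counter(a)
    first_seen = {}
    for i, x in enumerate(a):
        first_seen.setdefault(x, i)
    # Phase 2: sort the k distinct values by descending count,
    # O(k log k); ties keep the order of first appearance.
    order = sorted(counts, key=lambda x: (-counts[x], first_seen[x]))
    return [x for v in order for x in [v] * counts[v]]

print(sort_by_frequency([2, 5, 2, 8, 5, 6, 8, 8]))  # [8, 8, 8, 2, 2, 5, 5, 6]
```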